Information Creation

Introduction

In What is Information, I wrote about the definition of information, how the concept of information has been used historically, and introduced the information lifecycle. The post launched my Information Lifecycle series, which explores each stage of the information lifecycle, discusses information disorder throughout the lifecycle, and aims to increase information literacy for readers.

Information literacy includes the ability to distinguish fact from fiction.

This post is dedicated to the first part of the information lifecycle: information creation.

The Information Lifecycle
Source: Information, A Very Short Introduction, by Luciano Floridi

What to expect

This post will cover the following questions:

  • Is information created or discovered?
  • When is it created?
  • How is it created?
  • Why is it created?
  • Who creates it?
  • What happens when false information is created?

So many questions.

Is information created or discovered?

Reminder: well-formed data + meaning = information.

Data can be discovered, generated, or observed. But because meaning requires interpretation of the data, information is created. The data is discovered, meaning is assigned to the data, and information is born.

Example – a biologist is studying the larvae of a fruit fly for cross-generational genetic abnormalities. They uncover abnormalities and some sort of pattern therein. What did they create and what did they discover?

Because a datum is a fact regarding some difference or lack of uniformity in a context, the biologist has discovered it. In this case, the difference between a typical genome and an atypical genome.

Then, they assign meaning to the data through their analysis of the pattern. Maybe the abnormality only appears in 8% of the fruit fly population and only in females, which has certain implications for the field and scientific community.

Information has been created. The biologist discovered data and assigned meaning to it, thereby creating information.

To carry this example through the rest of the information lifecycle: this information may then be recorded in a specific format, like an academic paper, processed by peer review, distributed via a scientific journal, consumed by other scientists, and preserved in perpetuity online or in physical libraries.

When is information created?

Hold up. Let’s break this into two questions:

  1. Can information be created more than once?
  2. Can data and meaning be discovered/created totally separately?

Can information be created more than once?

It could be argued that information is only created once. However, it is very difficult, if not impossible, to know two things:

  1. When that occurs – when was the prototypal instance of that information created?
  2. Whether the meaning assigned to the data is the same meaning assigned to the same (or similar) data discovered by someone else.

I land somewhere in the grey area between yes and no. I think information is generally a one-time occurrence in the scope of all human knowledge, and it also occurs whenever a single human brain creates information.

“Because neither ‘memory banks’ nor ‘representations’ of stimuli exist in the brain, and because all that is required for us to function in the world is for the brain to change in an orderly way as a result of our experiences, there is no reason to believe that any two of us are changed the same way by the same experience.”

Source

So as individuals, we may perfectly well come away from experiencing the same data (or at least very similar data) and assign entirely different meanings to it.

Can data and meaning be discovered/created totally separately?

Information creation is an asynchronous process in that data can be observed or discovered, and then meaning can be assigned later on.

In fact, if data can be observed during an event but no one and nothing is there to observe it and assign meaning, no information will be created. If a tree falls in the woods and no one is there to observe it, how do we know that trees can fall? However, if I see a tree falling in the woods, I can assign a meaning to it – “trees fall sometimes, I need to be careful when in the woods.”

Additionally, if data alone or information itself has been recorded or stored, it is possible to derive meaning from it far after the date of creation. Perhaps in some cases, the meaning has been lost and is rediscovered. For example:

In 1972, Donald Knuth, an early computer scientist at Stanford, looked at the remains of an Old Babylonian tablet the size of a paperback book, half lying in the British Museum in London, one-fourth in the Staatliche Museen in Berlin, and the rest missing, and saw what he could only describe, anachronistically, as an algorithm…. In the Louvre he [had] found a “procedure” that reminded him of a stack program on a Burroughs B5500. “We can commend the Babylonians for developing a nice way to explain an algorithm by example as the algorithm itself was being defined,” said Knuth. By then he himself was engrossed in the project of defining and explaining the algorithm; he was amazed by what he found on the ancient tablets. The scribes wrote instructions for placing numbers in certain locations—for making “copies” of a number, and for keeping a number “in your head.” This idea, of abstract quantities occupying abstract places, would not come back to life till much later.

Gleick, James. The Information: A History, a Theory, a Flood (pp. 52-53). Knopf Doubleday Publishing Group. Kindle Edition.

Knuth rediscovered information from a historical artifact. This happened asynchronously from the event of the tablet’s creation; that is, he did not need to be present when the tablet was created to derive meaning from it.

Double take.

How is information created?

Once again, the General Definition of Information (GDI) says that something is considered information if it consists of one or more instances of data and if that data is well-formed and meaningful.
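If you think better in code, here is a minimal Python sketch of that definition. Everything in it is my own illustration: the Datum class, its fields, and the is_information function are hypothetical names, not something from Floridi or any library, and "well-formed" is reduced to a simple flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Datum:
    """One piece of data, plus whatever meaning (if any) has been assigned to it."""
    value: str
    well_formed: bool = True       # assumption: the datum follows the rules of its format
    meaning: Optional[str] = None  # assigned by an interpreter, possibly much later


def is_information(data: list[Datum]) -> bool:
    """GDI check: one or more data, all of them well-formed and meaningful."""
    if not data:
        return False
    return all(d.well_formed and bool(d.meaning) for d in data)


# Illustrative example based on the fruit fly scenario above
observation = Datum(value="abnormality appears in 8% of the population, females only")
print(is_information([observation]))  # data without meaning

observation.meaning = "this pattern has implications for the field"
print(is_information([observation]))  # data + meaning
```

The first check fails because nobody has assigned a meaning yet; the second passes once the biologist interprets the pattern. The data did not change between the two calls, only the meaning did.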

When individuals learn something and create information, it typically happens in one of three ways: direct experience, interaction, or observation.

For groups or institutions, there are many ways to create information, most of which start with collecting data, trying to answer a research question, or testing a hypothesis. This can all occur through direct experience, interaction, or observation.

For individuals, groups, or institutions, intent plays an important role in assigning meaning to data. It can happen unintentionally or intentionally (with positive intent or negative intent).

  • Unintentional: Humans unintentionally assign meaning to data all the time. We experience an event and draw a conclusion without thinking about it. These conclusions are then abstracted and inherited up into our beliefs, biases, prejudices, etc. We use these cognitive constructs to navigate the world around us moving forward. For example, a child burns their hand on a stove once. From that experience, they learn to be careful (hopefully) around stoves.
  • Intentional: the simplest and most fitting example of an intentional process of assigning meaning to data is the scientific method. Scientists want to find the kernel of truth and bring that to the world – their intent is positive. When the intent is negative, we often land with information disorder. More below on that.

Individuals

Direct experience

Direct experience is one of the most fundamental ways humans create information. A human experiences an event; their brain synthesizes the sensory input – touch, smell, sight, sound, taste – from an event (observes the data); organizes it into some sort of perception of the event and assigns meaning along the way; and the experience may or may not be stored in memory in some capacity.

Slight digression – the brain is not a computer.

Freaked out llama gif

The information processing (IP) metaphor of human intelligence now dominates human thinking, both on the street and in the sciences…But the IP metaphor is, after all, just another metaphor – a story we tell to make sense of something we don’t actually understand… It encumbers our thinking with language and ideas that are so powerful we have trouble thinking around them.

Source

If our brains were computers, then we’d be able to remember everything we’ve ever seen in perfect detail. As it is, most of us can’t and don’t.

Interaction

Interaction is the second way humans create information.

Human A tells Human B to not eat that specific mushroom. For Human B, this is new information; maybe Human B then tells Human A not to eat a few different (other) kinds of mushrooms. Human A and Human B discuss how their trust in mushrooms is truly broken and they’ll be more careful when eating mushrooms in the future. Human A and Human B have just learned new information by interacting with each other.

Observation

Observation is the third way we create information. Data can be observable or not observable. It is observable if beings capable of creating information/learning are able to observe it. Any observable event can be used to create information, misinformation, or disinformation.

There are things we can’t observe (yet), like the center of a black hole. We can assume there is data to collect there, but it is not observable. Observability intersects with location – where is the data? The location can be physically tangible (person, place, or thing), or non-physical (online).

An absurd example of information creation via observation: a human observes Fox A and Fox B in the forest. Fox A eats a mushroom. Fox A starts talking to Fox B. Fox B does not understand, so Fox A urges it to eat a mushroom. Fox B eats the mushroom, gains the ability to speak, and then they have a conversation. The observed data “these two foxes have eaten this mushroom and are now speaking” and the meaning the human assigns to the event “this mushroom gives foxes the ability to speak” together make information.

"What did the fox say" gif

Groups/Institutions

How information is created in groups or institutions differs from the ways individuals create information. It’s best to split up this topic by touching on data collection first and meaning second.

Data

There are multiple data collection techniques that can be used, either by themselves or all together, to collect qualitative or quantitative data:

  • Interviews
  • Observations
  • Questionnaires/surveys/polls

Depending on the technique, data is collected from people by people through direct interaction or through the use of technology (supervised to various degrees, e.g. an online survey -> web scraping). Depending on the tools and level of supervision, the use of technology can be a mixture of interaction, observation, and direct experience.

Improperly collected data can be very harmful, and can contribute to information disorder. Learn more about the rigorous process of data collection in a research context.

Depending on the domain, the way data is collected may be subject to specific rules and regulations. For example, data about human subjects, regardless of the institution collecting the data (public or private sector), may be subject to certain restrictions, such as those overseen by the Office for Human Research Protections or set out in the Health Insurance Portability and Accountability Act of 1996 (HIPAA), or others, depending on the country where the data is being collected. Additionally, people who collect or interact with certain data may also have to undergo special training to learn to handle the data.

Meaning

Assigning meaning to data intentionally can be incredibly difficult, as scientists and researchers are well aware.

If this process is not carefully and judiciously executed, there can be tragic consequences, as there were in the Vietnam War body count controversy.

During the Vietnam War, the United States Armed Forces required metrics to measure the progress of military operations in Vietnam. Body count was one of the primary metrics chosen to indicate success of the war. Due to the focus on this data point and the meaning applied to it (higher body count -> greater success), killed civilians were counted as enemies; the success of battles was determined by body count ratio rather than tactical indicators; and body counts were inflated left and right.

General Westmoreland’s strategy of attrition also had an important effect on our behavior. Our mission was not to win terrain or seize positions, but simply to kill: to kill communists and as many of them as possible…. Victory was a high body-count, defeat a low kill-ratio, war a matter of arithmetic. The pressure on unit commanders to produce enemy corpses was intense, and they in turn communicated it to their troops. This led to such practices as counting civilians as Viet Cong. “If it’s dead and Vietnamese, it’s VC,” was our rule of thumb in the bush. It is not surprising, therefore, that some men acquired a contempt for human life and predilection for taking it.

Source

When the underlying data is not meaningful or well-formed, or the meaning assigned to that data is incorrect, the result is misinformation. In the Vietnam body count controversy, there were problems with both the data and the meaning assigned to it. The end result? Misinformation used on a grand scale to justify continued military operations in Vietnam.

While the stakes are not usually this high, misinformation poses huge risks to consumers of the information, as illustrated in this example.

Story points, story points, story points. Over time, as Agile story points have gained popularity, they have been used occasionally as indicators of a development team’s success. The more story points completed in a sprint, the greater the success of the team! What’s the problem? Well, why not just inflate your story point estimates?

In this example, the underlying data slowly becomes less meaningful and well-formed due to story point inflation. The meaning assigned to the story points created incentives that ended up undermining the data. This, too, becomes misinformation.
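To put rough numbers on the story point example, here is a small Python sketch. The figures are hypothetical (a team whose real output never changes, and estimates that creep up 10% every sprint), and the function and constant names are mine, made up purely for illustration.

```python
ACTUAL_WORK_PER_SPRINT = 20.0  # hypothetical: the team's real output stays flat
INFLATION_PER_SPRINT = 1.10    # hypothetical: estimates creep up 10% each sprint


def reported_velocity(actual_work: float, inflation: float) -> float:
    """Story points reported for a sprint, given real work and an inflation factor."""
    return actual_work * inflation


def simulate(sprints: int) -> list[float]:
    """Reported velocity per sprint when only the estimates grow."""
    return [
        round(reported_velocity(ACTUAL_WORK_PER_SPRINT, INFLATION_PER_SPRINT ** n), 1)
        for n in range(sprints)
    ]


for sprint, velocity in enumerate(simulate(5), start=1):
    print(f"Sprint {sprint}: reported {velocity} points, actual work {ACTUAL_WORK_PER_SPRINT}")
```

The reported number climbs every sprint while the actual work column never moves. Anyone reading only the reported velocity would conclude the team is improving, which is exactly how the metric stops carrying the meaning assigned to it.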

TLDR; how is information created?

For individuals, the events we experience through direct experience, interaction, or observation become information with which we navigate future experiences. For groups and institutions, it’s a bit more complicated.

The scientific method is how new information is created in the sciences:

Scientists make progress by using the scientific method, a process of checking conclusions against nature. After observing something, a scientist tries to explain what has been seen.

The explanation is called a hypothesis. There is always at least one alternative hypothesis.

A part of nature is tested in a “controlled experiment” to see if the explanation matches reality. A controlled experiment is one in which all treatments are identical except that some are exposed to the hypothetical cause and some are not. Any differences in the way the treatments behave is attributed to the presence and lack of the cause.

If the results of the experiment are consistent with the hypothesis, there is evidence to support the hypothesis. If the two do not match, the scientist seeks an alternative explanation and redesigns the experiment.

When enough evidence accumulates, the understanding of this natural phenomenon is considered a scientific theory. A scientific theory persists until additional evidence causes it to be revised.

Nature’s reality is always the final judge of a scientific theory.

Source

Why is information created?

It depends.

For an individual, just like any other human behavior, the act of information creation can be intentional or unintentional, and falls under the broader question of why people do what they do. For example, from the perspective of self-determination theory, we behave in a certain way because of intrinsic or extrinsic motivations that do or do not fulfill three universal psychological needs: autonomy, relatedness, and competence. Behaviorism has a different set of explanations. So on and so forth.

With groups or institutions, the reasons for information creation get more variable and complicated, as individuals within the collective may have different reasons for creating the information. Plus, the motivations for collecting data may differ from the motivations for assigning meaning to that data.

Regardless of whether it’s an individual, group, or institution creating information, a common motivation for doing so is that the information may have value.

  • I can totally make money on this!
  • I believe x is true and if I can prove it, glory and fame will be mine!

Value. What a loaded and overused word. Books have been written about value, so we’ll keep it simple and stick to this definition:

The regard that something is held to deserve; the importance, worth, or usefulness of something.

Value is the monetary, material, or assessed worth of an asset, good, or service. “Value” is attached to a myriad of concepts including shareholder value, the value of a firm, fair value, and market value.

Source

For our purposes, the asset or good is data (or information).

Sometimes data is collected without a preconceived idea of what the meaning will be, or the data is collected and then meaning is assigned to that data because of other drivers (political, financial, etc). Maybe we collect because – you know – we can, so why not?

Meaning is assigned to data. This is a highly interpretive process and fraught with risks, as I already touched on. Sometimes, data is valuable with little to no assigned meaning; the fact that the data even exists has inherent meaning or value. This exists, ergo I can use it to make money! Other times, the meaning is the valuable part, so there can be a lot of pressure to justify the effort of collecting data by assigning meaning to it.

Who creates information?

We live in the information age. A more metaphysical interpretation of our current reality posits that we live in an infosphere, similar to our physical reality (the biosphere).

In many respects, we are not standalone entities, but rather interconnected informational organisms or inforgs, sharing with biological agents and engineered artefacts a global environment ultimately made of information, the infosphere. This is the informational environment constituted by all informational processes, services, and entities, thus including informational agents as well as their properties, interactions, and mutual relations.

Source: Floridi, Luciano. Information: A Very Short Introduction (Very Short Introductions) (p. 9). OUP Oxford. Kindle Edition.

Simply, there is a lot of information floating around these days and we’re all steeped in it.

So who’s creating all this stuff?

Any person, group, organization, or country can create information. But consider these concepts:

  • Access: Sometimes we know when information is out there, sometimes we don’t. Ahem, CIA.
  • Source: Sometimes it’s clear where the information was distributed from. Sometimes not. Ahem, viral social media posts.
  • Authorship: Sometimes the creator claims authorship, sometimes they don’t. Ahem, 4chan.
  • Authority: Sometimes the information is presented with authority (we know this is true!), sometimes without authority (maybe this is true). Ahem, politicians presenting falsehoods as truth and the “facts don’t matter” trend.
  • Credibility: Sometimes information is distributed via a well-respected media outlet or journal with a long-standing, positive reputation. Sometimes not. Ahem, viral social media posts.

Lots to consider.

Individuals

We, as human beings who exist (probably… 🥄), observe the world around us via our five senses.

We observe intentionally or unintentionally, consciously or subconsciously, with high or low levels of effort. Through memory and introspection, we may re-examine our perception of events and derive new meaning from them. Because our memories are fallible, it is also possible to fabricate new data during the process of recalling the memory. Source.

Maybe that cake wasn’t as delicious as you think it was…

Last time you felt like sh*t after eating it…

…remember?

Groups

We, as human beings who exist (probably… 🥄) with one another also collect data as a group. By group, I mean a set of individuals that are related to each other or live in close proximity – like a family (nuclear or extended) or a church or a small town or an apartment building.

Through our social processes, we learn what it means to function as a group within society. Part of these social mores involves sharing data with one another (e.g. hey, George – x is on sale at the grocery store right now). The data is that x is on sale; the meaning is that George really likes x and would want to know. Sharing this information with George is a prosocial behavior. So then George goes to the grocery store to purchase x.

Let’s make x pickles. Because why not. Pickles are kind of a big dill.

Institutions

Moving on.

No, just kidding, let’s stop here for a bit.

Google collects copious amounts of data and assigns meaning to that data. Who allows them to do it – who granted them the authority? No one. They did it because they wanted to, they could, and the data had tons of value.

The Pew Research Center collects polling data, the results of which are assigned meaning. However, as with data collection techniques of all kinds, the methodology for collecting the data can impact the quality of the data and undermine the value of the meaning. E.g. in a political poll, the way questions are asked can influence how recipients respond to them.

So… anyone can create information?

Shocked kitten gif

Yeah… so about that.

Revisiting the definition of information: data + meaning = information. If the underlying data is true/accurate, well-formed, and meaningful, then it’s information. As you can imagine, creating new information is actually really hard because it has to fulfill these conditions.

But argh, so many questions!

  • What if something is wrong with the data or the meaning assigned to the data?
  • What if there’s some false data?
  • Or if the data isn’t well-formed?
  • Or it’s not meaningful? Or has the wrong meaning? Or doesn’t have any meaning?

Then we have information disorder on our hands.

World War Z gif

Scary, right?

Even though we don’t have any zombies climbing walls, the effects of information disorder are terrifying. There’s never been a more pertinent time to talk about information disorder.

What is information disorder?

Let’s be clear – information disorder is nothing new. From political smear campaigns since the dawn of, uh, politics, to anytime anyone has spread a false rumor, we’ve been dealing with information disorder.

However, in the information age, we’re dealing with it on an unprecedented global and personal scale.

According to the Council of Europe’s Information Disorder Report of November 2017, which attempts to “examine information disorder and its related challenges,” there are three types of information disorder:

  • Disinformation
  • Misinformation
  • Malinformation

The introduction to the 2017 report argues that “while the historical impact of rumours and fabricated content have been well documented… contemporary social technology means that we are witnessing something new: information pollution at a global scale; a complex web of motivations for creating, disseminating and consuming these ‘polluted’ messages; a myriad of content types and techniques for amplifying content; innumerable platforms hosting and reproducing this content; and breakneck speeds of communication between trusted peers.”

A summary of this quote and a few additional notes:

  • Information disorder is created for a number of different reasons (outside the scope of this piece, read the Council of Europe’s report if you wish to learn more)
  • Information disorder can be created by official or unofficial actors
  • Information is typically packaged into the form of messages, which are then transmitted across complex networks at great speeds
  • This is happening at a greater scale and rate than we’ve ever seen before

Misinformation

Misinformation is false or inaccurate information that is created unintentionally.

I argue that misinformation is information that contains a) incorrect or inaccurate data, or b) data that is not well-formed, or c) data that has no meaning, has an over-assumed meaning, or is assigned the wrong meaning.

While I can’t quantify this, my gut is that a substantial amount of what starts out as information becomes misinformation due to new data or a new way of observing, collecting, or deriving meaning from the data. We think we know something because we interpret some data we found and use it to create a theory about how something works. We think it’s right. In light of new data, 10 years later, it turns out we were wrong and our information becomes false or inaccurate – now it is misinformation. The scientific process has built-in structures that help account for this. Additional research is built on what came before, and things have to be confirmed and reconfirmed over the course of future research. The self-referential and repetitious nature of this process ideally helps reinforce the information that is more likely to be true, and debunk information that is more likely to be false.

Old wives’ tales are a good example of misinformation, if we assume that at some point somebody thought they were true.

(I was told all of these at some point or another as a child.)

When this information was created, the creator thought it was true. Like I said, it happens, and frequently.

Disinformation

Disinformation is false or inaccurate information that is created with the intent to cause harm.

A simple example of this would be a false rumor.

Mean Girls gif

More serious examples include:

The harm inflicted by disinformation can be anywhere from mild to catastrophic, and from short-term to long-term.

Malinformation

Malinformation sounds like what it is. It is information that may be true or accurate and is strategically used to inflict harm on a person, group, business, or country. While the 2017 report classifies this as a type of information disorder and then doesn’t dive too deep into any type of analysis on this category, I consider malinformation more as weaponized information.

Examples:

  • Leaking evidence of a political opponent’s extra-marital affairs to a media outlet
  • Disclosing the movement of military units to enemy intelligence
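The three categories above boil down to two questions: is the content false, and was it created or used with the intent to harm? Here is a minimal Python sketch of that decision. It is my own simplification of the definitions in this post, and the Kind enum and classify function are hypothetical names; real cases are messier than two yes/no answers.

```python
from enum import Enum


class Kind(Enum):
    INFORMATION = "information"
    MISINFORMATION = "misinformation"
    DISINFORMATION = "disinformation"
    MALINFORMATION = "malinformation"


def classify(is_false: bool, intends_harm: bool) -> Kind:
    """Map the two questions onto the categories described in this post."""
    if is_false:
        # False content: intent separates disinformation from misinformation
        return Kind.DISINFORMATION if intends_harm else Kind.MISINFORMATION
    # True or accurate content: weaponizing it makes it malinformation
    return Kind.MALINFORMATION if intends_harm else Kind.INFORMATION


print(classify(is_false=True, intends_harm=False).value)   # misinformation (old wives' tale)
print(classify(is_false=True, intends_harm=True).value)    # disinformation (false rumor)
print(classify(is_false=False, intends_harm=True).value)   # malinformation (strategic leak)
print(classify(is_false=False, intends_harm=False).value)  # information
```

Notice that the same false claim can land in different categories depending on who is passing it along and why, which is part of what makes information disorder so hard to pin down.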

TLDR; what is information disorder?

Fine. Here’s some pictures.

Types of Information Disorder
Credit: Claire Wardle & Hossein Derakhshan, 2017, Link to Creative Commons License
7 Common Forms of Information Disorder
Credit: Claire Wardle & Hossein Derakhshan, 2017, Link to Creative Commons License

Resources