Forget The Cluetrain, Return To Seminal Works Of Information Theory Of The 40s

Posts by Umair Haque, Fred Wilson, and Jeff Nolan are all talking about the looming attention crisis (see here & here) related to information overload and the blogosphere. Wilson captures Umair’s words:

Herbert Simon said it in 1971, which is that "What does an abundance of information create?" A scarcity of attention basically, right?

It’s funny that blogging is starting to replace email conversations (based on ROI), yet we are still in overload. I, for one, topped out at reading somewhere between 50 and 100 feeds, and I have had to start dropping some and adding others as my work balance shifts.

For those who may not know, blogging was heavily influenced by the seminal work "The Cluetrain Manifesto". The takeaway line from that work has been "markets are conversations".

But with the blogosphere doubling in size again and again (with no end in sight on the growth charts), I think we are starting to see some limits on what people can process.

That is why I think we will see a return to works that preceded the Cluetrain by some 50-60 years: the seminal works of Claude Shannon on information theory from the 1940s. Much of that work can be appreciated in plain English without wading through tons of mathematics, and we may find inspiration there for solving some of our new problems.

Some key items from the field of information theory:

  • The essence of a message can only be reduced so far before the content is distorted. This limit became known as the entropy bound. Entropy can be thought of as the irreducible information content of a message, loosely analogous to an energy level contained within it (see the first sketch after this list).
  • Smart guys like Huffman (at MIT, where he developed the idea as a graduate student in Robert Fano’s information theory course) devised algorithms for organizing information so that it could be reduced using probabilities. More frequently occurring information or data patterns (i.e., higher-probability items) would be encoded using short bit patterns, likely shorter than their original length. Less frequently occurring data patterns would be encoded using longer patterns, perhaps even longer than their original length. The net effect of structuring everything around a probability-ordered tree was to increase the efficiency of information transfer (e.g., to compress information sent over a modem); see the second sketch after this list.
  • A channel (such as a data pipe into the home, or, by analogy, a blog reader’s maximum ability to process information) has to have at least as much capacity as the entropy rate of the source if the information is to get through intact.
  • No message can be compressed beyond the entropy bound without distortion (i.e., data loss).
  • But a new field grew out of this area, called rate distortion theory. Rate distortion theory lets people compress (and hence process) information even further, below the entropy bound. The catch is that one needs to control where the distortion occurs. For example, much of the music contained on an iPod is not an exact reproduction of the bits and bytes on the original compact disc. At the risk of trivializing the process of getting that information onto an iPod, data has been thrown out or trimmed from the song. That said, the audio folks use knowledge of the ear, hearing, music reproduction, etc. to shape the noise and distortion into areas where people don’t care (e.g., in some audio processing systems, perhaps above the 15 kHz frequency where certain cymbal overtones occur but where the average person can’t hear the imperfections). A toy version appears in the third sketch after this list.
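
To make the entropy bound concrete, here is a minimal sketch (in Python, which I am using purely for illustration since Shannon’s work obviously predates it) that computes the Shannon entropy of a short message from its symbol frequencies. The number it prints is the theoretical floor, in bits per symbol, below which no lossless encoding can go; the message itself is just a made-up example.

```python
from collections import Counter
from math import log2

def shannon_entropy(message: str) -> float:
    """Average information content in bits per symbol:
    H = -sum(p * log2(p)) over the empirical symbol probabilities."""
    counts = Counter(message)
    total = len(message)
    return -sum((n / total) * log2(n / total) for n in counts.values())

message = "markets are conversations"
h = shannon_entropy(message)
print(f"entropy: {h:.3f} bits/symbol")
print(f"lossless floor: about {h * len(message):.0f} bits for this {len(message)}-symbol message")
```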
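
And here is a sketch of Huffman’s idea itself: repeatedly merge the two least probable symbols into a subtree, so that frequently occurring symbols end up near the root with short codes and rare ones end up deep with long codes. This is an illustrative toy, not a production encoder.

```python
import heapq
from collections import Counter

def huffman_codes(message: str) -> dict[str, str]:
    """Assign short bit patterns to frequent symbols, long ones to rare symbols."""
    # Heap entries are (frequency, tiebreaker, {symbol: code-so-far});
    # the tiebreaker keeps the dicts from ever being compared.
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(Counter(message).items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # least frequent subtree
        f2, _, right = heapq.heappop(heap)  # next least frequent
        # Merging prepends one more bit to every code in both subtrees,
        # so symbols merged late (the frequent ones) keep short codes.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

for sym, code in sorted(huffman_codes("markets are conversations").items(),
                        key=lambda kv: len(kv[1])):
    print(repr(sym), code)
```

For a message like this, the average code length lands within a fraction of a bit of the entropy figure from the previous sketch, which is exactly the efficiency gain Huffman was after.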
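
Finally, a toy version of the rate distortion trade the audio folks make, sketched with NumPy: transform a signal, throw away the components listeners are least likely to notice, and accept a measurable (but hopefully imperceptible) error. The signal and the 15 kHz cutoff here are made-up stand-ins; real perceptual coders are far more sophisticated.

```python
import numpy as np

rate = 44100  # samples per second, CD-style
t = np.arange(rate) / rate

# A made-up "song": a 440 Hz tone plus a faint 16 kHz cymbal-like overtone.
signal = np.sin(2 * np.pi * 440 * t) + 0.05 * np.sin(2 * np.pi * 16000 * t)

# Lossy step: zero out everything above 15 kHz in the frequency domain,
# shaping the distortion into a band most listeners won't miss.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / rate)
spectrum[freqs > 15000] = 0
reconstructed = np.fft.irfft(spectrum, n=len(signal))

# The reconstruction is no longer bit-exact; that difference is the distortion.
rms_error = np.sqrt(np.mean((signal - reconstructed) ** 2))
rms_signal = np.sqrt(np.mean(signal ** 2))
print(f"rms distortion: {rms_error:.4f} (vs. signal rms {rms_signal:.4f})")
```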

The blogosphere faces similar challenges in terms of capacity limitations, means for improving efficiency, and information distortion.

Feedreaders can speed the process of reading blogs. I’m guessing my efficiency in reading news went up 50%+ after shifting to a feedreader.

At some point though, I ran out of channel bandwidth using this method. What to do then?

Well, as folks like Andrew have pointed out (as does Jeff Nolan in his post), people do a lot of linking to the same posts. That makes a good case for unsubscribing from A-list blogs, because you get the same information everywhere anyway. But I suppose that reducing information by unsubscribing may create some distortion of its own: one is no longer getting info straight from the horse’s mouth, so to speak.

Other methods, such as using web services to aggregate tags (as recently contracted by Nivi), are also a way to reduce information. But I suppose this method could create distortion too, in that one ends up seeking information that tends to match what one has always sought out in the past.

Another area of interest is blog communities (as I mentioned before here in the context of BusinessWeek’s b-school community developed by 21Publish); the dialogue within these types of structures seems different to me.

I do not know the answer to the general question of how to reduce information overload most effectively (I have resorted to dropping feeds and adding new ones). But I suspect the blogosphere (and the web in general) does not pay enough attention to the holes and gaps in information (e.g., could we create maps of the worlds readers are missing?). We also tend not to pay enough attention to the distortions created by circular linking, information reduction, reinforcing lists, etc. These are perhaps not mainstream concerns in the blogosphere yet, but I know these kinds of pitfalls have to be avoided in business in general. Why should use of the blogosphere be any different?


Update (11/1/05): Bill Burnham has a post on Feed Overload Syndrome.