Everyday entropy: the lighthouse


What is the difference between the lighthouse and its light?  What Claude Shannon nailed down with his mathematical theory of information was that each bit of a message is the answer to a question, a reduction of the total uncertainty of the information system.  The anxiety of a writer in front of the blank page is not that it is free of information, but that the blank page contains every possible question that the language could possibly ask, and with each word the writer begins to whittle them down, until the page full of questions is reduced to a string of answers.  Each spoken word answers all but one of the possible questions as a no, with that word as a yes.  As a sentence evolves, the number of possible word choices dwindles or rises depending on what has come before.

You may protest that language is more nuanced than that, but you are probably thinking of the meaning you take from the message, not the information transmitted.  In essence, the Shannon measure of information is the number of yes/no questions you would need to ask in order to accurately reproduce any given message.  Shannon called it entropy because it is so similar to the entropy of a physical system, which measures (as a logarithm) the number of microstates that could make up the observed macrostate.  But what I mean to discuss here is the essential difference between them: a yes/no question has only two possible outcomes, while physical entropy is fluid and infinitesimally variable.  This is the difference between the light and its house, the answer to a single question versus the massive conglomeration of microscopic particles stacked against the motion of the wind and waves.  Physical entropy is nuanced like the meaning you take from the words you hear.  There is no measure of meaning.
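
The counting of yes/no questions can be sketched in a few lines of Python. This is only an illustration, assuming every outcome is equally likely unless stated otherwise; the function name `shannon_entropy_bits` is made up for this example and is not from Shannon or from any library.

```python
import math

def shannon_entropy_bits(probabilities):
    """Average number of yes/no questions (bits) needed to pin down one outcome."""
    # Standard Shannon formula: H = -sum(p * log2(p)), skipping impossible outcomes.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# A fair coin: two equally likely outcomes, one question settles it.
fair_coin = shannon_entropy_bits([0.5, 0.5])

# A choice among eight equally likely words: three questions, each halving the field.
eight_words = shannon_entropy_bits([1 / 8] * 8)

print(fair_coin, eight_words)
```

With eight equally likely words, each question halves the remaining candidates, which is why three questions are enough.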

It is important to understand the difference between entropy and information because they define the difference between the physical universe on the one hand and the virtual universe of memory, dreams, language, books and the interconnected network of information technologies on the other hand.

You don’t have to understand the natural logarithm or its base e to imagine the difference between physical entropy and the Shannon measure of information.  Let us say you drop a cup of water on the floor.  Physical entropy looks like the probability that any given place on the floor will be wet.  Places near the landing site are more likely to be wet than places further away, but the actual distribution is continuously variable, with an infinite number of possible combinations of wet and dry places.  Now imagine flipping a coin; the measure of information is the probability of one side or the other being up.  This result is also unpredictable, but it only has two possible outcomes (assuming we don’t care about the physical entropy of the coin, i.e., where it ended up).  A flipped coin can make a decision or answer a question; a puddle of water can’t.  And yet, whether the coin is heads up or down tells you nothing true about the coin itself, whereas the distribution of the puddle is the infinitesimally true position of the water.

What happens between the yes and the no of the nth bit?  The puddle’s boundary exists infinitesimally at all scales all the way down to the subatomic probability density distribution, but the coin toss is meaningless if it sits on its edge.  “Maybe” transmits no information.
The difference between the log base 2 of Shannon entropy and the natural logarithm base e (2.718…) of physical entropy seems quite small, but the shape of the curve they produce is different.  In particular, the slope of the natural logarithm curve at the point where it crosses zero (at x = 1) is exactly 1.  That is to say, it is perfectly balanced between up and over.  The log base 2 has a different slope here and everywhere else.  Because the slope determines the evolution of the system from one moment to the next, the difference between the slope of entropy and the slope of the measure of information cannot be emphasised enough.
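
The two slopes can be checked directly. The sketch below uses the standard calculus result that the slope of log base b at x is 1 / (x · ln b); the helper name `slope_of_log` is invented for this example.

```python
import math

def slope_of_log(x, base):
    """Slope of log_base(x) at x, using d/dx log_b(x) = 1 / (x * ln(b))."""
    return 1.0 / (x * math.log(base))

# Both curves cross zero at x = 1; compare their slopes there.
natural_slope = slope_of_log(1.0, math.e)  # natural logarithm
binary_slope = slope_of_log(1.0, 2.0)      # log base 2

print(natural_slope, binary_slope)
```

At x = 1 the natural logarithm has slope 1, while the log base 2 is steeper, about 1.44. The formula shows that the binary slope is larger by the same factor, 1 / ln 2, at every other x as well.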
Physical entropy is based on the natural logarithm instead of the base 2 log of information because, unlike binary data, a physical particle can go any which way at any speed, so the absolute freedom available to a real particle is not limited to up/down, right/left, front/back, or yes/no, but it can take on a blend of all of its degrees of freedom.  This fluid freedom is represented by e, which you may recall as the limit of compound interest.
If you are not familiar with e, imagine the water spill. If you press down and spread it one way, maybe you could double the wetted area. Now start over. If you spread it half as far left, then it will cover 1.5 times its original area. Now spread it downward to increase its cover again by half of 1.5.  The total increase will be 0.5 plus 0.75, so in total you’ll have 2.25 times the original spill. If you do the same with three spreads, each yielding a 33% increase, you end up with 2.37 times the original. Four spreads yield 2.44, and twelve spreads yield 2.61. The limit of the incrementally smaller but more frequent spreading is e, about 2.718, and it applies to every fluid phenomenon where materials have freedom to move in more than one direction. It is the limit of distribution.
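
The spreading arithmetic is the familiar compound-growth expression (1 + 1/n)^n, and it is easy to watch it creep toward e. A minimal sketch, with `total_after_spreads` as a made-up name for this illustration:

```python
import math

def total_after_spreads(n):
    """Area relative to the original spill after n spreads, each adding 1/n of the current area."""
    return (1 + 1 / n) ** n

# More, smaller spreads: the total keeps growing but never passes e.
for n in (1, 2, 3, 4, 12, 1000, 1_000_000):
    print(n, total_after_spreads(n))

print("e =", math.e)
```

Two spreads give exactly 2.25, and a million spreads agree with e only to about five decimal places, which gives a feel for how slowly the limit is approached.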
The probability density for any given particle averaged continuously over the total range of freedom ends up looking just like the spread water.  (This may be better represented by Bernoulli trials, but the result is the same.  e is a strange and beautiful idea that is not quite a number and shows up almost everywhere.)  Information uses the log base 2 (binary log) because information in transmission can only be the answer to a string of yes/no questions, or bits with only two possibilities (on/off, yes/no, open/closed), and it can only be transmitted in whole units.  Even vision and hearing are based on inputs from individual nerve cells capable of only one yes/no answer to the question: were you triggered?  So, while physical entropy can increase infinitesimally through the evolution of the system, information can only increase in bits: whole quantised units.  No matter how many bits you use to describe a system, that last bit will miss the infinitesimal variability of true entropy.
In normal conversation, we mostly mix and match abstract and real information in a continuous call and response, and usually the distinction doesn’t matter.  The abstract information about a bowling ball and the true information of the ball are pretty similar and feel the same unless the bowling ball hits you in the face.  At which point the two informations feel totally different.  That seems obvious, but this happens all the time.  People build up elaborate systems of abstract information that are perfectly known and under control until reality breaks down the fourth wall and dumps a big pile of shit in their laps.
But why does the shape of the entropies matter?  Physical reactions produce infinitesimal transformations in themselves and their surroundings that cannot be represented in the incremental changes in an information system.   The longer an information system exists in isolation from the physical system it is trying to represent or manage, the further the transformations will diverge from one another.  This is why science will never be settled or finished, no matter how complete its knowledge becomes.  Some part of the confusion we experience as sentient beings may come from the slight difference between the form of the information bouncing around in our heads and the form of true information in objects.  Obviously, this is speculative.

The difference between the measure of information and the entropy of matter in space has a physical analog in the difference between photons and massive particles like electrons and protons.  A photon, like a yes/no question, is always either emitted or not emitted by the transmitter and it is always either absorbed or not absorbed by the receiver.  A photon can only be a binary phenomenon, and its existence is measurable without uncertainty.  It is analogous to a unit of information that is either transmitted or not, and upon receipt it ceases to exist.  Electrons don’t carry around a bag of photons that they fill and empty.  The photon only exists in transit, so it cannot be captured or held for observation.

On the other hand, a massive particle has infinitesimally variable freedom.  Its momentum and location can be approximated through the application of constraints, but in order to know its location you must apply constraints that make its momentum uncertain, and vice-versa.  Its momentum and location are real, but Heisenberg uncertainty means that you can never know both of them simultaneously.  So you can capture and observe certain aspects of a massive particle, but you cannot know for certain what you’ve captured without changing it into something else.

We usually deal with such large agglomerations of massive particles that the average location and momentum provide a really good “picture” of the group, whether it’s a bowling ball or a cloud of steam, and we tend to think of this average as both ontologically real and epistemologically knowable at the same time.  But there is a difference between this mashup of averages of location, momentum, and visible light that makes up our understanding of the bowling ball on the one hand, and, on the other, the two distinct concepts of physical entropy (momentum and location) and the Shannon measure of information in transmission (electromagnetism).

It is tempting to ask why information entropy doesn’t just use e as well for consistency, but it can’t. An infinitesimal can’t exist in information. The smallest unit of information that can be transmitted is one bit or one photon. No fractions. No incremental adjustments. Anything less than one bit is simply unclear. Information has no slope at zero and no integration from zero to one. But the difference between physical and informational entropy extends through to infinity. The slope is always different. What this means is that no information system can maintain an accurate model of physical entropy for more than a moment without corrections. We can communicate the concept of entropy with information, but not model it. This is why nobody believed the climate models until the ice actually started melting off the poles, and why computer animation is clunky unless it’s based on motion-capture from real people.  Energy can equally take the form of momentum or volume, and entropy is not biased one way or the other, which is why fluid pressure drops when the particles accelerate (over an airplane wing, for example): the energy of acceleration is taken from the fluid’s volume, thereby reducing its pressure.  Moreover, while e can be approximated to a large number of digits, it can never be calculated precisely.  So, you could use all of the power in a supercomputer calculating e for a few parts of your physical model and still not have it right.

And what about the wavelength of the photon, which is continuously variable? No information can be transmitted unless the transmitter and receiver are resonant, at least harmonically, and all harmonics are whole numbers. Radio receivers avoid being overwhelmed by all of the random radiation in the world because their reception is tuned to the wavelength of a known transmission signal. You can modulate wavelength to move the signal into or out of resonance with a given receiver, and any given device may be tuneable or have many receivers, like the eye or the ear, but each receiver still only gets one bit of information when the transmission frequency is in resonance.
Resonance and harmony give physical things the appearance of information, because all resonance is unitary and all harmonics are based on whole numbers.  Physical resonance can be remembered as information, stored in another form, but the resonance itself, along with all of its harmonies, will decay entropically without continual maintenance.  Even the greatest pendulums must be reset occasionally.  But resonance and entropy bind information to physical reality in a way that is not at all trivial, and they are indistinguishable at the boundary between infinitesimal freedom and the first harmonic.
You may ask why we can’t just plug e into the information systems and chug away, but you have to remember that e is an irrational number.  For a continuously evolving system, it is not enough to insert the symbol e and then solve for it at the end of the process.  Each evolution, that is, every part at every moment, needs e to be fully resolved for the calculation of its entropy.  But e has an infinite number of digits.  You can cut corners and only resolve e down to, say, ten digits, but this introduces an error, which grows in every moment.  No matter how fast your processors and how big your memory, you cannot calculate the evolution of reality.  This is why it is so important to listen to people with whom you disagree.  Even if they are curmudgeons.
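
A toy calculation shows how a ten-digit shortcut compounds. This sketch assumes a deliberately simplified model in which each step of the evolution multiplies the state by e; that model, the constant `E_TRUNCATED`, and the function `relative_error_after` are all illustrative assumptions, not a claim about how any real simulation works. Note that `math.e` is itself only a 64-bit approximation of e.

```python
import math

E_TRUNCATED = 2.718281828  # e cut off at ten significant digits

def relative_error_after(steps):
    """Relative gap between a state grown with truncated e and one grown with math.e.

    Assumed toy model: every step multiplies the state by e, so after `steps`
    steps the two results differ by a factor of (E_TRUNCATED / math.e) ** steps.
    """
    return abs((E_TRUNCATED / math.e) ** steps - 1.0)

# The gap starts far below anything noticeable and grows with every step.
for steps in (1, 1_000, 1_000_000, 1_000_000_000):
    print(steps, relative_error_after(steps))
```

In this toy model the error after a single step is under one part in a billion, but it grows roughly in proportion to the number of steps, so a long enough run eventually drifts visibly from the reference value.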
This is why taxonomies of real things are never quite right, either in describing the differences between real things or their relationships.  Physical things don’t relate to one another in the same way that informational bits relate to one another.  Boolean algebra works for the information, not for the reality.  This is also why artificial intelligence is more annoying than helpful (although it appears to work well in an information system isolated from reality, like, say, finance, but these systems are prone to sudden, catastrophic corrections when the insulation breaks down).  The challenge of intelligence is not more and better calculations.  The challenge, now that we have some sense of information awareness, is understanding the inherent differences between the measure of information and the entropy of physical reality.

The point is not that science is at a dead end, but that science is never finished.  No matter how objective and perfect the axioms and algorithms, the information that can be transmitted deviates continuously from reality.  Every information system, whether artificial or human, is in a constant process of going mad.  This is why the debacle of Microsoft’s Tay AI chatbot was so predictable, and why social media hasn’t engendered an informed public in the Middle East or in the United States. Ordinary madness requires continuous correction, while pathological madness can only be treated as a natural affliction of reality.  The best laid plans of mice and men have a different shape and evolution from the reality they confront.


