On Statistics

I work with statisticians all the time, and most of them are truly brilliant. But very often they trust statistics far too much and narrow their focus so tightly that they fail to see the bigger picture. The more they specialize in their field, the more they tend to overvalue their equations and formulas, often to their detriment.

Statistics is like an open highway, which can take you quickly and easily to all the right destinations you might have in mind. But it can just as easily take you to all the wrong ones, far, far out of your way. All you have to do is take the wrong exit, and you'll end up in the wrong place. It's not like the highway is going to warn you! You're the driver, and it is completely up to you to decide which exit to take and which to skip in order to arrive where you're heading. So it is with statistics. The formulas are all there, ready to be used at any time. The formula for the mean, or the variance, or whatever, will always work, in that it will always spit out a number. But just because you got handed an output doesn't mean this output is useful, or even meaningful.
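
To make that last point concrete, here is a minimal sketch using Python's standard `statistics` module. The jersey numbers are made-up data of my own, chosen because they are labels rather than quantities. The formulas run without complaint anyway:

```python
import statistics

# Hypothetical data: jersey numbers are labels, not quantities.
jersey_numbers = [7, 10, 23, 99]

average = statistics.mean(jersey_numbers)
spread = statistics.variance(jersey_numbers)

# Both calls succeed and return numbers, but neither number describes
# anything real: no player wears "the average jersey".
print(f"mean = {average}, variance = {spread}")
```

Nothing in the library warns you that the question was a bad one. Like the highway, it just takes you wherever you steer.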

This simple fact is so often overlooked by otherwise very smart people that I decided to make a list of eye-opening examples where correct statistics lead you to incorrect conclusions. So, without further ado:

My growing list of examples where measures of central tendency are misleading:

  1. If you didn't know how to swim, and I told you that the wide river ahead of you is only knee-deep on average, would you cross? You'd be stupid to cross! The average, in this case, is a completely meaningless measure. What I mean by 'meaningless' here is that it has no bearing on the matter at hand. What you really need to know is the maximum depth, not the average depth. Therein lies the difference between life and death, and therefore the difference between meaningful and meaningless.
  2. The most common atom in this universe (the mode, since atoms are categories and have no true median) is a hydrogen atom. That is to say, the typical picture of the universe is just a bunch of hydrogen atoms sprinkled across vast amounts of space. That is an accurate description of the universe, but what a boring account it is! It hides all the intricate detail of elegant materials, crystals, rocks, metals, geological formations, living tissues and elaborate chemistry taking place in very small pockets of the universe, such as this tiny planet we inhabit. By relying on summary statistics here, an observer would totally miss the most interesting parts of the universe (the extreme outliers) and narrow their focus to the boring, meaningless, lifeless mass.
  3. An alien statistician, trying to report on life on Earth, would correctly observe that the modal living organism on Earth is a simple single-celled bacterium swimming on the surface of the ocean, trapping sunlight and manufacturing energy. That would be an indisputable conclusion, as there are many more of these life forms than of all others combined, but what a colossal understatement of the marvelous diversity of life on Earth, complete with elaborate creatures swimming, jumping, gliding, galloping, snaking, flying, and burrowing their way through their respective habitats! To appreciate the sheer complexity and diversity of the phenomenon of life on Earth, one needs to ignore the mode and jump directly to the smallest minority of life forms, living on the outer fringe of the overall statistical distribution.
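
The river example from the list above is easy to sketch in Python. The depth soundings, the `safe_to_wade` helper, and the one-meter safety threshold are all hypothetical, invented here purely for illustration:

```python
import statistics

# Hypothetical depth soundings across the river, in meters (made-up values).
depths_m = [0.2, 0.3, 0.3, 0.2, 2.4, 0.3, 0.2, 0.1]


def safe_to_wade(depths, max_safe_depth):
    """A non-swimmer is safe only if the deepest point is within their depth."""
    return max(depths) <= max_safe_depth


mean_depth = statistics.mean(depths_m)
deepest = max(depths_m)

print(f"mean depth: {mean_depth:.2f} m, deepest point: {deepest:.2f} m")
print("safe to wade:", safe_to_wade(depths_m, max_safe_depth=1.0))
```

The mean comes out at roughly knee height, yet one sounding is well over head height. The decision function never looks at the mean at all, which is the whole point: the statistic that matters depends on the question, not on which formula is most familiar.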
