GullyFoyle

06-01-2006, 02:32 PM

Few people understand the concept of randomness and percentages in statistics.

For example:

Some people complain that iTunes shuffle is not random because it seems to play certain groups more often, even though Apple has stated multiple times that it is indeed random.

If you play online games like World of Warcraft, there is always somebody in the group who says a roll is not random because the same number shows up two or three times in a row.

What most people don't realize is that random does not mean that you won't be able to find a pattern in it. It also does not mean that the outcomes will be evenly spread out. This was demonstrated on a 60 Minutes segment a couple of years ago. They had a statistics professor come on the show and talk about his basic stats class. In the class he asked each student to either

A) flip a coin 50 times and mark down the results (heads or tails) each time

or

B) make it up and don't actually flip the coin

He could always tell the two groups apart right away, because the people who actually flipped would end up with stretches of 4-6 heads or tails in a row, while the people making it up wouldn't. People in general don't realize that long stretches of the same outcome are very likely in 50 flips. What people expect from randomness is a nice even distribution of heads and tails (or songs or hits).
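If you'd rather not take my word for it (or flip a coin a few thousand times), here is a rough Python sketch that simulates lots of sets of 50 fair flips and estimates how often the longest streak reaches a given length. The function names, the fixed seed and the number of trials are just my own choices for illustration, not anything from the professor's class.

```python
import random


def longest_run(seq):
    """Return the length of the longest run of identical items in seq."""
    best = current = 0
    previous = object()  # sentinel that never equals a real item
    for item in seq:
        current = current + 1 if item == previous else 1
        best = max(best, current)
        previous = item
    return best


def chance_of_run(min_run, flips=50, trials=20_000, seed=1):
    """Estimate the chance that `flips` fair coin flips contain a run of
    at least `min_run` identical results (heads or tails)."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    found = 0
    for _ in range(trials):
        sequence = [rng.choice("HT") for _ in range(flips)]
        if longest_run(sequence) >= min_run:
            found += 1
    return found / trials


if __name__ == "__main__":
    for k in (4, 5, 6):
        print(f"run of {k}+ in 50 flips: about {chance_of_run(k):.0%}")
```

The numbers it prints are estimates, so they will wobble a little if you change the seed or the number of trials.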

I flipped a coin 50 times and this is what I got:

HHHTTTHHHHTHHTTTTHTHHHHTHHTHTHTTTHTHTTTTTTHHHTTHTT

That's 24 heads and 26 tails, with one streak of 6 tails in a row and several streaks of 4 in a row.
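Counting streaks by eye is error-prone, so here is a small Python sketch for tallying a sequence like the one above. The helper names `streaks` and `summarize` are made up for this example; `itertools.groupby` is the standard library function that groups consecutive repeats.

```python
from itertools import groupby


def streaks(seq):
    """Split seq into (symbol, length) pairs, one per run of repeats."""
    return [(symbol, len(list(group))) for symbol, group in groupby(seq)]


def summarize(seq):
    """Count each symbol and find the longest streak in seq."""
    runs = streaks(seq)
    counts = {}
    for symbol, length in runs:
        counts[symbol] = counts.get(symbol, 0) + length
    # max() keeps the first streak if several tie for longest
    longest = max(runs, key=lambda run: run[1]) if runs else None
    return counts, longest


if __name__ == "__main__":
    flips = "HHHTTTHHHHTHHTTTTHTHHHHTHHTHTHTTTHTHTTTTTTHHHTTHTT"
    counts, longest = summarize(flips)
    print("counts:", counts)
    print("longest streak:", longest)
```

Paste in your own string of H's and T's (or made-up ones) and compare the longest streak.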

Now let's take this principle to baseball. Let's say I have a .333 hitter: he gets a hit 1 out of every 3 times. Now let's take a 12-sided die (because there are no three-sided dice, and yes, I'm an old D&D geek). If I roll 1-4, it's a hit; if I roll 5-12, it's an out. I'll roll it 50 times.

Here is what I got (H=hit, O=out):

OOOHHHOOHOHHHOOHOOOOOOOHOOOOOOHOOHOOHOOHOOHOOHHOHO

17 hits in 50 rolls (a .340 average, slightly better than expected), but also a stretch of 14 rolls with only one hit!

Now imagine what people would have said about this hitter during that 1-for-14 stretch. You would have seen all sorts of threads taking apart his stance, his swing, and who is hitting in front of and behind him, and saying he needs a day off, should change positions, should swing more, should swing less, etc.

Guess what... a .333 hitter will go through periods of not getting many hits, and this is just in 50 at bats.
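Here is a rough Python sketch of the same die-rolling experiment, for anyone who wants to see how common these cold stretches are. It rolls a 12-sided die (1-4 is a hit), then looks for the worst stretch of 14 at bats in a row. The function names, the 14 at-bat window, the seed and the trial count are all my own assumptions for illustration.

```python
import random


def simulate_at_bats(n=50, rng=random):
    """Roll a 12-sided die n times: 1-4 is a hit (H), 5-12 is an out (O)."""
    return "".join("H" if rng.randint(1, 12) <= 4 else "O" for _ in range(n))


def fewest_hits_in_window(at_bats, window=14):
    """Return the fewest hits found in any `window` consecutive at bats."""
    if len(at_bats) < window:
        raise ValueError("need at least `window` at bats")
    return min(
        at_bats[start:start + window].count("H")
        for start in range(len(at_bats) - window + 1)
    )


def chance_of_cold_stretch(max_hits=1, window=14, n=50, trials=20_000, seed=1):
    """Estimate how often n at bats contain a `window`-long stretch with
    at most `max_hits` hits."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    cold = sum(
        fewest_hits_in_window(simulate_at_bats(n, rng), window) <= max_hits
        for _ in range(trials)
    )
    return cold / trials


if __name__ == "__main__":
    share = chance_of_cold_stretch()
    print(f"1-for-14 or worse somewhere in 50 at bats: about {share:.0%}")
```

Change `n` to a full season's worth of at bats and the slumps get easier to find, not harder.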

This is why you can't read too much into casual observation and small sample sizes. The mind wants to find patterns in everything it experiences. It is not well equipped to see randomness.

People expect that if something is random, then it is evenly distributed, and this is just not the case. So as our favorite team gets hot and cold, remember that even the best hitters will go long stretches without hits and the best teams can go long stretches without winning (a .500 team is the same as the coin toss above).

(BTW, the purpose of the 60 Minutes segment was to demonstrate how the IRS catches cheats, because people are bad at making up random numbers on their tax returns.)

Hope people found this interesting... back to what I should be doing... :)

