Few people understand the concept of randomness and percentages in statistics.

For example:

Some people complain that iTunes shuffle is not random because it seems to play certain groups more often, even though Apple has stated multiple times that it is indeed random.

If you play online games like World of Warcraft, there is always somebody in a group who says a roll is not random because the same number shows up two or three times in a row.

What most people don't realize is that random does not mean you won't be able to find a pattern in it. It also does not mean that the outcomes will be evenly spread out in any given sample. This was demonstrated on a 60 Minutes show a couple of years ago. They had a statistics professor come on the show and talk about his basic stats class. In the class he asked all the students to either:

A) flip a coin 50 times and mark down the results (heads or tails) each time

or

B) make it up and don't actually flip the coin

He could always tell the difference between the two groups right away because the people who actually flipped would end up with stretches of 4-6 heads or tails in a row while the people making it up wouldn't. People in general don't realize that long stretches of the same outcome are very likely in 50 flips. What people expect from randomness is a nice even distribution of heads and tails (or songs or hits).

I flipped a coin 50 times and this is what I got:

HHHTTTHHHHTHHTTTTHTHHHHTHHTHTHTTTHTHTTTTTTHHHTTHTT

24 heads, 26 tails, one series of 6 tails in a row, and multiple series of 4 in a row.
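For anyone who wants to check the tally above or try this without a coin, here is a minimal Python sketch. The helper names (`longest_run`, `share_with_long_run`) are made up for illustration, and the simulation assumes a fair coin with independent flips.

```python
from itertools import groupby
import random

def longest_run(seq):
    """Length of the longest stretch of identical consecutive outcomes."""
    return max(len(list(group)) for _, group in groupby(seq))

# The 50 flips posted above
flips = "HHHTTTHHHHTHHTTTTHTHHHHTHHTHTHTTTHTHTTTTTTHHHTTHTT"
print(flips.count("H"), flips.count("T"), longest_run(flips))  # 24 26 6

def share_with_long_run(n_flips=50, min_run=5, trials=10_000, seed=1):
    """Estimate how often n_flips fair flips contain a run of at least min_run."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so reruns match
    count = 0
    for _ in range(trials):
        seq = [rng.choice("HT") for _ in range(n_flips)]
        if longest_run(seq) >= min_run:
            count += 1
    return count / trials

# Share of simulated 50-flip sequences with 5 or more of the same side in a row
print(share_with_long_run())
```

The estimate should come out at roughly 0.8, which is why a made-up sequence with no long runs stands out.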

Now let's take this principle to baseball. Let's say I have a .333 hitter, so on average he gets a hit 1 out of every 3 at bats. Now let's take a 12-sided die (because there are no three-sided dice, and yes, I'm an old D&D geek). If I roll 1-4 it's a hit, and if I roll 5-12 it's an out. I'll roll it 50 times.

Here is what I got (H=hit, O=out):

OOOHHHOOHOHHHOOHOOOOOOOHOOOOOOHOOHOOHOOHOOHOOHHOHO

17 hits in 50 rolls (slightly better than average), but also a streak of 14 rolls with only one hit!
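The same experiment can be sketched in Python instead of with a die. The function names here (`min_hits_in_window`, `slump_rate`) are hypothetical, and the model assumes every at bat is an independent 1-in-3 chance, which is exactly the die-rolling setup above and not a claim about real hitters.

```python
import random

def min_hits_in_window(seq, window=14):
    """Fewest hits ('H') found in any stretch of `window` consecutive at bats."""
    return min(seq[i:i + window].count("H")
               for i in range(len(seq) - window + 1))

# The 50 rolls posted above
rolls = "OOOHHHOOHOHHHOOHOOOOOOOHOOOOOOHOOHOOHOOHOOHOOHHOHO"
print(rolls.count("H"), min_hits_in_window(rolls))  # 17 1

def slump_rate(avg=1/3, at_bats=50, window=14, max_hits=1,
               trials=10_000, seed=1):
    """Estimate how often a hitter with a fixed per-at-bat hit chance `avg`
    has some `window`-at-bat stretch with `max_hits` or fewer hits."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so reruns match
    slumps = 0
    for _ in range(trials):
        seq = "".join("H" if rng.random() < avg else "O"
                      for _ in range(at_bats))
        if min_hits_in_window(seq, window) <= max_hits:
            slumps += 1
    return slumps / trials

# Share of simulated 50-at-bat samples containing a 1-for-14 (or worse) stretch
print(slump_rate())
```

Changing `at_bats` to a full season's worth shows how much more often such a stretch turns up somewhere along the way.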

Now imagine what people would have said about this hitter during that 1-for-14 span. You would have seen all sorts of threads taking apart his stance and swing, who is hitting in front of and behind him, whether he needs a day off, should change positions, swing more, swing less, etc.

Guess what... a .333 hitter will go through periods of not getting a lot of hits, and this is just in 50 at bats.

This is why you can't take too much from observation and small sample sizes. The mind wants to find patterns in everything it experiences. It is not well equipped to see randomness.

People expect that if something is random then it is evenly distributed, and this is just not the case. So as our favorite team gets hot and cold, remember that even the best hitters will go long stretches without hits and the best teams can go long stretches without winning (a .500 team is the same as the coin toss above).

(BTW, the purpose of the 60 Minutes show was to demonstrate how the IRS catches cheats, because people are bad at making up random numbers on their tax returns.)

Hope people found this interesting... back to what I should be doing...

3. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

One night when I was over at Ravenlord's house visiting him and his brother, along with another friend of ours, I was playing with a pair of dice. I rolled and got 7. I rolled two more times and got 7 both times. I think we went up to some ludicrous number like 19-20 consecutive rolls, and each time I rolled I ended up with 7.

4. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

Originally Posted by savafan
One night when I was over at Ravenlord's house visiting him and his brother, along with another friend of ours, I was playing with a pair of dice. I rolled and got 7. I rolled two more times and got 7 both times. I think we went up to some ludicrous number like 19-20 consecutive rolls, and each time I rolled I ended up with 7.
Would have been a good time to hit the Craps table!

(Of course, six of the 36 combinations on two dice add up to seven, so a seven on any single roll is not that unheard of.)
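To put rough numbers on that, here is a small Python sketch that counts the combinations and then works out the chance of a streak. It assumes two fair six-sided dice and independent rolls. The streak length of 19 is taken from the story above.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (die1, die2) outcomes for two fair six-sided dice
outcomes = list(product(range(1, 7), repeat=2))
sevens = [pair for pair in outcomes if sum(pair) == 7]

p_seven = Fraction(len(sevens), len(outcomes))
print(len(sevens), p_seven)  # 6 1/6

# Chance of 19 sevens in a row, assuming each roll is independent
p_streak = p_seven ** 19
print(float(p_streak))  # on the order of 1.6e-15
```

So a single seven is the most likely total, but 19 of them in a row with fair dice would be astronomically unlikely.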

5. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

I hit blackjack 4 times in a row playing 100 a hand one night. I will never gamble again.

I will never lie again, either.

6. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

The most common problem people have with statistics is that they look at a short series of events and say it goes against the norm. Take a series of 10 flips of a coin, and say you get 8 heads and 2 tails. Some people would call that an oddity. In reality, it's because they want to relate a series of events that are independent of each other. Each time you flip a coin, there is a 50 percent chance of getting heads and a 50 percent chance of getting tails. The next time you flip the coin, the percentages stay the same regardless of the outcome of the last flip, or series of flips. That's why streaks can happen. By the same token, the larger the sample size, the closer the share of heads gets to 50 percent. Now this is a nice cut-and-dried example with no outside factors like there are in baseball (pitchers faced, situation, weather, and so on). In baseball there are events that can be related to the success or failure to get a hit in each individual at bat. When a human becomes a factor, statistics are not as reliable, although I guess you could say that a person could not possibly flip a coin the same way every time either. But in both of these cases, large sample sizes are the only way to make those errors less of a factor.

To sum it up, and pretty much repeat what GullyFoyle said, statistics are only useful for predicting the overall outcome in the long term. If you use them to predict a single event, the chances of being right are severely decreased.
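Both halves of that argument can be checked with a short Python sketch. It assumes a fair coin and independent flips, and `prob_at_least` and `heads_share` are names invented here for illustration.

```python
from math import comb
import random

def prob_at_least(k, n, p=0.5):
    """Exact binomial probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 8 or more heads in 10 flips
print(prob_at_least(8, 10))      # 0.0546875
# 8 or more of either side (heads or tails) in 10 flips
print(2 * prob_at_least(8, 10))  # 0.109375

def heads_share(n_flips, seed=1):
    """Fraction of heads in n_flips simulated fair flips."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so reruns match
    return sum(rng.random() < 0.5 for _ in range(n_flips)) / n_flips

# The share of heads drifts toward 0.5 as the sample grows
for n in (10, 100, 10_000, 100_000):
    print(n, heads_share(n))
```

An 8-2 split one way or the other turns up in about one of every nine sets of 10 flips, so it is hardly an oddity. Note that it is the share of heads that settles near 50 percent. The raw gap between the heads and tails counts does not have to shrink.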

7. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

Originally Posted by GullyFoyle
Few people understand the concept of randomness and percentages in statistics.
...
Now imagine what people would have said about this hitter during that 1-for-14 span.
...

GL

8. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

Originally Posted by gonelong

GL
I second that. Very, very good points, and a clear message.

9. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

Thanks for posting this.

10. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

Let's face it. Most people (myself included at times) find in-depth and abstract statistics a complete bore.

It's much easier to sit back, swill your beer, and say "Frickin LaRue, 1 for 14."

Most real baseball people (and even some fans) are much more patient and understanding when it comes to short-term vs. long-term trends.

I can't remember where I read this, but there was a study done and the results showed that luck was responsible for 4 runs per game while talent was responsible for 1 run per game. Over a 162 game season, luck evens out and talent shines through. The same can be said for individual talent, I would guess.

People just aren't that patient (me included).

11. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

Originally Posted by gonelong

GL
^^^ What GL said.

Excellent post, Gully.

12. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

Originally Posted by dabvu2498
I can't remember where I read this, but there was a study done and the results showed that luck was responsible for 4 runs per game while talent was responsible for 1 run per game. Over a 162 game season, luck evens out and talent shines through. The same can be said for individual talent, I would guess.
Along the same lines, pretty much every team is going to win 60 games and lose 60 games. It's what you do with the other 40 that counts.

13. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

Originally Posted by dabvu2498
Let's face it. Most people (myself included at times) find in-depth and abstract statistics a complete bore.

It's much easier to sit back, swill your beer, and say "Frickin LaRue, 1 for 14."
....

People just aren't that patient (me included).
Very true, and I'm as impatient as the next person.

It seems a lot of the most heated arguments on the board come from people not realizing their impatience or their own subjectivity...

But that never happens anywhere else either /sarcasm :P

14. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

Great post. This is why "playing the percentages" bugs me in so far as how it's typically used. Let's pretend our manager is named Lony RaTussa, and we have player A and player B.

Player A
Season line: .280/.350/.470
Career line vs. Pitcher Z: 5-24, 1 2B, 6 Ks

Player B
Season line: .250/.310/.390
Career line vs. Pitcher Z: 4-9, 1 HR, 5 RBI

It seems like every game Mr. RaTussa coaches we get treated to some crap about how player B really hits Pitcher Z well and that's why he's getting the start.

Baseball gives managers too much opportunity to tweak and "play the percentages". As such, they are almost unanimously overreacting to small sample sizes and failing to let things play out.

Of course, all this is horribly complicated by the fact that players are not static entities with fixed outcome percentages (like dice). Sometimes that .333 hitter really IS playing like crap and his 1-14 streak is indicative not of a random streak, but of a change in his "true" ability. What separates the good managers from the bad ones is the ability to filter out the randomness.
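A rough way to see how little those head-to-head lines say is the Python sketch below. It treats each player's season batting average as a fixed, independent per-at-bat hit chance against Pitcher Z. That is a simplifying assumption for illustration only, and the helper names are made up.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k hits in n at bats with per-at-bat hit chance p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_at_least(k, n, p):
    """Probability of k or more hits in n at bats."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def prob_at_most(k, n, p):
    """Probability of k or fewer hits in n at bats."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

# Player B: .250 season average, 4-for-9 against Pitcher Z
p_b = prob_at_least(4, 9, 0.250)

# Player A: .280 season average, 5-for-24 against Pitcher Z
p_a = prob_at_most(5, 24, 0.280)

print(round(p_b, 3), round(p_a, 3))
```

Under those assumptions, a .250 hitter goes 4-for-9 or better about one time in six, and a .280 hitter goes 5-for-24 or worse about three times in ten. Neither line is unusual enough to say much about the matchup.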

While Joe Torre has been given some of the best talent, I commend him for being willing to let things work themselves out, rather than tinker needlessly.

15. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

Originally Posted by RedsManRick
Baseball gives managers too much opportunity to tweak and "play the percentages". As such, they are almost unanimously overreacting to small sample sizes and failing to let things play out.

Of course, all this is horribly complicated by the fact that players are not static entities with fixed outcome percentages (like dice). Sometimes that .333 hitter really IS playing like crap and his 1-14 streak is indicative not of a random streak, but of a change in his "true" ability. What separates the good managers from the bad ones is the ability to filter out the randomness.
Good work! You'll notice that "playing the percentages" goes down in crunch time in late August and September, and of course in the playoffs. Then you typically saddle the horses you rode in on.

I wish Manager RaTussa could get us there.

16. ## Re: Ups and Downs of Randomness, Observation, Stats and Baseball

A different thread on static vs. ever-changing lineups touched on this idea. A player's average for success (whatever the measurement) is built over the course of many games, sometimes stretching over many, many seasons. Those stats (let's say OBP against lefties) have been created over the course of many streaks, many slumps, many lucky situations, many unlucky situations and some just plain freak occurrences.

While all those situations average out to give us that OBP, it's only a good indicator of what a player is likely to do over another similar period of time (i.e., if he's hit .312 for the past 3 years, it's likely he'll hit about .312 over the next 3 years, assuming all things remain equal).

It's not a good predictor of what that player will do in a specific matchup on a given night. There are just too many variables to consider. His rate of success against lefties over the long haul will be .312, but his odds of getting a hit against a specific pitcher in a specific at bat are not simply .312.

This is why I'd rather coaches assign roles, put players in a pseudo-static lineup (I know that there will always be minor tweaks) and let the averages play out over the course of a season instead of trying to manage the averages of each given situation, which is next to impossible.

