Distribution of Average Runs Scored per Game



PickOff
05-16-2008, 03:39 PM
I was reading the "John Eradi Gets it..." thread on ORG, which evolved (or devolved, depending on your view) into a much-visited discussion of which stats matter most when determining a player's worth.

The general consensus is that Runs Created (http://en.wikipedia.org/wiki/Runs_created) sufficiently predicts the number of runs a team (or player) will produce over the course of a year. Other variations predict runs scored slightly better, but with Runs Created already at 96-97% accuracy, the improvement is only marginal.
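For anyone who hasn't seen it written out, here is a minimal sketch of the basic Runs Created formula, (H + BB) x TB / (AB + BB). The linked article lists more refined versions; this is only the simplest one, and the season line fed into it is made up for illustration.

```python
def runs_created(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    opportunities = at_bats + walks
    if opportunities == 0:
        return 0.0  # avoid dividing by zero for an empty stat line
    return (hits + walks) * total_bases / opportunities

# Made-up season line, for illustration only (not a real player or team).
example_rc = runs_created(hits=150, walks=50, total_bases=250, at_bats=500)
print(round(example_rc, 1))
```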

OPS, Slugging, OBP, and AVG range from 85% down to 71%, with AVG last in that grouping.

The question I had was what correlates with the variance (or standard deviation) of a team's runs scored per game. This excerpt from the Hardball Times explains why the distribution of runs scored matters.

http://www.hardballtimes.com/main/article/runs-per-game/


" ...if your league averaged five runs a game, and your team scored exactly five runs in every game, it would typically have a .600 winning percentage instead of .500, even though it had scored the average number of runs. That is the power of looking at distributions instead of averages.

To further illustrate the point, here is a table of the winning percentage of teams that scored exactly the following number of runs per game from 2000 through 2004, along with the incremental impact each run scored provided on winning percentage:


RS    Win%     Diff
 0    .000
 1    .077     .077
 2    .208     .131
 3    .339     .131
 4    .471     .132
 5    .593     .122
 6    .686     .092
 7    .776     .090
 8    .840     .064
 9    .874     .034
10    .921     .047
11    .939     .018
12    .963     .025
13    .987     .024
14    .978    -.009
15    .976    -.001
16    .983     .007
17   1.000     .017

In terms of winning ballgames, the second through the fifth runs have the most impact, followed by the sixth and seventh runs, and then the first run."

The implication is clear: for a given scoring average, the lower the variance in runs scored per game, the better your team's record is likely to be. To illustrate this another way, if the NL averaged 5 runs per game in ’08, the average team would score 5 x 162 = 810 runs. If the Reds managed to average 10 runs a game and score 1620 runs in ’08, you would expect them to be world beaters with an extremely high winning percentage. If the Reds scored 0 runs in 81 games and 20 runs in the other 81, however, they would still have averaged 10 runs a game but could have at most a .500 winning percentage. This is why the distribution is so important.
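To put rough numbers on that, here is a small sketch that runs a season's worth of game scores through the Hardball Times table quoted above. It assumes each game's chance of winning depends only on the runs scored in that game, and it treats anything above 17 runs as a sure win because the table stops there. Both are my simplifications, not part of the article.

```python
# Winning percentage by runs scored (2000-2004), copied from the quoted table.
# Index = runs scored in a game.
WIN_PCT = [0.000, 0.077, 0.208, 0.339, 0.471, 0.593, 0.686, 0.776, 0.840,
           0.874, 0.921, 0.939, 0.963, 0.987, 0.978, 0.976, 0.983, 1.000]

def expected_win_pct(runs_by_game):
    """Average the table's win% over a list of per-game run totals."""
    # Assumption: games with more than 17 runs are treated like 17-run games.
    capped = [min(runs, len(WIN_PCT) - 1) for runs in runs_by_game]
    return sum(WIN_PCT[runs] for runs in capped) / len(capped)

steady = [5] * 162               # exactly 5 runs every game
streaky = [0] * 81 + [10] * 81   # same 5.0 average, all or nothing
reds = [0] * 81 + [20] * 81      # the 0-or-20 example above

print(round(expected_win_pct(steady), 3))
print(round(expected_win_pct(streaky), 3))
print(round(expected_win_pct(reds), 3))
```

The steady and streaky teams score the same 810 runs, but the table gives the steady team a clearly better expected record, and the 0-or-20 team tops out at .500 even with 1620 runs.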

Now, obviously, it would be best if a team could manage to score just one run more than the opposition in every game, and sometimes 10 or more runs would be needed to accomplish this. But the fact remains that if a team were able to always score its average, the benefit of avoiding those low-scoring games would outweigh the marginal gains from runs scored beyond the seventh.

Back to my question. Which stat(s) best predict which teams will have the lowest variation in runs scored per game? If the answer is AVG, for example, then we would need to strike a balance between RC and AVG in assessing a team’s likely performance or a player’s worth. If it were OPS, then that would give even greater support to OPS as a good metric for measuring a player’s offensive worth.

Apparently, someone at BP took a look at whether “small ball” reduced the variance of runs scored per game, but found that it didn’t. I don’t know how they were defining “small ball”, but the theory held by some is that “small ball” lets a team be more consistent than one relying on the long ball, and hence get shut out less often.

What I am going to try to do is find out whether any stat shows a significant correlation with low variance in runs scored per game. I’m sure this has been done before somewhere on this site, and elsewhere as well, so if anyone has results for this kind of thing, please post them. I would also be interested to hear your thoughts on which stats you think might correlate with more consistent run production per game.
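Here is roughly how I picture the calculation, as a sketch only. For each team, take the standard deviation of its game-by-game runs, then compute the Pearson correlation between that spread and a team stat. The team names, AVG figures, and game logs below are placeholders I made up so the code runs; they say nothing about real teams, and a real test would need full-season game logs for every team.

```python
from math import sqrt
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# PLACEHOLDER DATA: invented teams, AVG values, and six-game logs.
teams = {
    "Team A": {"avg": 0.280, "runs": [4, 5, 5, 6, 4, 5]},
    "Team B": {"avg": 0.265, "runs": [2, 8, 3, 9, 1, 7]},
    "Team C": {"avg": 0.250, "runs": [0, 12, 1, 10, 0, 9]},
    "Team D": {"avg": 0.270, "runs": [3, 6, 5, 7, 4, 5]},
}

stat = [team["avg"] for team in teams.values()]
spread = [pstdev(team["runs"]) for team in teams.values()]  # per-game std dev

r = pearson(stat, spread)
print(round(r, 2))  # negative r would mean a higher stat goes with steadier scoring
```

One thing to watch: teams that score more tend to have a larger standard deviation simply because their totals are bigger, so it may be fairer to compare standard deviation divided by average runs rather than the raw standard deviation.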