Baseball Statistic Question



EddieMilner
03-05-2009, 04:07 PM
I have been thinking about how to better represent success during an AB. I suspect someone else has already thought of something close to this, so I was wondering if anyone knows of a stat like it.

Basically, it's a look at how successful a batter is in each plate appearance.

Each time a batter is up, there is a finite number of possible outcomes.
If no one is on base:
1. Batter creates an out
2. Batter advances to first
3. Batter advances to second
4. Batter advances to third
5. Batter advances to home

I want to give a value between 0 and 1 to each of these outcomes to show the batter's success during this time up. Therefore:
1.00 - Reached Home
0.75 - Reached Third
0.50 - Reached Second
0.25 - Reached First
0.00 - Recorded an out

So if there was a person on first and no outs the following outcomes would be possible:
1. Create Two Outs
2. Create an Out and base runner/batter stays on first
3. Create an out and base runner/batter advances to second
4. Create an out and base runner/batter moves to third
5. Create an out and base runner/batter scores
6. Create no out base runner advances to second, batter advances to first
7. Create no out base runner advances to third, batter advances to first
8. Create no out base runner advances to home, batter advances to first
9. Create no out base runner advances to third, batter advances to second
10. Create no out base runner advances to home, batter advances to second
11. Create no out base runner advances to home, batter advances to third
12. Create no out base runner advances to home, batter advances to home.

Now, in this situation, I would rank scoring runs above simply avoiding outs.

So the points would be:
1.00 - Create no out base runner advances to home, batter advances to home.
.909 - Create no out base runner advances to home, batter advances to third
.818 - Create no out base runner advances to home, batter advances to second
.727 - Create no out base runner advances to home, batter advances to first
.636 - Create an out and base runner/batter scores
.545 - Create no out base runner advances to third, batter advances to second
.455 - Create no out base runner advances to third, batter advances to first
.364 - Create no out base runner advances to second, batter advances to first
.273 - Create an out and base runner/batter moves to third
.182 - Create an out and base runner/batter advances to second
.091 - Create an Out and base runner/batter stays on first
.000 - Create Two Outs

Then you basically average the value from each at bat to create some type of success rating.
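
The two scales above are evenly spaced: with n ranked outcomes, the k-th worst gets k/(n-1). A minimal Python sketch of that idea plus the averaging step follows. The function names, the short outcome labels, and the sample plate appearances are made up for illustration; only the ranking and the resulting values come from the post.

```python
def even_scale(ranked_outcomes):
    """Spread values evenly from 0 to 1 over outcomes ranked worst to best."""
    top = len(ranked_outcomes) - 1
    return {name: i / top for i, name in enumerate(ranked_outcomes)}


def success_rating(values):
    """Average the per-plate-appearance values into one rating."""
    return sum(values) / len(values)


# Bases empty: the five outcomes from the post, worst first.
BASES_EMPTY = even_scale(["out", "first", "second", "third", "home"])

# Runner on first, no outs: the twelve outcomes from the post, worst first.
# The labels are shorthand invented for this sketch.
RUNNER_ON_FIRST = even_scale([
    "two outs",
    "out, runner stays on first",
    "out, runner to second",
    "out, runner to third",
    "no out, runner to second, batter to first",
    "no out, runner to third, batter to first",
    "no out, runner to third, batter to second",
    "out, run scores",
    "no out, runner scores, batter to first",
    "no out, runner scores, batter to second",
    "no out, runner scores, batter to third",
    "no out, runner scores, batter scores",
])

# Three made-up plate appearances, not real data.
sample = [
    BASES_EMPTY["first"],
    RUNNER_ON_FIRST["out, runner to second"],
    BASES_EMPTY["home"],
]
print(round(success_rating(sample), 3))
```

Re-ranking the list (for example, putting every no-out result above "out, run scores") changes the values without touching the rest of the code, which is the debatable part mentioned below.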

Obviously, the more runners you have on, the more possible outcomes there are, and the number of outs should come into play as well. But there is still a finite number of possibilities for any batting situation.

I understand that this folds things like the quality of the base runners into the batter's statistics; however, I think it would give a better idea of how successful a batter is every time he gets up to bat.

Different people may value not making an out higher than making an out and scoring a run, so that is definitely a debatable topic for this.

Does anyone know of an existing statistic that already does this, or something close to it?

Rusty the Red
03-05-2009, 04:41 PM
No, but you made my head hurt.

FlightRick
03-05-2009, 04:41 PM
What you're describing seems a bit like something called Linear Weights Batting Runs that I got a little bit interested in about a year ago.

The "standard" formula that I found for calculating Batting Runs included the following values:

+ 1.40 -- Home Run
+ 1.09 -- Triple
+ 0.78 -- Double
+ 0.47 -- Single
+ 0.33 -- Walk/Hit-by-Pitch
- 0.26 -- All Otherwise-undefined Batted-Ball Outs
- 0.29 -- Strike Out
- 0.72 -- Ground Into Double Play

A value of 0.00 was given to all other contingencies (sacrifices, reaching on an error, catcher's interference, etc.). Also, in some models they attempt to account for baserunning by applying a value of + 0.30 for a stolen base, and - 0.52 for a caught-stealing.
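
For anyone who wants to play with these numbers, here is a rough Python sketch that sums Batting Runs from event counts. The weights are the ones listed above; the event codes, the function name, and the sample line are invented for illustration and are not taken from any site's implementation.

```python
# Weights as quoted in this post; the event codes are made up for this sketch.
WEIGHTS = {
    "HR": 1.40,
    "3B": 1.09,
    "2B": 0.78,
    "1B": 0.47,
    "BB_HBP": 0.33,
    "OUT": -0.26,   # any otherwise-undefined batted-ball out
    "K": -0.29,
    "GIDP": -0.72,
}


def batting_runs(counts):
    """Sum weight * count; events without a listed weight count as 0.00."""
    return sum(WEIGHTS.get(event, 0.0) * n for event, n in counts.items())


# A made-up stat line, not a real player's season.
line = {"HR": 2, "1B": 10, "BB_HBP": 5, "OUT": 30, "K": 8, "SAC": 1}
print(round(batting_runs(line), 2))
```

Stolen bases and caught-stealing could be added as two more entries (+0.30 and -0.52) in the models that include them.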

These values were arrived at by people far smarter than me, with far more powerful computers than mine, and with far too much free time on their hands, and they are apparently agreed upon by most who experiment with the Batting Runs stat (the only areas of disagreement I really found were in accounting for SB/CS and in the precise value of a K vs. a standard batted-out, though the value for the K didn't vary much more than a couple of hundredths no matter what).

The Batting Runs are generally summed over the course of a season or career, though my interest (which has since fizzled, because I'm flaky/lazy like that) was in calculating a Batting-Runs-per-Plate-Appearance stat, and then trying to mess around with the standard deviations of a player's BtRns/PA to see if I could find out anything cool related to the overall production of a steady BA/OBP type player (low deviation among PA) versus a feast-or-famine/Dunn-esque player (K's and HR's resulting in a high deviation).
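
A rough sketch of that per-PA idea, assuming each plate appearance is logged as one event code and scored with the weights above. Both hitters and their ten-PA sequences are made up purely to show the mechanics; nothing here is real data.

```python
from statistics import mean, pstdev

# Same weights as in the list above; event codes are invented for this sketch.
WEIGHTS = {"HR": 1.40, "3B": 1.09, "2B": 0.78, "1B": 0.47,
           "BB_HBP": 0.33, "OUT": -0.26, "K": -0.29, "GIDP": -0.72}


def per_pa_profile(events):
    """Return (mean, standard deviation) of Batting Runs per plate appearance."""
    values = [WEIGHTS.get(e, 0.0) for e in events]
    return mean(values), pstdev(values)


# Hypothetical ten-PA samples: a contact/on-base type and an all-or-nothing type.
steady = ["1B", "OUT", "BB_HBP", "OUT", "1B", "OUT", "OUT", "1B", "OUT", "OUT"]
feast = ["HR", "K", "K", "K", "HR", "K", "K", "BB_HBP", "K", "K"]

for name, events in (("steady", steady), ("feast-or-famine", feast)):
    avg, spread = per_pa_profile(events)
    print(f"{name}: {avg:.3f} runs/PA, std dev {spread:.3f}")
```

Two hitters can land on similar runs-per-PA averages while showing very different spreads, which is the comparison described above.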

Maybe I'll get that bug up my ass again this year at some point. Or maybe not. But in the interim, I think Batting Runs is a lot like what you describe, and versions of it are available on a few different sites. I'm pretty sure Baseball-Reference has it on each player's page.


Rick

JBChance
03-06-2009, 01:24 AM
What you're describing seems to rate a batter by things he can't control, such as the base runner's ability to advance or the opposing team's ability to throw out runners.

Take these two ratings:

.818 - Create no out base runner advances to home, batter advances to second
.545 - Create no out base runner advances to third, batter advances to second

The difference between these two ratings is large, but it really isn't dependent on the batter's ability. It is based more on the base runner's ability.

As a batter, you can only control what you do: make an out, hit, or walk. I can understand why you might try to evaluate a batter on what he does in certain game situations, but I'm not sure it would be accurate, and the stats would vary greatly from one player to the next depending on who's on base in front of him. Players who hit on a high-OBP team would have much higher ratings than hitters on a low-OBP team, such as ours. What would be, for instance, the difference in rating between driving a guy in from second as opposed to driving in two guys from second and third? Would there be a difference?

I see the idea for it. I'm just not sure the rating would reflect an individual's ability to succeed. Maybe it could be used to rate an entire team's ability to get people on and get them in.

Road Pop
03-06-2009, 02:07 AM
I need an Advil. I read it a couple times. Are errors figured in? I see where you're going... just can't grasp it.

TheNext44
03-06-2009, 02:57 AM
This stat is very interesting. I would tweak your numbers a bit. Most importantly, I would give outs a negative value, since they hurt a team's chances of scoring.

First and foremost, this type of stat measures how productive a hitter has been in producing runs for his team. It says very little about the player's ability and should not be used for projections.

However, this does seem to be an interesting way to figure out more precisely how many runs an individual player produces for his team. I think this has advantages over what is out there, and if properly calculated, it could provide a more accurate count of the runs each batter produces over a season.

Right now, the most commonly used formula is Runs Created. However, that really works best for figuring out how many runs a team creates, and does not work so well when applied to individual players. Some fans swear by it, but even Bill James, who created it, has said that it is not that accurate on the individual level.

Estimated Runs Produced and Batting Runs (the latter is what FlightRick mentioned) are also used, but many fans do not think that linear weights work, since baseball is not really linear.
My problem with it, and with Runs Created, is that it uses averages over many seasons to figure out the right formula. It calculates that, over a multi-year period, a single produced .47 runs on average. The problem with this is that it then treats all singles the same, whether they occur with two outs and no one on, or with the bases loaded and no one out. It assumes that over a season, a player will face the same mix of situations as the group of all players over that multi-year period. I just don't see that happening. Guys who hit in the middle of the order will hit more often with runners on base than a leadoff hitter, so their singles really should be worth more.

But your system takes all of that into account. It actually sees how many runs a player actually produced in a given season. It simply adds up the values of each and every PA.

If I understand it right, this is basically a more detailed formulation of the linear value of each possible PA situation than Estimated Runs Produced. It would provide a value not just for every single, but for every single in every situation. And instead of figuring out a new formula, you could simply add up the value of the result of every PA a player had over a season. A bit tedious, but if it provides a more accurate count of the runs a player produced, I think it would be time well spent.
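
To make the "every single in every situation" point concrete, here is a small Python sketch that keys the value on the situation as well as the result, then adds up a log of plate appearances. The values are copied from the two scales in the first post; the labels, the function name, and the four-PA log are made up for illustration.

```python
# (situation, result) -> value, taken from the scales in the first post.
# Only a few of the outcomes are listed to keep the sketch short.
VALUES = {
    ("bases empty", "out"): 0.00,
    ("bases empty", "batter to first"): 0.25,
    ("bases empty", "batter to second"): 0.50,
    ("runner on first", "out, runner stays on first"): 0.091,
    ("runner on first", "out, runner to second"): 0.182,
    ("runner on first", "runner to second, batter to first"): 0.364,
    ("runner on first", "runner to third, batter to first"): 0.455,
    ("runner on first", "runner scores, batter to first"): 0.727,
}


def season_total(log):
    """Add up the value of the result of every plate appearance in the log."""
    return sum(VALUES[pa] for pa in log)


# A made-up log of four plate appearances.
log = [
    ("bases empty", "batter to first"),
    ("runner on first", "runner to third, batter to first"),
    ("runner on first", "out, runner to second"),
    ("bases empty", "out"),
]
total = season_total(log)
print(round(total, 3), round(total / len(log), 3))
```

Note that "batter reaches first" shows up with four different values here (0.25, 0.364, 0.455, 0.727) depending on the situation and on what the runner does, which is exactly where this differs from a single fixed weight per single.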

Eric_the_Red
03-06-2009, 08:27 AM
I always liked the stat mentioned in Moneyball: SLG + (4 x OBP), the theory being a perfect SLG is 4.00 and a perfect OBP is 1.00, hence the 4x multiplier.
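
For reference, a quick Python sketch of that formula as written above. OBP and SLG use their standard definitions; the function and parameter names are just illustrative.

```python
def obp(hits, walks, hbp, at_bats, sac_flies):
    """On-base percentage: times on base over the plate appearances that count."""
    return (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)


def slg(singles, doubles, triples, home_runs, at_bats):
    """Slugging percentage: total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def slg_plus_4_obp(slg_value, obp_value):
    """The combination as stated in this post: SLG + (4 x OBP)."""
    return slg_value + 4 * obp_value


# The "perfect" hitter from the reasoning above: a home run in every at-bat.
perfect_slg = slg(0, 0, 0, 10, 10)   # 4.0
perfect_obp = obp(10, 0, 0, 10, 0)   # 1.0
print(slg_plus_4_obp(perfect_slg, perfect_obp))
```

With the 4x multiplier, the two halves top out at the same number (4.0 each), so neither dominates the total purely because of its scale.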

EddieMilner
03-06-2009, 09:41 AM
Originally posted by Road Pop:
"I need an Advil. I read it a couple times. Are errors figured in? I see where you're going... just can't grasp it."

Errors would be ignored. If you get to first, via any method, you get credit.

If a player is on first and you are up to bat, you receive the same points if you:
A. you get a single and the runner advances to second.
B. you get HBP and the runner advances to second.
C. you get a BB and the runner advances to second.
D. you get to first and the runner advances to second via an error.

In the end I feel that output is more important than how it actually occurred. If it was skillfully earned or earned via luck, who cares? It's the same outcome.

EddieMilner
03-06-2009, 09:44 AM
Originally posted by TheNext44:
"This stat is very interesting. I would tweak your numbers a bit. Most importantly, I would give outs a negative value, since they hurt a team's chances of scoring."


I agree with tweaking the numbers. I just created them off the top of my head.

I guess my biggest issue with current statistical theories is that an out is an out. I feel that is not the case. If you move the runner over and make an out (as long as it's not the last out of the inning), then that is better than making an out and not moving the runner. I don't feel that the current statistical methods take this into consideration. However, I could be wrong.

Newman4
03-06-2009, 05:47 PM
I think that the linear weights values are calculated based on how many runs are manufactured on average each time an event occurs. If I remember correctly they used stats for like the last 30 years to come up with the probability of each event occurring.

TheNext44
03-06-2009, 05:55 PM
Originally posted by Newman4:
"I think that the linear weights values are calculated based on how many runs are manufactured on average each time an event occurs. If I remember correctly they used stats for like the last 30 years to come up with the probability of each event occurring."

I believe you are correct. I think what Eddie Milner is trying to do is to find out the calculations for even more events. Not just hits and outs, but hits and outs within each possible situation. It basically would be a more complete, more accurate way of figuring out how many runs a player actually produced for his team.

At the very least, it is an interesting idea.

JBChance
03-06-2009, 10:32 PM
Originally posted by TheNext44:
"I believe you are correct. I think what Eddie Milner is trying to do is to find out the calculations for even more events. Not just hits and outs, but hits and outs within each possible situation. It basically would be a more complete, more accurate way of figuring out how many runs a player actually produced for his team.

At the very least, it is an interesting idea."


An interesting idea, but not as a tool to project future at bats for players. It gives more of a history of what a player/team did, not what they will do. Kind of like avg. w/RISP, plus taking into account non-hits/walks as well.

mroby85
03-06-2009, 10:34 PM
Can't you just watch games, and figure out who's good and who's not throughout the course of a season?

Newman4
03-06-2009, 11:42 PM
Linear weights are calculated using linear regression. Eddie Milner's plan is quite interesting and if it could be worked out would be remarkable. Unfortunately, the work involved would probably be colossal.
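
To show what a regression of that kind looks like mechanically, here is a toy Python sketch. It does not use real data: it generates made-up team totals from three of the weights quoted earlier in the thread, with no noise, and checks that ordinary least squares recovers them. Every name and number range in it is an assumption for illustration, and this is not a claim about how the published weights were actually derived.

```python
import random

# Weights for single, double, home run as quoted earlier in the thread.
TRUE_WEIGHTS = [0.47, 0.78, 1.40]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic "team seasons": counts of singles, doubles, and home runs.
rng = random.Random(0)
rows = [[rng.randint(800, 1100), rng.randint(200, 350), rng.randint(100, 250)]
        for _ in range(30)]
# Synthetic run totals built directly from the weights, so the fit is exact.
targets = [sum(w * x for w, x in zip(TRUE_WEIGHTS, row)) for row in rows]

fitted = least_squares(rows, targets)
print([round(w, 3) for w in fitted])
```

With real team totals the fit would not be exact, and the fitted weights would depend on which events and seasons are included.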

TheNext44
03-07-2009, 01:25 AM
Originally posted by Newman4:
"Linear weights are calculated using linear regression. Eddie Milner's plan is quite interesting and if it could be worked out would be remarkable. Unfortunately, the work involved would probably be colossal."

Ahh, there's the rub.

But my iPhone can remember where I parked my car and direct me to it within inches. If we can write a computer program that does that, I think we can come up with one that works out Milner's plan.

Remember that the Expected Runs values took a while to figure out too.

TheNext44
03-07-2009, 01:26 AM
Originally posted by JBChance:
"An interesting idea, but not as a tool to project future at bats for players. It gives more of a history of what a player/team did, not what they will do. Kind of like avg. w/RISP, plus taking into account non-hits/walks as well."

Absolutely correct, not valuable for projections, but good at determining how productive players were in a given year. I think that is very valuable.

JBChance
03-07-2009, 02:58 AM
Originally posted by TheNext44:
"Absolutely correct, not valuable for projections, but good at determining how productive players were in a given year. I think that is very valuable."

I think that it could be valuable, too. I'm just trying to figure out how accurate it is and how it correlates to on-field performance of players individually as well as a team.

I'm thinking it may correlate more to team play because of the on base perspective; the point totals could be skewed, on an individual basis, by how adept the other players are at getting on base or place in the batting order. Those are things that are really not under control of the batter, per se.

Take the RBI discussion from another thread earlier. A lot of people thought that Votto and/or EdE would have the most RBI, or at least more RBI than Bruce, because of where they would bat. Are they better hitters than Bruce? Maybe or maybe not, but they certainly should get more opportunities because of the non-OBP guys (Taveras/SS) who will be in the 1 and 2 holes. Whoever is in the 4th or 5th slot certainly looks set to have more guys on and, therefore, get more RBI.

Now, due to having more of a chance to get RBI, wouldn't you expect those batters to create more runs and have a higher point total? A guy like EdE might actually have a higher point total than Bruce just due to his slot in the order. Or does the slot in the order say something about the ability of the player? In Jay's case, I don't think it does - everyone knows that Dusty will bat him 3rd and you can't penalize him for Taveras' lack of OBP skills.

If you look at it from a team perspective, it wouldn't have to split those hairs. It would just show how good a team is at getting players on base and how adept the team is at hitting with guys on base and how much the team scores. That would kind of disregard the whole place-in-the-order problem that would potentially give some players a higher point total.

Anyway, I certainly think it's interesting to come up with a more accurate measurement or statistic for how runs were produced. I do like the idea of being able to give weights to different at-bat outcomes and to count outs as productive in some instances.

I'm sure there could be a program to sort and tabulate the info, but creating it would be a challenge, IMO. It's quite a few individual situations. It would take some sort of sophisticated statistical analysis application that would notate each opportunity and outcome in a very specific fashion.
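
As a very rough idea of what notating each opportunity and outcome could look like, here is a Python sketch of one record per plate appearance and a simple tabulation. The field names, the base-state notation, and the sample entries are all invented for illustration; a real scoring log would need more detail.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PlateAppearance:
    batter: str
    bases_before: str   # made-up notation: "1--" = runner on first, "---" = empty
    outs_before: int
    bases_after: str
    outs_made: int
    runs_scored: int


def tabulate(log):
    """Count, per batter, how often each (situation, result) pair occurred."""
    table = defaultdict(lambda: defaultdict(int))
    for pa in log:
        situation = (pa.bases_before, pa.outs_before)
        result = (pa.bases_after, pa.outs_made, pa.runs_scored)
        table[pa.batter][(situation, result)] += 1
    return table


# Hypothetical entries, not taken from any real game.
log = [
    PlateAppearance("Batter A", "1--", 0, "12-", 0, 0),
    PlateAppearance("Batter A", "1--", 0, "12-", 0, 0),
    PlateAppearance("Batter A", "---", 1, "---", 1, 0),
    PlateAppearance("Batter B", "1--", 0, "---", 2, 0),
]

for batter, counts in tabulate(log).items():
    for (situation, result), n in counts.items():
        print(batter, situation, "->", result, "x", n)
```

Once every plate appearance is stored this way, attaching a point value to each (situation, result) pair and summing per batter is a lookup rather than a manual count.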

Like I said, it's an interesting idea. It would be cool to make it happen.

EddieMilner
03-09-2009, 12:09 PM
Does anyone know where I could get the scorebooks from MLB games? I would rather use actual data. I live in Northern Indiana, so all the Reds games are blacked out on MLB.TV. And while watching the games after the fact is fun from time to time, I can't realistically do it 162 times this summer.

I might give it a try this year and keep it as a Google Doc so others can help me keep it updated.