View Full Version : Sabermetrics usage: time for regression?



Brutus
08-11-2012, 07:15 PM
I remember several years ago, perhaps seven or eight, I was browsing this forum under a different incarnation of myself, and I had an enlightenment thanks to, of all things, an Adam Dunn discussion. At the time, I was an old-school, batting average type of guy who enjoyed the 1970s box scores with AB, R, H, BI and average as the holy grail. It's not that I didn't understand sabermetrics, but I refused to consider that they were all that much more descriptive.

Seeing a discussion regarding OPS, runs created, etc., I took it upon myself to test the claims being made of these revolutionary metrics. Sure enough, using Excel and data collected from Baseball Reference, I found these stats blew away batting average when it came to correlation with run scoring.

Since then, I've been hooked on the aforementioned metrics and the alphabet soup of statistics that have followed. While the "stats vs. scouts" argument doesn't really exist now, as these stats have somewhat saturated the mainstream and we accept there's a place for both, I feel it's getting to the point where the acceptance and widespread usage is due for a regression to the mean.

What do I mean by that?

Well, when these stats first appeared, they were useful in context, and still are. Their applicability derived from their ability to either correlate well to run scoring/prevention or predict future performance. In other cases, they had a very low RMSE. But the problem is that people have since forgotten that correlation doesn't equal causation. And worse yet, some of the stats that correlate well don't describe as much of the action as people portray.

Take for instance: FIP or xFIP. I'm as guilty as the next guy of trusting them more than ERA. And the reason for that is they've been found to predict future ERA better than ERA itself. However, there are two problems with using FIP or xFIP as gospel... one, they assume that pitchers can control little or nothing in the field of play, which researchers are now finding is indeed false; and two, even to the extent pitchers might have minimal control, these stats don't describe as much of the variance as people might wish.

Even in the best samples, FIP or xFIP have a correlation of around .45. That's statistically significant, but that means these stats only explain roughly 20% of the variance of future ERA. I don't know about you, but that's not a ton. Funny thing is, in some samples, ERA actually correlates better to future ERA than FIP, although this is not often true in larger datasets. It is noteworthy, though, that SIERA -- a stat that takes into account GB and FB rates -- actually correlates better to future performance than the FIP and xFIP cousins of ERA estimators, which suggests that it's time to find middle ground in the idea that, indeed, pitchers should be held accountable for some of what happens in the field of play.
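
As a rough illustration of the arithmetic above: the share of variance explained is the square of the correlation coefficient. This is a minimal Python sketch, assuming a simple linear relationship; the function name is only for illustration, and the .45 figure is the one quoted in this post.

```python
import math


def variance_explained(r: float) -> float:
    """Share of variance explained (r squared) for a correlation coefficient r."""
    return r * r


# Correlation quoted in the post for FIP/xFIP against future ERA.
r = 0.45
r2 = variance_explained(r)
print(f"r = {r:.2f} -> r^2 = {r2:.4f} (about {r2:.0%} of the variance)")

# Going the other way: the correlation needed to explain half the variance.
r_for_half = math.sqrt(0.5)
print(f"r needed to explain 50% of the variance: {r_for_half:.2f}")
```

By the same arithmetic, a stat would need a correlation of about .71 to explain half of the variance in future ERA.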

Nonetheless, it's gotten to the point where our acceptance of these stats should probably regress to the mean. We automatically see a low BABIP by a hitter and assume he's unlucky without considering line drive rate, inability to run out infield hits, etc. We see a bloop hit or two, and consider it lucky or unlucky depending on the hitter/pitcher perspective. If someone uses OPS or another metric to define a pitcher, it's dismissed automatically because it doesn't narrowly stick to the three true outcomes narrative.

To be truthful, it's not a problem with the statistics, but rather an inability to use them in context and within the limitations under which they were conceived.

I'm not saying we should go back to using batting average. I'm glad that my understanding of the game has changed and that we can quantify as much of what happens as we can. What I'm saying is that it's time we use stats in context and in the world of gray areas rather than the black/white, lucky/unlucky absolutes that we've been dealing in for the past few years. This isn't an indictment of the stats themselves, but rather the context in which we use them.

MikeS21
08-11-2012, 07:43 PM
I've got to leave soon, but I want to comment on this part of your post when I get a chance: it's the key to the whole traditional versus sabermetric debate.


I'm not saying we should go back to using batting average. I'm glad that my understanding of the game has changed and that we can quantify as much of what happens as we can. What I'm saying is that it's time we use stats in context and in the world of gray areas rather than the black/white, lucky/unlucky absolutes that we've been dealing in for the past few years. This isn't an indictment of the stats themselves, but rather the context in which we use them.

Mario-Rijo
08-12-2012, 02:07 PM
Excellent post. I do think some get carried away with these stats at times. But I know that I don't always jump to conclusions. For example, when I see a low BABIP I think it's useful to look at the player's history (if he has one) amongst other things (age, injury, etc.) to help come to a conclusion. If the guy consistently has a low BABIP, there is a reason for it, of which there are many possibilities.

I don't know how many other posters here do the same but from the looks of it I'd wager somewhere around 75-80%. But at any rate thanks for writing it out more eloquently than I could have.

757690
08-12-2012, 04:44 PM
Nice post.

I don't see a regression moving forward, just an advancement of knowledge and understanding. Sabermetrics is at its very early stages, and there is a lot more we don't know about the game and its stats than we do know.

I can see a day in the near future when FIP and UZR are looked at the same way we currently look at Batting Average and ERA.

Wonderful Monds
08-12-2012, 04:46 PM
Nice post.

I don't see a regression moving forward, just an advancement of knowledge and understanding. Sabermetrics is at its very early stages, and there is a lot more we don't know about the game and its stats than we do know.

I can see a day in the near future when FIP and UZR are looked at the same way we currently look at Batting Average and ERA.

UZR will be forgotten completely once FieldFX comes around, IMO.

Mario-Rijo
08-12-2012, 08:05 PM
UZR will be forgotten completely once FieldFX comes around, IMO.

Thank goodness, UZR is pretty lame IMO. Some will say it's the best we have but IMO it's not even good enough to be considered more reliable than the eyes.

lollipopcurve
08-12-2012, 08:30 PM
UZR will be forgotten completely once FieldFX comes around, IMO.

Yes.

Measuring the game, and measuring performance, is going to be revolutionized here soon. Precise measurements of ball speed, spin and trajectory and player movement are going to pull the curtain back on the physics behind the game and shed light on all players in a new way.

WVRedsFan
08-13-2012, 12:28 AM
I love that many of you like to use the "new way" of judging baseball via stats, but for me it takes away from the enjoyment of the game, even though I've learned to look at hitters in terms of OBP and OPS. I still look at box scores after every game. I still look at ERA and hits per inning allowed. Maybe I'm just old, but if I have to do that much work, I'll just quit watching and follow something else. I appreciate your work and keep me informed, but watching the games is my love and they promised there would be no math :).

CrackerJack
08-13-2012, 12:44 AM
Yes.

Measuring the game, and measuring performance, is going to be revolutionized here soon. Precise measurements of ball speed, spin and trajectory and player movement are going to pull the curtain back on the physics behind the game and shed light on all players in a new way.

I predict this post will be replicated 5-10 years from now.

Vottomatic
08-13-2012, 06:58 AM
I love that many of you like to use the "new way" of judging baseball via stats, but for me it takes away from the enjoyment of the game, even though I've learned to look at hitters in terms of OBP and OPS. I still look at box scores after every game. I still look at ERA and hits per inning allowed. Maybe I'm just old, but if I have to do that much work, I'll just quit watching and follow something else. I appreciate your work and keep me informed, but watching the games is my love and they promised there would be no math :).

Agree wholeheartedly.

In my younger days I was a stat geek. But I feel like the excessiveness of stats is ruining the game for me.

The good ole eye test is still the best test. The actual game and a player's contributions are most enjoyable when you watch them happen.

_Sir_Charles_
08-13-2012, 10:02 AM
I love that many of you like to use the "new way" of judging baseball via stats, but for me it takes away from the enjoyment of the game, even though I've learned to look at hitters in terms of OBP and OPS. I still look at box scores after every game. I still look at ERA and hits per inning allowed. Maybe I'm just old, but if I have to do that much work, I'll just quit watching and follow something else. I appreciate your work and keep me informed, but watching the games is my love and they promised there would be no math :).

This is where I am as well. I've come to understand and appreciate the newer stats over the past few years...but those are things I want to look at AFTER the game...not during. A quick glance at ERA, BA, HR, RBI, W/L still serves me well enough while watching a game. The newer stats....I don't want to pull out my slide rule to figure out what they mean in order to understand someone's point during an in-game discussion. (yes, I'm exaggerating...but the point remains).

For me, the new stats do give me a greater understanding of the game, but they don't give me any greater ENJOYMENT of the game. No stat will do that. The eye test however DOES increase my enjoyment.

RedsManRick
08-13-2012, 11:10 AM
This is where I am as well. I've come to understand and appreciate the newer stats over the past few years...but those are things I want to look at AFTER the game...not during. A quick glance at ERA, BA, HR, RBI, W/L still serves me well enough while watching a game. The newer stats....I don't want to pull out my slide rule to figure out what they mean in order to understand someone's point during an in-game discussion. (yes, I'm exaggerating...but the point remains).

For me, the new stats do give me a greater understanding of the game, but they don't give me any greater ENJOYMENT of the game. No stat will do that. The eye test however DOES increase my enjoyment.

I don't see anything wrong with this. However, I would suggest that the stats you cite are easy less because of some inherent quality they possess than because they're simply what you grew up with. I mean, who can tell me the definition of an earned run or why a guy gets an RBI in some situations but not others? Tell me the definition of an "at bat".

I think it's important to remember that batting average and RBI were sabermetrics 100 years ago. They were driven by the exact same fundamental question -- how can we use factual information to improve our understanding? It's not like most people are calculating batting average as they sit and watch the game or that as kids we came up with the definition for an "at bat" ourselves.

Change is hard because it incurs a cost. As you and others have pointed out, you already have a system for assessing things that works pretty well. That's a system that you learned in your formative years and have continually refined since then. The added benefit of switching to a new system, even if it is marginally better, simply isn't worth the cost of giving up the knowledge you've built and spending a lot of effort learning something new that, as you point out, isn't core to your enjoyment of the game anyways. (By contrast, the numbers are core to my enjoyment of the game, so for me, learning the new system is worth it)

But for people just getting into the game, especially kids, all these stats are new. They're going to have to learn and develop their own system in any event. And the availability of that information is fundamentally different. They aren't getting most of their stats from Sunday papers and baseball cards. (I was part of the last generation who did that). They don't have those memories of batting average being presented to them as the "most important" column that summed everything up. They're used to looking up their stats on the internet where batting average and wOBA are almost equally easy to find.

I think that fundamentally most people don't like doing the math; they're there for the aesthetics, for the fan experience. They get the vast majority of their enjoyment from simply watching the game and rooting for their team. In that context, batting average doesn't add any more or less than wOBA to their experience, but right now, batting average is still a little bit easier to find while you're watching the game. And if at some point the availability gap between the various performance stats becomes small enough, the thought becomes, if I'm going to spend any time/energy thinking about stats, I might as well use the ones that most accurately tell me what I want to know.

My hypothesis is that people don't have an inherent hunger to know how good of a "hitter" a guy is in terms of hits per at bat. The basic question that the inquisitive 7 year old has is "how much does what the guy does when he hits contribute to the team scoring runs?" or put more simply "how good of a hitter is he?" For people who answered that question in their minds with batting average, HR and RBI for 20, 30, 40+ years, that's good enough. For the next generation, batting average in particular will seem silly. The basic counts of events, those will have staying power -- even RBI isn't easily replaced from a fan's perspective. But batting average, a stat which says that a single is much more like a HR than a walk, will become less and less important over the years. That's my guess at least.
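
To make the single/HR/walk point concrete, here is a small Python sketch using the standard definitions of batting average, on-base percentage and slugging. The batting lines are made up purely for illustration (a hypothetical hitter at 25-for-100, all singles, plus one extra plate appearance), and the class and variable names are hypothetical too.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Hypothetical batting line; only the fields the three stats need."""
    ab: int            # at bats (walks, HBP and sac flies do not count)
    singles: int = 0
    doubles: int = 0
    triples: int = 0
    hr: int = 0
    bb: int = 0        # walks
    hbp: int = 0       # hit by pitch
    sf: int = 0        # sacrifice flies

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.hr

    def avg(self) -> float:
        # Batting average: hits per at bat, every hit counted the same.
        return self.hits / self.ab

    def obp(self) -> float:
        # On-base percentage: times on base per plate appearance that counts.
        return (self.hits + self.bb + self.hbp) / (self.ab + self.bb + self.hbp + self.sf)

    def slg(self) -> float:
        # Slugging: total bases per at bat, so extra-base hits weigh more.
        total_bases = self.singles + 2 * self.doubles + 3 * self.triples + 4 * self.hr
        return total_bases / self.ab


# Made-up baseline of 25 singles in 100 at bats, plus one more plate appearance.
plus_single = BattingLine(ab=101, singles=26)
plus_homer = BattingLine(ab=101, singles=25, hr=1)
plus_walk = BattingLine(ab=100, singles=25, bb=1)

for label, line in [("single", plus_single), ("homer", plus_homer), ("walk", plus_walk)]:
    print(f"+1 {label:<6} AVG {line.avg():.3f}  OBP {line.obp():.3f}  SLG {line.slg():.3f}")
```

In this made-up case the extra single and the extra homer produce the same batting average, while the walk leaves it untouched; OBP credits the walk and SLG separates the homer from the single.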

IslandRed
08-13-2012, 11:30 AM
The basic question that the inquisitive 7 year old has is "how much does what the guy does when he hits contribute to the team scoring runs?" or put more simply "how good of a hitter is he?"

You might have to up your age bracket a little bit. The typical baseball-playing seven-year-old hasn't evolved beyond "get a hit" and "don't strike out." :cool:

RedsManRick
08-13-2012, 11:38 AM
You might have to up your age bracket a little bit. The typical baseball-playing seven-year-old hasn't evolved beyond "get a hit" and "don't strike out." :cool:

Well, I don't think he's phrasing it quite like that, but I guarantee he's asking (or at least thinking) some form of "who is the best hitter?" And he's probably already forming some idea around the concept of not all hits being equal. I know that for me personally, at 8 or 9 I was telling my coach (my dad) that if I just focused on not striking out that I couldn't hit homers. In reality, I couldn't hit homers regardless, but I didn't like the idea of swinging easier just to ensure I made contact...

camisadelgolf
08-13-2012, 11:49 AM
No team goes 19-8 without their best player unless they're in a regression or expecting one. In fact, no team has won at that rate in the NL for a full season in over 100 years, with or without their best player.

_Sir_Charles_
08-13-2012, 02:05 PM
I don't see anything wrong with this. However, I would suggest that the stats you cite are easy less because of some inherent quality they possess than because they're simply what you grew up with. I mean, who can tell me the definition of an earned run or why a guy gets an RBI in some situations but not others? Tell me the definition of an "at bat".

Exactly. It's more along the lines of familiarity. I know what a good batting average should be. I know what a good ERA looks like. Etc, etc. It's not about knowing how those formulas are derived...it's an intuitive thing due to being so familiar with them for so long. Yes, baseball card/box score stats. Sure, the info is limited, but we're just talking about a general idea of how someone's season has progressed. It fills in enough blanks for me to get a clue how he's doing.