
It's okay to be mystified by linear weights



RedsManRick
03-09-2010, 12:41 PM
Over at Hardball Times, Joshua Fisher posted a great article about sabermetrics and the casual fan. In short, he makes the case that sabermetricians have failed to communicate the basic fundamentals of why the sabermetric approach is the right one when it comes to quantitative analysis (and make no mistake, citing a pitcher's wins or a batter's batting average is quantitative analysis).

He then goes on to lay it out in very approachable language. I recommend it as a great read for everybody.

http://www.hardballtimes.com/main/article/its-ok-to-be-mystified-by-linear-weights/



It’s okay to be mystified by linear weights
by Joshua Fisher
March 09, 2010

Everywhere you turn this offseason, you can find a primer on all things sabermetric. Here are a couple good ones on wOBA and UZR. Here's a neat one on FIP. And heck, here's an entire online course on the state of the art. There's just so much good work being done by so many good minds right now that jumping on this train can seem a little scary. The various introductions written for new adopters fill an important niche, but I'm not sure they reach the demographic we're most concerned with winning over.

At our best, we're open-minded folks who take a reason- and logic-based approach to the game we love. At our worst, we're an avant garde gang of know-it-all cyber-bullies, ready and eager to viciously pounce on any Luddite who still worships at the altar of the run batted in. And I think we're at our worst more than we'd like to acknowledge.

Our arrogance comes from the strength of our position; we’re right about baseball and we know it. The problem is that things have become almost cultish; our alphabet-soup language poses a formidable barrier to entering the club. And that’s where these primers come in. If we can walk people through the silliness of pitcher wins and ERA, they’ll greet FIP with open arms. That’s the plan.

(click the link (http://www.hardballtimes.com/main/article/its-ok-to-be-mystified-by-linear-weights/) to read the rest)

TheNext44
03-09-2010, 01:23 PM
It's a fine article, but still has the arrogance that ticks most fans off about sabermetrics.


Our arrogance comes from the strength of our position; we’re right about baseball and we know it

I see most fans not into advanced stats turning off right there, and I don't blame them.

I think that attitude is what is wrong with a good part of the Saber community. It's not about being right or wrong, it's about being open-minded and understanding of opposing views.

And just for the record, I am a huge Sabermetric nerd and I don't "greet FIP with open arms." In fact, I am very skeptical of it and think it will become irrelevant in the very near future. I may be wrong about this, but I can make a very strong argument for it that should embarrass anyone who thought they were justified in being arrogant about the correctness of FIP.

The irony is that I like Sabermetrics because it provided a more open-minded approach to understanding baseball. It asked questions that hadn't been asked, and demanded that I look at things I had previously, arrogantly considered gospel and reconsider them. I do not feel comfortable being arrogant about the correctness of these new stats, just as I should not have felt comfortable about the correctness of the old ones.

bucksfan2
03-09-2010, 01:28 PM
It's a fine article, but still has the arrogance that ticks most fans off about sabermetrics.


Our arrogance comes from the strength of our position; we’re right about baseball and we know it

I see most fans not into advanced stats turning off right there, and I don't blame them.

I think that attitude is what is wrong with a good part of the Saber community. It's not about being right or wrong, it's about being open-minded and understanding of opposing views.

And just for the record, I am a huge Sabermetric nerd and I don't "greet FIP with open arms." In fact, I am very skeptical of it and think it will become irrelevant in the very near future. I may be wrong about this, but I can make a very strong argument for it that should embarrass anyone who thought they were justified in being arrogant about the correctness of FIP.

The irony is that I like Sabermetrics because it provided a more open-minded approach to understanding baseball. It asked questions that hadn't been asked, and demanded that I look at things I had previously, arrogantly considered gospel and reconsider them. I do not feel comfortable being arrogant about the correctness of these new stats, just as I should not have felt comfortable about the correctness of the old ones.

That's exactly where I quit reading the article. Arrogance is something I find completely unbecoming and can't stand.

RedsManRick
03-09-2010, 02:47 PM
It's a fine article, but still has the arrogance that ticks most fans off about sabermetrics.


Our arrogance comes from the strength of our position; we’re right about baseball and we know it

I see most fans not into advanced stats turning off right there, and I don't blame them.

I think that attitude is what is wrong with a good part of the Saber community. It's not about being right or wrong, it's about being open-minded and understanding of opposing views.

I agree that it was off-putting for him to say what he did. But I actually think this is fundamental to his point. There is a difference between respecting somebody's right to have an opinion and understanding where that opinion comes from, and conceding that the "truth" cannot be known and that everything is mere opinion.

(I think the disconnect comes largely from sabermetricians often being inarticulate about their claims and from non-sabermetricians assuming that sabermetricians are claiming something broader and more definitive than they are. The sabermetricians often leave their base assumptions unstated because their target audience usually knows them already. The casual fan may not realize that the sabermetrician is fully aware of the analysis' limitations.)

But a complication to this is that the sabermetric approach includes a transparent way to test the accuracy of the method of evaluation. They show their work and anybody can engage them on the merits of the argument. By contrast, many conventional arguments are based on the logic of "I know it to be true" paired with an assertion that there is no way to verify the accuracy of the claim being made. For this person, everything is relative because nothing can be proven. You have an opinion. I have an opinion. We're both entitled to it. The end.

Perhaps a constructive starting point would be trying to get some agreement on how we would evaluate the "opinions" being presented. Because unless and until there is a willingness on both sides to concede the possibility they might be wrong and agree on the terms by which that would happen, the conversation will likely continue to be two entrenched sides arguing past each other.

It has been my observation that the sabermetric community, arrogant as it is, has demonstrated a willingness to engage in that conversation. The anti-sabermetric camp (e.g. Joe Morgan) does not seem willing to engage in the joint pursuit of truth as they either believe that all truth (in baseball) is relative or that they already know the truth and thus don't need to engage in a conversation based on the possibility that they might be wrong.

Sure, sabermetricians would be wise not to play the "we know we're right" card -- no matter how much they believe it. They should recognize that the other side often feels the exact same way. And if your goal is to get people to agree with you, you don't win any points for actually being right if they disengage from the conversation. So, I agree. Sabermetricians should learn to say "We believe" instead of "We know" and operate from the "we might be wrong, but I've shown my work -- check it out" perspective.

But the other side should be willing to do the same. If they can't put their distaste for the messenger aside and engage in the message on its merits, then we should just agree to disagree and get on with our lives. Whoever is actually "right" will probably win out in the long run.

westofyou
03-09-2010, 02:53 PM
The worst thing about sabermetrics is that it is often wrapped in poor writing. Quality writing conveys ideas in a way that poor writing simply can't, even when both are trying to say the same thing.

bucksfan2
03-09-2010, 03:46 PM
Perhaps a constructive starting point would be trying to get some agreement on how we would evaluate the "opinions" being presented. Because unless and until there is a willingness on both sides to concede the possibility they might be wrong and agree on the terms by which that would happen, the conversation will likely continue to be two entrenched sides arguing past each other.

Some concession needs to be given on both sides. I agree 100% with that. The issue is that there is no foolproof method, and neither side will ever achieve one. Anytime you are dealing with the human element there won't be a 100% correct way. But I do think it leads to interesting debates, so I like the gap between the two sides.


It has been my observation that the sabermetric community, arrogant as it is, has demonstrated a willingness to engage in that conversation. The anti-sabermetric camp (e.g. Joe Morgan) does not seem willing to engage in the joint pursuit of truth as they either believe that all truth (in baseball) is relative or that they already know the truth and thus don't need to engage in a conversation based on the possibility that they might be wrong.

There is a disconnect between analysts and actual performers. Always has been and always will be. With the Joe Morgan example, on one hand you have a guy who made his living being one of the greatest 2B in the history of baseball. A guy who spent his career playing the game and then turned around and announced it. He is going to see the game much differently than outsiders. On the other hand you have a side that is trying to assign a number to, or quantify, the game.

To me this is akin to an MBA telling his boss, someone who has been in the field for 30 years, "Umm, sir, the numbers say this is the way to do it," with the boss replying, "Son, I have been in this business for 30 years; we are doing it this way." One may be right or wrong, or the answer is somewhere in the middle, but I don't think either side is going to give up too much.



Sure, sabermetricians would be wise not to play the "we know we're right" card -- no matter how much they believe it. They should recognize that the other side often feels the exact same way. And if your goal is to get people to agree with you, you don't win any points for actually being right if they disengage from the conversation. So, I agree. Sabermetricians should learn to say "We believe" instead of "We know" and operate from the "we might be wrong, but I've shown my work -- check it out" perspective.

IMO you should always use "believe" unless something is written in stone. You may think that player A is an X-WAR player and is better than player B, while the opposite side may believe that player B is better than player A using other measurements. Is one side more right or wrong?

I think one thing that really drove me away from the saber side is when someone said that a computer could do a better job managing a team than the current Reds manager. The belief that numbers are far superior to the human aspect of the game really turned me away.


But the other side should be willing to do the same. If they can't put their distaste for the messenger aside and engage in the message on its merits, then we should just agree to disagree and get on with our lives. Whoever is actually "right" will probably win out in the long run.

There have been plenty of times where we agree to disagree only to see a thread go 20 more pages. Some of the best debates on RZ are basically an agreement to disagree that lingers. And I do agree there should be a give and take on both sides.

RedsManRick
03-09-2010, 05:59 PM
There is a disconnect between analysts and actual performers. Always has been and always will be. With the Joe Morgan example, on one hand you have a guy who made his living being one of the greatest 2B in the history of baseball. A guy who spent his career playing the game and then turned around and announced it. He is going to see the game much differently than outsiders. On the other hand you have a side that is trying to assign a number to, or quantify, the game.

To me this is akin to an MBA telling his boss, someone who has been in the field for 30 years, "Umm, sir, the numbers say this is the way to do it," with the boss replying, "Son, I have been in this business for 30 years; we are doing it this way." One may be right or wrong, or the answer is somewhere in the middle, but I don't think either side is going to give up too much.

I would argue that it's different. Joe Morgan doesn't have experience in measuring the value of individual plate appearance outcomes. He can't tell you whether teams tend to score more runs with a guy on 3rd and 1 out or a guy on 2nd and 0 outs -- and there's an actual answer to that question. However, he likes to assert that he knows it through his experience as a player. I'm sure he's got a sense of it, but sabermetrics can without a doubt address this question more accurately.

Now, "To bunt or not to bunt?" is not the exact same question as "Do teams tend to score more runs with a guy on 3rd and 1 out or a guy on 2nd and 0 outs?", but I bet that latter question is an awfully big part of the calculus for the first. And too often managers and pundits answer the first question without knowledge of the answer to the 2nd factored in.
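That second question does have a concrete, data-driven answer. Here's a sketch of the comparison, assuming ballpark run-expectancy values of the kind published from MLB play-by-play data (the 1.10 and 0.95 figures are approximations I'm supplying, not numbers from this thread, and they vary by era and source):

```python
# Approximate run expectancy: average runs scored from a base/out state
# to the end of the inning, per historical MLB play-by-play data.
# These values are illustrative; real tables are recomputed by season.
RUN_EXPECTANCY = {
    ("runner on 2nd", 0): 1.10,  # runner on second, nobody out
    ("runner on 3rd", 1): 0.95,  # runner on third, one out
}

re_2nd_0out = RUN_EXPECTANCY[("runner on 2nd", 0)]
re_3rd_1out = RUN_EXPECTANCY[("runner on 3rd", 1)]

print(f"Runner on 2nd, 0 out: {re_2nd_0out:.2f} expected runs")
print(f"Runner on 3rd, 1 out: {re_3rd_1out:.2f} expected runs")
print(f"A 'successful' sacrifice costs about {re_2nd_0out - re_3rd_1out:.2f} runs")
```

If numbers like these hold, the "successful" bunt actually lowers the average number of runs an inning produces, which is exactly the kind of input a manager answering the bunt question without data never sees.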

Just as I would never suggest that I know the mechanics of base-stealing or how to announce a good game, it would be helpful all around if people were more honest about what they truly are experts in. Not all of baseball's questions can be boiled down to numbers -- but a lot of them can be, to a large degree. Perhaps we should spend a bit more time clarifying the question being asked and a little less time exchanging answers to questions that sound similar, but are fundamentally different at the end of the day.

Sabermetricians are surely guilty of discounting how much of the "right answer" exists outside of their ability to obtain it through quantitative analysis. But similarly, the anti-sabermetric crowd is far too enamored with chemistry, magic, and intuition and unwilling to accept that for some things, the numbers go far beyond what can be gleaned with the eye and the gut.



IMO you should always use "believe" unless something is written in stone. You may think that player A is an X-WAR player and is better than player B, while the opposite side may believe that player B is better than player A using other measurements. Is one side more right or wrong?

Well, when you compare sabermetric-based team projection systems to the "expert predictions" put together without the benefit of that analysis, the projections are significantly more accurate. You're right in asserting that we can't always know which way is more right. But usually the conversation doesn't even get to a point where the "sides" can agree on what right and wrong look like.

I'd like to see the sides agree on how to assess their evaluative approaches more often so there is the potential for agreement on who ends up being right or wrong. One BIG tenet of this is that the non-sabermetric side needs to understand at least one statistical principle: small sample sizes. One, two, or three examples of something is not proof -- it's evidence. You need many examples of success doing something a certain way, compared to the other way, to make a compelling argument either way. Without an agreement on this, we'll never get anywhere.

And on the flip side, getting it wrong a few times does not invalidate your approach. In baseball perhaps more than any other sport, good decisions can lead to bad outcomes. But over the long haul, good decisions will lead to the best outcomes on the whole.
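The small-sample point is easy to demonstrate with a quick simulation. This is just a sketch with made-up talent levels (.300 vs. .270 true averages) and a 60-at-bat stretch standing in for "a few examples":

```python
import random

def batting_avg(true_avg, at_bats, rng):
    """Simulate at_bats trials; each is a hit with probability true_avg."""
    hits = sum(1 for _ in range(at_bats) if rng.random() < true_avg)
    return hits / at_bats

def pct_worse_hitter_looks_better(n_trials=10_000, at_bats=60):
    """Fraction of simulated stretches in which a true .270 hitter
    posts a higher average than a true .300 hitter."""
    rng = random.Random(42)  # fixed seed so the result is reproducible
    flips = sum(
        1 for _ in range(n_trials)
        if batting_avg(0.270, at_bats, rng) > batting_avg(0.300, at_bats, rng)
    )
    return flips / n_trials

print(f"{pct_worse_hitter_looks_better():.1%}")
```

Over thousands of simulated 60-at-bat stretches, the genuinely worse hitter out-hits the better one a substantial fraction of the time, which is why a hot month is evidence, not proof.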



I think one thing that really drove me away from the saber side is when someone said that a computer could do a better job managing a team than the current Reds manager. The belief that numbers are far superior to the human aspect of the game really turned me away.

That was me. I said it in jest, as I was really talking about the aspects of managing that require in-game decisions, such as who to pitch, when to run, and what the lineup would be. I still believe that.



There have been plenty of times where we agree to disagree only to see a thread go 20 more pages. Some of the best debates on RZ are basically an agreement to disagree that lingers. And I do agree there should be a give and take on both sides.

I think if we spent a little more time (on both sides) using a Socratic approach, we'd be better off for it. Ask probing questions that reveal inconsistencies in thinking or errors in logic. Lead people to the right questions and hint at the answers. Simple assertions will most often be rejected, as they trigger your standard fight-or-flight response.

edabbs44
03-09-2010, 07:51 PM
I would argue that it's different. Joe Morgan doesn't have experience in measuring the value of individual plate appearance outcomes. He can't tell you whether teams tend to score more runs with a guy on 3rd and 1 out or a guy on 2nd and 0 outs -- and there's an actual answer to that question.

Generally, does that really make a difference when deciding what to do in the context of a game? Unless there was a stat to tell me what the odds were with Stubbs on 3rd with one out and Votto/Phillips/Rolen coming up with Lincecum on the mound and Affeldt/Wilson warming, or with Stubbs on 2nd with OCab coming up and everything else the same, situations are going to weigh heavily on the decision making process.

The answer to the question you mention is fun as a party trick, but in practical use it doesn't work like the 2 point conversion chart.

Ron Madden
03-10-2010, 03:38 AM
I'm 56 years old and have always been "old School" until recently.

I could be wrong but it seems to me most sabermetricians are far and away more open minded about finding answers to or discussing questions about baseball than most so called "old school fans".

I believe that both sabermetricians and old school fans love the game of baseball, I would think if a person truly loved (someone) or something they would strive for a better understanding of (who) or what they love so much.

In my humble opinion most "old School Fans" tend to be stubborn and set in their ways. I was like that too.

If we just open our minds and discuss our differences we all might learn a thing or two.

jojo
03-10-2010, 05:26 AM
The writer could have just as easily said: "Our confidence in our conclusions comes from the strength of our position; we strive to utilize the best process possible when forming arguments and drawing conclusions and while process doesn't guarantee being correct all of the time, the conclusions we reach will be more likely to be correct than ones based upon alternative approaches"...

Mario-Rijo
03-10-2010, 05:29 AM
I'm 56 years old and have always been "old School" until recently.

I could be wrong but it seems to me most sabermetricians are far and away more open minded about finding answers to or discussing questions about baseball than most so called "old school fans".

I believe that both sabermetricians and old school fans love the game of baseball, I would think if a person truly loved (someone) or something they would strive for a better understanding of (who) or what they love so much.

In my humble opinion most "old School Fans" tend to be stubborn and set in their ways. I was like that too.

If we just open our minds and discuss our differences we all might learn a thing or two.

Now RM you know that's not most guys. How many guys do you know who have truly tried to understand their wives? You know they would take a bullet for them but wouldn't take a dancing class. My grandmother loved two men in her lifetime (to date). One was my grandfather, who worked his life away providing her with a nice comfortable life and a little something to keep her after he was gone. The other was a guy she dated for several years after my grandfather was gone, who left her nothing but lots of fun memories of love, living and laughter. Guess who she secretly (secret except to me) pines for? Not the stubborn old father of 5 who she spent 50+ years with (although she does love, appreciate and miss him), but the guy she spent about 6-7 years with, who understood what she wanted and just enjoyed giving it to her. I don't know many of the latter fellas, although I know many of the former who claim to be the latter but for their own selfish reasons aren't really.

The same can be said for this situation: lots of folks are stubborn, and they aren't gonna change because frankly it doesn't fit into their schedule. And if it did, they don't have time to learn a whole new language just to interpret what is being taught to them. No, most folks don't care to learn, and the ones who do are put off by arrogance or ignorance (sometimes others' and sometimes their own). I for one had no idea of the existence of sabermetrics whatsoever when I stumbled onto this site roughly 5 years ago. What kept me here early was the fact that RZ was a one-stop shop for all things Reds related. It had all the rumors, all the information one might need to know for the upcoming season, etc. I avoided saber content because I didn't know what the heck was being discussed. But I would start giving my opinion on other things and guys would challenge it at times, and I'd argue tooth and nail until someone said something that made their argument make sense to me, and they would usually get that across by using layman's terms. Oftentimes it would take a long time to get through to me, sometimes multiple arguments. All of this resulted in me finally tiring of being wrong (I am a competitor) and pulling out a dictionary sometimes, or delving into a Wiki page on OBP or whatever. Before long some quit responding to what I said, and I'm still not sure how many just have me on ignore because I am so damn stubborn and fiery (some will have a different word for it) or how many just don't disagree with me much anymore.

Point is (after all that wind), most don't care about the subject; they have their mind made up and that is the end of it. Some get curious but can't get it for various reasons (time, lack of comprehension, arrogance of the teachers) and they bail. Some can get it if you can explain it in a way that makes sense, and once they realize how it can be valuable to them they will do the rest. So if I were explaining it to someone new, I would first figure out how it can be valuable to them and express it in a way they can comprehend, first and foremost. I keep thinking how I would explain UZR to a guy who called in on the banana phone, and why it is important, right away. The Denzel quote from the movie Philadelphia comes to mind: "Explain it to me like I'm a 5th grader." But do it in a way that isn't arrogant (unless they need for it to be).

Hope that helps in some way.

mth123
03-10-2010, 06:24 AM
The writer could have just as easily said: "Our confidence in our conclusions comes from the strength of our position; we strive to utilize the best process possible when forming arguments and drawing conclusions and while process doesn't guarantee being correct all of the time, the conclusions we reach will be more likely to be correct than ones based upon alternative approaches"...

Or he could have said: we're confident in our position because we throw numbers out there like Runs Created, WAR, UZR, FIP and other theoretical values for runs, wins and dollars assigned to individual players, with no possibility to prove them right or wrong through the observation of actual events, yet we claim they are right so they must be.

jojo
03-10-2010, 08:57 AM
Or he could have said: we're confident in our position because we throw numbers out there like Runs Created, WAR, UZR, FIP and other theoretical values for runs, wins and dollars assigned to individual players, with no possibility to prove them right or wrong through the observation of actual events, yet we claim they are right so they must be.

I guess he could have said that too but it wouldn't have been a very compelling statement since many of the metrics cited are compiled from mounds of data (i.e. actual events).

RedsManRick
03-10-2010, 11:51 AM
Or he could have said: we're confident in our position because we throw numbers out there like Runs Created, WAR, UZR, FIP and other theoretical values for runs, wins and dollars assigned to individual players, with no possibility to prove them right or wrong through the observation of actual events, yet we claim they are right so they must be.

In what way can gut-based decisions be "proven right or wrong" that sabermetrically driven decisions cannot be? Gut-based decisions have no advantage in this regard -- except that you can't show your work.

Can you directly prove that a guy's UZR is correct? No, because UZR isn't an actual thing. It's a measurement of the value of on-field events as they relate to run prevention. But similarly, the feeling a guy has in his gut that "the guy is a really good defender" isn't a real thing that can be proven either. Both methods of evaluation, in the case of defense, are attempting to get to the same conversation and suffer from the same challenge. At least the sabermetric approach shows its work and can have its process examined, tweaked, and tested.

By what method would you test whether the scout was "right or wrong"? By asking other scouts? Both systems are most easily tested by their own internal consistency. If you want to link them to something observable on the field, please tell me how you'd do it -- because we can apply that logic to both the scout's opinion and UZR.

By the nature of defense, there's simply no way to "prove" how many runs were scored or prevented due to a guy's defense -- but that's what coaches and GMs have been doing with their guts for a century. Sabermetrics is simply going about the process in a more scientific way, where the inputs to the process are publicized, the biases can be challenged, and the work is shown. Surely you aren't suggesting that evaluating defense using solely one's eyes and gut can be held to the same level of account?

But let's look at some of the other metrics. FIP was designed as a predictor of ERA. Guess what? By running analyses of FIP-projected ERA against actual observed ERA, it's been shown to be the best predictor. As for the dollar values assigned using WAR projections, compare them to what guys actually sign for and you'll find the method is actually really freaking good at it.
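For readers unfamiliar with the metric, FIP is just a short linear formula over the defense-independent outcomes. A sketch using the commonly published weights (the stat line below is hypothetical, and the league constant, roughly 3.1, is recalculated each season):

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching: only home runs, walks, hit batters
    and strikeouts count; anything a fielder touches is ignored.
    The constant scales the result to the league-average ERA range."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

# Hypothetical pitcher: 200 IP, 18 HR, 50 BB, 5 HBP, 180 K
print(round(fip(hr=18, bb=50, hbp=5, k=180, ip=200), 2))
```

Showing the work this plainly is exactly the point being argued: anyone can see which inputs drive the number and challenge the weights directly.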

Bottom line is that sabermetric metrics are no different than other approaches; they just show their work better. At the end of the day, GMs, managers, etc. are all coming up with their own assessments and predictions. When they've been compared in their ability to measure and predict what's likely to happen on the field, sabermetrics tend to come out ahead.

Is it perfect? Of course not. But that's the beauty of the scientific approach: it's open. If a talking head at ESPN makes a prediction or a comment about the value of something that happened in the past, there is no way to verify how he reached that conclusion. With sabermetrics there is. And when there is a way to assess the accuracy of the conclusions, sabermetric approaches tend to "win".

Perhaps the easiest example of this is projected standings. Do a little googling and you'll find a number of studies which compare the accuracy of sabermetric projections of team win totals against the so-called experts. Over time, the sabermetric projections win out easily. If this isn't proof of the usefulness of the sabermetric approach -- and let's be honest, sabermetricians openly recognize that the approach has its limits -- then I'm not sure any evidence will satisfy you.

mace
03-10-2010, 12:17 PM
I agree with about 90-95% of this article, much as I agree with about 90-95% of advanced sabermetrics. The point at which I peel off is clearly put in the article. It's principle number one:

"1. Baseball is an individual sport.

This is perhaps the most important concept a person can understand about baseball."

My disagreement is that it's NOT an individual sport. Not entirely, anyway. If it were wholly an individual sport, what value would there be to Justin Lehr teaching Homer Bailey the split-finger; to Scott Rolen showing Brandon Phillips and younger Reds how to go about their business in a professional manner; to John Franco convincing David Weathers to step back and slow things down in a tight situation; to Derek Jeter setting a precedent for Yankees to stick around Tampa during the offseason and train in the team facility; to the second baseman who fakes a baserunner back to the bag and thereby prevents him from scoring on the ensuing single; to the veteran who renegotiates his contract (Rolen again, for example) so that a team can sign a free agent (Chapman, for example); to Graig Nettles identifying the pitches to come and passing along that information to Lou Piniella; to Kevin Millar firing up the Red Sox, down three games to none and having just given up 19 runs to the Yankees, before game four of the 2004 LCS; to any player who gives assistance, confidence or good advice to another?

There are no personal stats for any of the examples above. But there are team results for all of them. It's a team game. Maybe not as much as football or basketball or hockey or soccer; but it is, nevertheless.

M2
03-10-2010, 02:42 PM
IMO, the top problem dogging the stats community at the moment is that many of the newer stats aren't particularly good. At best they fail to shed new light on the game. At worst the numbers function only within their own vacuums.

For instance, wOBA is perfectly fine, but it's not telling me anything I didn't know before it existed. A dozen other stats were already clueing us in as to who the top hitters in the game are. While I appreciate the mathematical cleanliness of not double-counting hits (which is the problem with OPS), wOBA is overkill. A giant "So what?" in the well-mapped continent of offensive performance. Honestly, I've got to believe that a lot of people look at wOBA with the same befuddlement that overcomes them when they hear the latest street moniker for the word "house." Does the world really need another one of these?
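For anyone mystified by what wOBA actually does, under the hood it's just linear weights scaled to resemble OBP. A minimal sketch, assuming coefficients close to the originally published ones (real implementations recompute the weights each season, and the stat line below is hypothetical):

```python
# Approximate linear weights, scaled so league-average wOBA lands near
# league-average OBP; the exact coefficients are recalculated yearly.
WEIGHTS = {"bb": 0.72, "hbp": 0.75, "1b": 0.90, "2b": 1.24, "3b": 1.56, "hr": 1.95}

def woba(bb, hbp, singles, doubles, triples, hr, pa):
    """Weighted On-Base Average: each positive offensive event is credited
    with its average run value, and the total is divided by plate appearances."""
    numerator = (WEIGHTS["bb"] * bb + WEIGHTS["hbp"] * hbp
                 + WEIGHTS["1b"] * singles + WEIGHTS["2b"] * doubles
                 + WEIGHTS["3b"] * triples + WEIGHTS["hr"] * hr)
    return numerator / pa

# Hypothetical season line: 600 PA, 60 BB, 5 HBP, 100 1B, 30 2B, 3 3B, 25 HR
print(round(woba(60, 5, 100, 30, 3, 25, 600), 3))
```

The "no double-counting" property M2 mentions is visible in the structure: a double is credited once, at 1.24 runs, rather than once in OBP and again in SLG as OPS effectively does.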

FIP combines two poison pills: It's an incomplete slice of pitcher performance and it's predictive of nothing. Feel free to belittle ERA as much as you wish, but it is telling you something worth knowing (how many runs a pitcher is allowing). Yes, it has its blind spots. Yet we know what those blind spots are and can correct for them. FIP tells us that pitchers who strike out a lot of hitters while limiting homers and BBs are really good. Wow. Who'd have guessed? BTW, these pitchers also tend to be hard to hit, but FIP isn't going to tell you that directly.

So there's something like 10 good starting pitchers in MLB and then everything begins to get sticky. A guy like Bronson Arroyo exerts enough control to overcome the longballs he surrenders and his relative lack of Ks. A.J. Burnett gets enough whiffs and limits the HR enough to overcome his wildness. FIP will tell you Burnett is a better pitcher. Maybe he is, but when you park-adjust their ERAs, Arroyo is just as effective (and he consistently throws more innings). So what do you care about? Innings and performance or satisfying the dictates of a single stat? The point here is that after you get past the elite pitchers in the game, you've got to delve into complexities that FIP doesn't unravel.

There will be those who point out that ERA isn't terribly predictive of anything, and they will be right. It isn't. Again, we know its blind spots and have spent many years correcting for them. FIP isn't meaningfully predictive either. So why bother using it as an ERA replacement if I've got to do the same amount of correcting for it? I care about the runs. I don't particularly care about the latest good faith efforts to adequately weight the relative values of walks, homers and strikeouts while eschewing all other data.

And that doesn't even touch upon the standing issues with using linear weights, which don't go away no matter how much some number crunchers want to ignore them. That's an internal argument in the sabermetric community which doesn't really have much bearing on the game itself, but it is possible that the past 10 years of statistical interpretation have been spent running down a dead-end alley.

UZR we've been over. I'll just note that many UZR adherents were claiming the Yankees were headed for a defensive nightmare before the start of last season, when the team finished 7th overall in DER. Supposedly that was impossible with the personnel they put in the field. Perhaps that reflects a healthy dose of luck. Or perhaps that reflects unforeseen aid from the pitching. Or perhaps the Yankee fielders were better than UZR recognized. Doesn't really matter which one you want to pick, it means defense is still an enigma for the most part.

That enigma status is why the sabermetric community keeps trying to unravel defense. It's a noble pursuit, but people aren't admitting to what they don't know. For instance, Sports Illustrated ran a lengthy piece on how the Mariners are onto Moneyball 2.0 with run prevention. So supposedly everybody wants to be like them. Why? They got outscored by 52 runs last season and played 10 wins above their Pythag. Who in their right mind wants to model their team on that? That's not even a model, it's a happy accident. If the Mariners play the exact same way again, they've got a vastly better chance of losing 90 games than winning 90 games.
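
If you want to check that claim, the back-of-the-envelope Pythagorean math is straightforward (run totals below are the Mariners' approximate 2009 figures: roughly 640 scored, 692 allowed, against an actual 85-77 record):

```python
# Pythagorean expectation: a team's expected winning percentage from runs
# scored and allowed, times games played. Exponent 2 is the classic version.
def pythag_wins(runs_scored, runs_allowed, games=162, exponent=2):
    pct = runs_scored**exponent / (runs_scored**exponent + runs_allowed**exponent)
    return pct * games

# 2009 Mariners, approximately: outscored by ~52 runs.
print(round(pythag_wins(640, 692)))  # about 75 expected wins vs. 85 actual
```

Roughly 75 expected wins against 85 actual is the ~10-win overshoot M2 is talking about.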

Anyway, trying to boil all of this down into a workable summary, I think a lot of the problem with the stats community at the moment is that it has forgotten the stuff it was right about and begun to push a bunch of stats that are, if we're being honest, theoretical in nature. There needs to be a greater recognition of which numbers really matter to baseball fans and which ones exist primarily for stats hobbyists.

Falls City Beer
03-10-2010, 03:02 PM
Good stuff, M2. I'll only add: it's an applied science. But it's not even really a science, as far as that goes.

bucksfan2
03-10-2010, 03:08 PM
Very nice post M2!

lollipopcurve
03-10-2010, 03:10 PM
Good to see you back, M2.

Couldn't agree more with this:


There needs to be a greater recognition of which numbers really matter to baseball fans and which ones exist primarily for stats hobbyists.

Personally, I can't get over the suggestion that it takes 3 years of data to determine with any certainty how good a defender someone is. No team can make decisions based on a metric like that, so far as I can tell.

RedsManRick
03-10-2010, 04:10 PM
Good post, M2. But I think you've stumbled into one of the traps. Just because sabermetrics is using data and math to make projections doesn't mean its projections are perfectly accurate. It's not about getting perfect information, making perfect projections, and making decisions that work out well every time. It's about getting better information, making better projections, and making better decisions that are more likely to work out.

A great example of this is the fan scouting report Tango does. There is no "data" in these reports. It's just an aggregation of the input from amateur scouts everywhere. But this crowd-sourcing of information falls under the sabermetric umbrella. (And interestingly, it serves as a great baseline for comparison against defensive metrics.)

The sabermetric community has no doubt failed to communicate this point. Frankly, this is my biggest pet peeve and criticism of the intellectualism in our society generally. There's a lot of stuff that we simply don't know, cannot know, and cannot predict. The world is way too complicated. Every projection carries big error bars around it. Everybody tacitly understands this, particularly in their own field. But they often fail to apply it when others share their analyses and projections. So long as we're talking about projections and not predictions, nobody is asserting perfect knowledge.

As an aside, projections and predictions are very different things which often get used interchangeably. Projections convey a set of possible outcomes, with some sense of likelihood attached. PECOTA is a great example of this. Unfortunately, most people don't quite understand this and the people making the projections exacerbate the confusion by only providing the most likely outcome. And that's where predictions come in. Prediction is choosing one projection and stating that you believe it will occur. When you see sabermetricians post a future stat line or standings, understand that this is a projection -- the single value being shared is merely the peak of the mountain of possibilities.
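
To make the distinction concrete, here's a toy simulation (the player's talent level and at-bat count are invented): even if you knew a hitter's true talent exactly, a single season's outcome is a spread. The projection is that whole spread; a prediction picks one point from it.

```python
# Toy projection-vs-prediction demo: fix a hitter's "true" average, simulate
# many seasons of that same talent, and look at the spread of outcomes.
import random

random.seed(0)
TRUE_AVG, AB, SEASONS = 0.280, 550, 10_000

outcomes = sorted(
    sum(random.random() < TRUE_AVG for _ in range(AB)) / AB  # hits per AB
    for _ in range(SEASONS)
)
p10 = outcomes[int(SEASONS * 0.10)]
p50 = outcomes[int(SEASONS * 0.50)]
p90 = outcomes[int(SEASONS * 0.90)]
print(f"10th pct: {p10:.3f}  median: {p50:.3f}  90th pct: {p90:.3f}")
```

The 10th-to-90th percentile gap comes out close to 50 points of batting average from pure chance alone, with zero change in the player.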

In baseball in particular, the one truism I would love to see better understood is this: performance and true talent are different things. Projecting the future is about understanding the latter. It just so happens that past performance is one of the two big pieces of input into estimating true talent (qualitative assessment being the other). Sabermetrics largely focuses on getting better at estimating true talent. And insofar as we're getting better at projecting future performance, it seems that we're making progress.

(An aside: FIP does project something meaningful -- how many runs a pitcher will allow given a number of innings in the future. Sure, you could get a marginally more accurate number by including park factors, defense, GB/FB tendencies, etc., but just as you point out about OPS vs. wOBA, FIP (or some other defense independent pitching statistic) is the best simple tool we have for understanding a pitcher's true talent level.)

RedsManRick
03-10-2010, 04:18 PM
Personally, I can't get over the suggestion that it takes 3 years of data to determine with any certainty how good a defender someone is. No team can make decisions based on a metric like that, so far as I can tell.

Can you elaborate? I think it's pretty straightforward. It takes 3 years of data before UZR (in particular) stabilizes enough for it to be a good estimate of true talent and thus have predictive value. Before that, the noise that's inherent in defensive performance is too great. Sure, you can use it to describe what already happened -- just not to project what comes next.

Now, you can certainly argue that you can get better information, more quickly, via scouting. But one of the fundamental challenges of evaluating defense is that it's hard to pin down a way of defining what good defense is. Assuming it's simply who prevents the most runs, you can't really see that with your eyes -- you can only get a pretty good sense of the player's ability to contribute to overall run prevention. And one of the challenges here is that you still have to have an idea about what skills/events contribute to run prevention in what proportion -- the range vs. hands debate. That decision is being made by GMs every day, regardless of whether or not they're putting numbers to it.

The other advantage here is being able to value skills together. Given limited resources, how do you know how much to pay a guy? How do you know if his bat justifies his glove -- or vice versa? I guess you could use your intuition if you'd like, but again, the same calculus is being made. UZR, VORP and other stats which convert performance into the common currency of runs simply put the calculus on paper so that you are consistent in applying it and can more easily engage others. I don't think anybody would suggest making decisions on the basis of WAR alone... But whether or not you're using a number to describe it, all GMs are coming up with an estimate of overall expected production.
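
To make the common-currency idea concrete, a toy sketch (every component value below is invented, and ~10 runs per win is the standard rule of thumb, not gospel):

```python
# "Common currency" sketch: value every skill in runs, then convert runs to
# wins. All component values are invented; ~10 runs/win is the rule of thumb.
RUNS_PER_WIN = 10.0

def wins_above_replacement(batting_runs, fielding_runs, replacement_runs):
    return (batting_runs + fielding_runs + replacement_runs) / RUNS_PER_WIN

# Good bat / poor glove vs. modest bat / great glove can come out identical:
print(wins_above_replacement(25.0, -8.0, 20.0))
print(wins_above_replacement(5.0, 12.0, 20.0))
```

Which is exactly the point: the number doesn't decide for you, it just forces the bat-vs-glove tradeoff onto paper where it can be applied consistently.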

Falls City Beer
03-10-2010, 04:31 PM
Good post, M2. But I think you've stumbled in to one of the traps. Just because sabermetrics is using data and math to make projections doesn't mean it's projections are perfectly accurate. It's not about getting perfect information, making perfect projections, and making decisions that work out well every time. It's about getting better information, making better projections, and making better decisions that are more likely to work out.

A great example of this is the fan scouting report Tango does. There is no "data" in these reports. It's just an aggregation of the input from amateur scouts everywhere. But this crowd-sourcing of information falls under the sabermetric umbrella. (and interestingly, it serves as great baseline for comparison against defensive metrics)

The sabermetric community has no doubt failed to communicate this point. Frankly, this is my biggest pet peeve and criticism of the intellectualism in our society generally. There's a lot of stuff that we simply don't know, cannot know, and cannot predict. The world is way too complicated. Every projection carries big error bars around it. Everybody tacitly understands this, particularly in their own field. But they often fail to apply it when others share their analyses and projections. So long as we're talking about projections and not predictions, nobody is asserting perfect knowledge.

As aside, projections and predictions are very different things which often get used interchangeably. Projections convey a set of possible outcomes, with some sense of likelihood attached. PECOTA is a great example of this. Unfortunately, most people don't quite understand this and the people making the projections exacerbate the confusion by only providing the most likely outcome. And that's where predictions come in. Prediction is choosing one projecting and stating that you believe it will occur. When you see sabermetricians post a future stat line or standings, understand that this is a projection -- the single value being shared is merely peak of the mountain of possibilities.

In baseball in particular, the one truism I would love to see better understood is this: performance and true talent are different things. Projecting the future is about understanding the latter. It just so happens that past performance is one of the two big pieces of input in to estimating true talent (qualitative assessment being the other). Sabermetrics largely focuses on getting better about estimating true talent. And in so far as we're getting better at projecting future performance, it seems that we're making progress.

(An aside, FIP does project something meaningful -- how many runs a pitcher will allow given number of innings in the future. Sure, you could get a marginally more accurate number by including park factors, defense, GB/FB tendencies, etc., but just as you point out about OPS vs. wOBA, FIP (or some other defense independent pitching statistic) is the best simple tool we have for understanding a pitcher's true talent level.

I'm not speaking for M2, but I'm almost certain he understands all the above points and really hasn't fallen into any trap.

From my standpoint, I like the pure science-ish lack of arrogance and joy of discovery for its own sake of sabremetrics, as well as quite a bit of the rigor and legwork that goes into it. I think it's ingenious and I think it stumbles upon many important, useful pieces. I also don't view it as a sect or a camp--as many of its detractors attempt to paint it.

Nevertheless, it falls short of what I think some of its less-humble thinkers claim it is: a science. It's not, and it never will be. It's reasonably healthy for any baseball fan (or GM or owner) to acknowledge that as a premise.

jojo
03-10-2010, 04:44 PM
I'm not speaking for M2, but I almost certainly think he understands all the above points, and really hasn't fallen into any trap.

From my standpoint, I like the pure science-ish lack of arrogance and joy of discovery for its own sake of sabremetrics, as well as quite a bit of the rigor and legwork that goes into it. I think it's ingenuous and I think it stumbles upon many important useful pieces. I also don't view it as a sect or a camp--as many of its detractors attempt to paint it.

Nevertheless, it falls short of what I think some of its less-humble thinkers claim it is: a science. It's not, and it never will be. It's reasonably healthy for any baseball fan (or GM or owner) to acknowledge that as a premise.

Sabermetrics as an activity asks questions, uses data/statistics to find an answer and the work/conclusions generally get challenged by others who are interested in similar issues. I guess one can argue about whether it is a science or not but I don't see the point.

Does "sabermetric analysis" help us understand baseball, and does it do so in a way that is unique, in that other approaches can't provide similar insight? For the most part, absolutely.

BTW, the concept of replacement level/marginal value, DIPS theory (FIP), defensive analysis, wOBA and its ability to translate into runs, WAR (and its correlation to team RS/RA and market values) aren't hobby stats. They're significant advances in our understanding of the game and they have impacted how major league clubs make decisions.

Seriously, labeling "statheads" as arrogant is mostly an ad hominem rationalization for ignoring an awful lot of contribution.... Tango can be insufferable. That's a separate issue from whether his contributions have increased understanding of the game (or, to word it differently, whether a lot of insight can be achieved by entertaining his arguments). Several major league clubs have thought enough of his contributions to pay him for what they perceive to be his insights.

bucksfan2
03-10-2010, 04:44 PM
Can you elaborate? I think it's pretty straight forward. It takes 3 years of data before UZR (in particular) stabilizes enough for it to be a good estimate of true talent and thus have predictive value. Before that, the noise that's inherent in defensive performance is too great. Sure, you can use it to

In order to get an accurate UZR you need 3 years of data. In order to get an accurate judgment of one's defensive ability using scouting, it doesn't take nearly as long. Heck, I would venture to say that anyone who watched the Reds towards the end of last season got a pretty good idea of the kind of defense Drew Stubbs can provide. It didn't take much watching the Reds last year to know that Scott Rolen was a vast upgrade over Edwin.

Do you need to find the exact difference between Edwin and Rolen? Will the advanced numbers be solid enough to validate that difference? I would imagine you could walk up to any Reds pitcher and ask him how big of a difference it was with Rolen at 3B instead of Edwin and the answer would be pretty staggering, but if asked to quantify their answer you would get a confused look.


Now, you can certainly argue that you can get better information, more quickly via scouting. But one of the fundamentally challenges of evaluating defense is that it's hard to pin down a way of defining what good defense is. Assuming it's simply who prevents the most runs, you can't really see that with your eyes -- you can only get a pretty good sense of the player's ability to contribute to overall run production. And one of the challenges here is that you still have to have an idea about what skills/events contribute to run prevention in what proportion -- the range vs. hands debate. That decision is being made by GMs every day, regardless of whether or not they're putting numbers to it.

I really think that it is very difficult to define good defense and put a number on the amount of runs saved. There's too much subjectivity in the eventual answer. Let's look at CF and the different things that could have an impact on a given ball. Is the CF positioned wrong? Is the CF favoring the LF because he is poor defensively? Does the CF get a good jump? Did the CF slip? Is the CF playing his first games in a new stadium? Did he get a bad read off the ball? Was the CF distracted by the sun/lights? How is that all factored correctly into a defensive rating?

I remember reading an article about Andruw Jones in his heyday. It was about how the Braves pitchers loved how Jones played CF, and loved that he played a shallower CF than most in the league. How can that be calculated into Jones's defensive ratings, as well as the extra confidence a pitcher gets on the mound? Granted, the pitchers weren't exactly chopped liver.


The other advantage here is being able to value skills together. Given limited resources, how do you know how much to pay a guy. How do you know if his bat justifies his glove --- or visa versa. I guess you could use your intuition if you'd like, but again, the same calculus is being made. UZR, VORP and other stats which convert performance in to the common currency of runs simply put the calculus on paper so that you are consistent in applying it and can more easily engage others. I don't think anybody would suggest making decisions on the basis of WAR alone... But whether or not you're using a number to describe it, all GMs are coming up with an estimate of overall expected production.

Sure you want the most exact answer as possible. I think that is why both sides are brought to the table when deciding on signing a player or not. Although I think when you are dealing with amateur players scouting becomes of the utmost importance.

jojo
03-10-2010, 04:55 PM
In order to get an accurate judgment of ones defensive ability, using scouting, it doesn't take nearly as long.

How can one really know that?


Do you need to find the exact difference between a Edwin and Rolen?

The more data and the higher the quality of the data, the better the decision... or as you put it:


Sure you want the most exact answer as possible. I think that is why both sides are brought to the table when deciding on signing a player or not. Although I think when you are dealing with amateur players scouting becomes of the utmost importance.

edabbs44
03-10-2010, 05:02 PM
How can one really know that?

Do you think that it is difficult for an objective person to recognize a good defensive player without the use of "advanced statistics?"

dougdirt
03-10-2010, 05:17 PM
FIP combines two poison pills: It's an incomplete slice of pitcher performance and it's predictive of nothing. Feel free to belittle ERA as much as you wish, but it is telling you something worth knowing (how many runs a pitcher is allowing). Yes, it has its blind spots. Yet we know what those blind spots are and can correct for them. FIP tells us that pitchers who strike out a lot of hitters while limiting homers and BBs are really good. Wow. Who'd have guessed? BTW, these pitchers also tend to be hard to hit, but FIP isn't going to tell you that directly.
But the problem is that "we," being all fans, don't know that ERA has its blind spots and that it isn't predictive.

dougdirt
03-10-2010, 05:22 PM
Do you think that it is difficult for an objective person to recognize a good defensive player without the use of "advanced statistics?"

I think it depends on the person. Lots of people handed Derek Jeter gold glove after gold glove, yet he is known to be a horrible defender for almost his entire career.

Falls City Beer
03-10-2010, 05:25 PM
Sabermetrics as an activity asks questions, uses data/statistics to find an answer and the work/conclusions generally get challenged by others who are interested in similar issues. I guess one can argue about whether it is a science or not but I don't see the point.

Does "sabermetric analysis" help us understand baseball and does it do so in a way that is unique in so much that other approaches can't provide similar insight? For the most part, absolutely.

BTW, the concept of replacement level/marginal value, DIPs theory (FIP), defensive analysis, wOBA and it's ability to translate into runs, WAR (and it's correlation to team RS/RA and market values) aren't hobby stats. They're significant advances in our understanding of the game and they have impacted how major league clubs make decisions.

Seriously, labeling "statheads" as arrogant is mostly an ad hominen rationalization for ignoring an awful lot of contribution.... Tango can be insufferable. That's a separate issue than whether his contributions have increased understanding of the game (or to word it differently whether a lot of insight can be achieved by entertaining his arguments). Several major league clubs have thought enough of his contributions to pay him for what they perceive to be his insights.

I'm not sure anyone's argued that sabremetric analysis hasn't impacted the sport tremendously.

edabbs44
03-10-2010, 05:37 PM
I think it depends on the person. Lots of people handed Derek Jeter gold glove after gold glove, yet he is known to be a horrible defender for almost his entire career.

These voters don't watch these guys play game after game and, for the most part, could be considered to be less than objective.

Watch the games and focus on Jeter and you can see that he isn't the best fielder at that position.

dougdirt
03-10-2010, 06:14 PM
These voters don't watch these guys play game after game and, for the most part, could be considered to be less than objective.

Watch the games and focus on Jeter and you can see that he isn't the best fielder at that position.

The people who vote for the gold gloves are the managers and coaches of the major league teams.

pahster
03-10-2010, 06:43 PM
Good stuff, M2. I'll only add: it's an applied science. But it's not even really a science, as far as that goes.

If it's a science, its practitioners are the worst scientists ever. A lot of what I read seems to be entirely atheoretical in nature. People seem to be using "theory" as a pejorative term in this thread, which is a mistaken position to take. The results of statistical tests hold no meaning without a theory to drive the inferences we make.

The other major problem with a lot of the stuff people who fancy themselves statisticians produce is that they don't really know what they're doing. I rarely if ever see the results of a regression; instead, I see the predictions produced by it. These predictions may not even mean anything. I never see standard errors, which makes me assume that the person who did the work either doesn't know what they are or knows his or her results wouldn't hold up if they reported them (I have no doubt, for example, that the 95% confidence intervals around UZR point estimates are enormous).
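
To put a rough number on that last parenthetical: even a simple normal-approximation interval on a fielder's play-conversion rate is wide after a full season of chances. All inputs below are invented, and this is not how UZR is actually computed:

```python
# Normal-approximation 95% CI for a fielder's play-conversion rate.
# All inputs invented; this is NOT how UZR is actually computed.
import math

def ci_95(successes, chances):
    p = successes / chances
    se = math.sqrt(p * (1 - p) / chances)  # standard error of a proportion
    return p - 1.96 * se, p + 1.96 * se

# One full season: say 340 plays made on 400 chances.
low, high = ci_95(340, 400)
print(f"95% CI: {low:.3f} to {high:.3f}")  # roughly .815 to .885
```

A seven-percentage-point spread in conversion rate is dozens of plays; valued at a fraction of a run per play, that easily covers a 15-to-20-run range on the season.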

jojo
03-10-2010, 06:48 PM
These voters don't watch these guys play game after game and, for the most part, could be considered to be less than objective.

Watch the games and focus on Jeter and you can see that he isn't the best fielder at that position.

"These voters" are managers and coaches....i.e. the guys making up the lineups etc.... Who watches more games?

Mario-Rijo
03-10-2010, 06:51 PM
"These voters" are managers and coaches....i.e. the guys making up the lineups etc.... Who watches more games?

Well when does Dusty watch Chase Utley other than 6ish games a year?

edabbs44
03-10-2010, 06:53 PM
The people who vote for the gold gloves are the managers and coaches of the major league teams.

Who have better things to do than watch every player on every pitch. Do you think Dusty is watching the Nationals every night and seeing what Zimmerman is doing? He sees him 6 times per year (while he is also managing a team), plus whatever highlights make ESPN when he has the time to catch them.

jojo
03-10-2010, 06:58 PM
If it's a science, its practitioners are the worst scientists ever. A lot of what I read seems to be entirely atheoretical in nature. People seem to be using "theory" as a pejorative term in this thread, which is a mistaken position to take. The results of statistical tests hold no meaning without a theory to drive the inferences we make.

The other major problem with a lot of the stuff people who fancy themselves statisticians produce is that they don't really know what they're doing. I rarely if ever see the results of a regression; instead, I see the predictions produced by it. These predictions may not even mean anything. I never see standard errors, which makes me assume that the person who did the work either doesn't know what they are or knows his or her results wouldn't hold up if they reported the them (I have no doubt, for example, that the 95% confidence intervals around UZR point estimates are enormous).

Since you mentioned UZR... Tango and MGL wax poetic about confidence intervals and regression etc.... Guys like Pinto, Russell Carleton (aka Pizza Cutter) etc. actually get criticized by antistat people for being too pedantic statistically....

Let's be careful when conflating anyone who quotes a stat on the internet with the smaller group of individuals who are pushing the envelope, so to speak...

All of that said, what are the confidence intervals for the guy scouting high school center fielders?

jojo
03-10-2010, 06:59 PM
Well when does Dusty watch Chase Utley other than 6ish games a year?

How many scouts watch any player more than 6ish games? Dusty played the game too...his eyes should be golden shouldn't they?

jojo
03-10-2010, 07:03 PM
Do you think that it is difficult for an objective person to recognize a good defensive player without the use of "advanced statistics?"

Ignoring the straw in your question, you seem to be arguing in this thread that a guy who played 19 seasons in the majors and managed in the majors another 16 years can't do it by watching a guy play a week's worth of games...

I'm not so conflicted...I think the best answer comes from combining scouting and sabermetrics....

Mario-Rijo
03-10-2010, 07:16 PM
How many scouts watch any player more than 6ish games? Dusty played the game too...his eyes should be golden shouldn't they?

Not to be a wise guy but how do you know how many times a scout is seeing a guy? From high school to the pros he may see this guy dozens of times.

M2
03-10-2010, 07:26 PM
Good post, M2. But I think you've stumbled in to one of the traps. Just because sabermetrics is using data and math to make projections doesn't mean it's projections are perfectly accurate.

I don't know what I typed to make you think I believe projections should be perfectly accurate, but I don't. Never have. I've known for decades that stats projections are an imperfect science and have never been under the delusion that they'll ever be better than a general estimate.


(An aside, FIP does project something meaningful -- how many runs a pitcher will allow given number of innings in the future. Sure, you could get a marginally more accurate number by including park factors, defense, GB/FB tendencies, etc., but just as you point out about OPS vs. wOBA, FIP (or some other defense independent pitching statistic) is the best simple tool we have for understanding a pitcher's true talent level.

I couldn't disagree more. In fact, every FIP study I've ever seen has been yet another proof point that FIP isn't very good at projecting future performance (certainly never falling close to within a range that a discerning study would consider "good" at projecting future performance). I read the studies and I've yet to encounter one that's impressed me. Certainly I've yet to see an argument that FIP is telling me more than I'd get from a standard stats line. Far as I can tell, FIP can tell you Tim Lincecum is a good pitcher, yet it gets really funky when confronted with pitchers who overcome their flaws ... and that's most pitchers. So it's a stat that I'm going to want to dig beneath, just like ERA, which I'm already digging beneath quite happily. And in the meantime, when push comes to shove and I just want to know who had the best season regardless of underlying factors, I'll take ERA over FIP every time.

M2
03-10-2010, 07:34 PM
Tango can be insufferable.

On his good days. Honestly, I don't think anyone has set statistical interpretation as far back as Tango. He's joyless and most of what he does is create a cacophony of numbers that exist mostly to serve themselves.

pahster
03-10-2010, 07:36 PM
Since you mentioned UZR.......tango and MGL wax poetic about confidence intervals and regression etc.... Guys like Pinto, Russell Carleton (aka pizza cutter) etc actually get criticized by antistat people for being too pendantic statistically....

Lets be careful when conflating anyone who quotes a stat on the internet with the smaller group group of individuals who are pushing the envelope so to speak...

All of that said, what is the confidence intervals for the guy scouting highschool centerfielders?

I'm mostly talking about BP and stuff I read on Fangraphs.

Even Tango doesn't get into much detail, at least not in The Book. It's got a decent appendix which shows that they know what the binomial and multinomial distributions are, but I see two pages that discuss confidence intervals for wOBA and nothing else. Where's the regression output from which they produce their run expectancies? I see what I presume are the coefficients, but I don't see any standard errors/t-values/z-values/p-values/confidence intervals or anything else. Are the errors heteroscedastic? Does the model suffer from multicollinearity (it should)? If so, did they correct for it? How did they correct for it? I don't know, because it never gets reported.

I'm not familiar with Pinto or Carleton, so I can't really comment on their work.

As for the scout, if they're good a GM should be fairly confident in their evaluations. If they're not, they shouldn't be. There's really no other choice when it comes to high school athletes because the disparity in talent is often enormous and the level of competition is far from uniform, not to mention the fact that some kids are still physically growing.

dougdirt
03-10-2010, 07:40 PM
Not to be a wise guy but how do you know how many times a scout is seeing a guy? From high school to the pros he may see this guy dozens of times.

Because scouts tend to have certain assignments. The big league scouts don't watch the minor league guys. Most minor league scouts tend to stick to a certain level, though some do range levels. The guys who scout the high school and college players aren't usually the same ones who scout the minor leaguers.

Falls City Beer
03-10-2010, 07:50 PM
All of that said, what is the confidence intervals for the guy scouting highschool centerfielders?

This is sort of the problem in thinking--the argument goes: sabremetrics ain't perfect, but they're the best game in town. Pretty soon for some, they become the only game in town--groupthink clusters around a set of opinions or conclusions and then organizations are once again wandering around in the dark or in-fighting. Now that's not the fault of pure sabremetrics of course; but as ideas get applied to methods (and remember most systems like businesses don't grasp nuance terribly well), a whole lot can get lost. So maybe the extent of the efficacy of sabremetric methods should wait on its own verdict. In other words, I'm not convinced that the narrative of sabremetrics as the doctrine of MLB GM-ing is ready to be written or that it should be written.

westofyou
03-10-2010, 07:51 PM
I'm not familiar with Pinto or Carleton, so I can't really comment on their work.


Pinto used to be a stats guy for BBT on ESPN

http://www.baseballmusings.com/

RedsManRick
03-10-2010, 08:11 PM
If it's a science, its practitioners are the worst scientists ever. A lot of what I read seems to be entirely atheoretical in nature. People seem to be using "theory" as a pejorative term in this thread, which is a mistaken position to take. The results of statistical tests hold no meaning without a theory to drive the inferences we make.

The other major problem with a lot of the stuff people who fancy themselves statisticians produce is that they don't really know what they're doing. I rarely if ever see the results of a regression; instead, I see the predictions produced by it. These predictions may not even mean anything. I never see standard errors, which makes me assume that the person who did the work either doesn't know what they are or knows his or her results wouldn't hold up if they reported them (I have no doubt, for example, that the 95% confidence intervals around UZR point estimates are enormous).

This is a great point. There are people who do a crappy job in every profession or hobby, and sabermetrics is no exception. Of course, it's important not to throw out the baby with the bathwater. But because of the nature of the field, with its (occasionally) complex math and numerous abbreviations, it's all the more difficult to tell a bad sabermetrician from a good one.

I would definitely LOVE to see more stats communicated along with their 95% CI. You want a single season UZR figure, here it is. But if you want to use it to estimate true skill, be aware that all we can say is that it falls within a 15 run spread (my made up number) either way...

As Jojo pointed out, if only there was a way to produce the confidence intervals for scouts. The same principles of variance and sample sizes apply...
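For illustration only, here's what that kind of reporting might look like in a few lines of Python. The +8 UZR and the 7.5-run standard error are invented numbers (echoing the made-up 15-run spread above), not real estimates of UZR's reliability:

```python
def ci95(point_estimate, std_error):
    """95% confidence interval around a point estimate, assuming
    approximately normal sampling error (1.96 standard errors each way)."""
    half_width = 1.96 * std_error
    return (point_estimate - half_width, point_estimate + half_width)

# Invented numbers: a single-season UZR of +8 runs, with an assumed
# standard error of 7.5 runs (i.e., the hypothetical ~15-run spread above).
low, high = ci95(8.0, 7.5)
print(f"Single-season UZR: +8.0 (95% CI: {low:+.1f} to {high:+.1f} runs)")
```

Which is the whole point: a defender who "measured" eight runs above average in one season could plausibly be anywhere from well below average to elite, and honest reporting would say so.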

Though I think we can all agree on the point of the article I posted to start the thread. Sabermetricians can do a much better job communicating their findings and openly discussing the limitations of their approach. I wonder if the scouting community (and any other profession which involves making projections and measuring value...) is willing to do the same?

In the best organizations, I don't think this is even a debate. All information should be brought to the table and considered in the context of what it tells us -- and what it doesn't. It's a shame that personal hubris and ineffective messengers get in the way of this happening with the baseball fan community at large.

edabbs44
03-10-2010, 08:11 PM
Ignoring the straw man in your question, you seem to be arguing in this thread that a guy who played 19 seasons in the majors and managed in the majors for another 16 years can't do it by watching a guy play a week's worth of games...

I'm not so conflicted...I think the best answer comes from combining scouting and sabermetrics....

Sitting in the dugout isn't really the best vantage point around, especially when you are really worried about a multitude of other things. Difficult to appreciate a specific fielder in that way, let alone 9 of them.

mth123
03-10-2010, 08:53 PM
If it's a science, its practitioners are the worst scientists ever. A lot of what I read seems to be entirely atheoretical in nature. People seem to be using "theory" as a pejorative term in this thread, which is a mistaken position to take. The results of statistical tests hold no meaning without a theory to drive the inferences we make.

The other major problem with a lot of the stuff people who fancy themselves statisticians produce is that they don't really know what they're doing. I rarely if ever see the results of a regression; instead, I see the predictions produced by it. These predictions may not even mean anything. I never see standard errors, which makes me assume that the person who did the work either doesn't know what they are or knows his or her results wouldn't hold up if they reported them (I have no doubt, for example, that the 95% confidence intervals around UZR point estimates are enormous).

Exactly.

It's just a mathematical way of making a SWAG, but since there is a formula involved it's sold as an exact science and "they know they are right."

RedsManRick
03-10-2010, 09:27 PM
Exactly.

It's just a mathematical way of making a SWAG, but since there is a formula involved it's sold as an exact science and "they know they are right."

Just curious, but who is selling it as an exact science? I don't know of anybody who does, just the occasional characterization as such from the anti-sabermetric, non "basement-dwelling" crowd who aren't familiar with the common understanding regarding variability and the projection/prediction business I described above. No intelligent person who uses math to produce projections and valuations professionally believes their end numbers to be absolute truth; just the best guess we can make given the available information. As for me personally, when I said science, I was referring to the scientific approach of hypothesis, rigorous testing, and open peer review of methodologies.

As for the "they know they are right" comment, I don't believe the author was referring to the possession of absolute truth. Rather, it is that they are "right" insofar as the principles of the sabermetric approach are sound, that quantitative analysis produces meaningful, useful insights, and that you don't have to be in the industry to "know how it works".

mth123
03-10-2010, 09:57 PM
Just curious, but who is selling it as an exact science? I don't know of anybody who does, just the occasional characterization as such from the anti-sabermetric, non "basement-dwelling" crowd who aren't familiar with the common understanding regarding variability and the projection/prediction business I described above. No intelligent person who uses math to produce projections and valuations professionally believes their end numbers to be absolute truth; just the best guess we can make given the available information. As for me personally, when I said science, I was referring to the scientific approach of hypothesis, rigorous testing, and open peer review of methodologies.

As for the "they know they are right" comment, I don't believe the author was referring to the possession of absolute truth. Rather, it is that they are "right" insofar as the principles of the sabermetric approach are sound, that quantitative analysis produces meaningful, useful insights, and that you don't have to be in the industry to "know how it works".

Go back and read some of the threads on here.

Statistical analysis is a good thing. But the sooner people understand that it's the starting point for understanding and not the final word on the matter, the better off those who are its biggest proponents will be.

jojo
03-10-2010, 10:51 PM
Go back and read some of the threads on here.

Statistical analysis is a good thing. But the sooner people understand that it's the starting point for understanding and not the final word on the matter, the better off those who are its biggest proponents will be.

You're really overselling your position....

Ron Madden
03-11-2010, 03:42 AM
I'm still a babe in the woods when it comes to statistical analysis.

I've come to trust offensive stats like OBP, SLG and OPS. I believe in the Pythagorean expectation.

These stats are not 100% predictive, but they come pretty darn close.
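For anyone new to it, the Pythagorean formula mentioned here is simple enough to write down. This is a sketch of the standard version; the exponent (2 classically, roughly 1.83 in common refinements) is a modeling convention, not anything exact:

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=1.83):
    """Estimate a team's wins from runs scored and allowed using the
    Pythagorean expectation: win% ~ RS^k / (RS^k + RA^k)."""
    rs, ra = runs_scored ** exponent, runs_allowed ** exponent
    return games * rs / (rs + ra)

# A team scoring 800 runs and allowing 700 projects to roughly 91 wins
print(round(pythagorean_wins(800, 700)))
```

It typically lands within a handful of wins of a team's actual record, which is the "pretty darn close" part.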

So far, I have very little faith in defensive metrics but I'll try to keep an open mind.

RFS62
03-11-2010, 09:12 AM
Fantastic post, M2

jojo
03-11-2010, 09:38 AM
Not to be a wise guy but how do you know how many times a scout is seeing a guy? From high school to the pros he may see this guy dozens of times.

How many players (or what percentage of players) does a scout really get to watch "dozens of times" before he fires a "definitive" report off to the suits and ties? It's an honest question...

nate
03-11-2010, 09:40 AM
Go back and read some of the threads on here.

Statistical analysis is a good thing. But the sooner people understand that it's the starting point for understanding and not the final word on the matter, the better off those who are its biggest proponents will be.

The same can be said for proponents of traditional stats as well as the "I use my eyes" crowd.

membengal
03-11-2010, 09:44 AM
I am still mentally high-fiving M2's post.

lollipopcurve
03-11-2010, 10:04 AM
You know, statistically-derived knowledge is one thing, but it is not the only thing, and I think the folks who have a deep interest in statistically-derived knowledge tend to overlook that there are different paths to knowing important things about baseball players. And not only are there different paths, there are shortcuts...


Consider these quotes from Malcolm Gladwell's book Blink:

There can be as much value in the blink of an eye as in months of rational analysis.

If we are to learn to improve the quality of the decisions we make, we need to accept the mysterious nature of our snap judgments.

Truly successful decision making relies on a balance between deliberate and instinctive thinking.

You can learn as much - or more - from one glance at a private space as you can from hours of exposure to a public face.

In my view, good scouts can arrive at accurate judgments about players faster than statisticians can collect enough data about those players that they feel their judgments can be trusted. This is especially important in amateur scouting, of course, but also true in the minor leagues and major leagues. I am not saying that good scouts are infallible, only that they have the ability to make accurate judgments more often than not -- and the speed with which they can do so makes their methodology superior in an important way to a stats-based approach. Now, statistical advances have been helping scouts broaden their approach, but I suspect that for the best scouts it's not by as wide a margin as some would like to believe.

So, stating things generally, organizations that trust their scouts -- that trust what those guys see in what seems like a blink of an eye, compared to the "exposure time" stats guys are using -- are going to move more quickly and be more flexible than organizations that always wait until the samples are statistically significant.

Falls City Beer
03-11-2010, 10:40 AM
The same can be said for proponents of traditional stats as well as the "I use my eyes" crowd.

Though the opening post is what we're really talking about here--I'm not sure there's been an active preference for the "scout" position. I think some bristle at the article in the opening post and its veiled dig at people it assumes can't do some pretty basic statistical analysis. Quantum physics it ain't. I think a bunch of folks understand the math used by sabermetricians but still question its efficacy. That's a perfectly defensible position.

jojo
03-11-2010, 10:44 AM
You know, statistically-derived knowledge is one thing, but it is not the only thing, and I think the folks who have a deep interest in statistically-derived knowledge tend to overlook that there are different paths to knowing important things about baseball players. And not only are there different paths, there are shortcuts...


Consider these quotes from Malcolm Gladwell's book Blink:

There can be as much value in the blink of an eye as in months of rational analysis.

If we are to learn to improve the quality of the decisions we make, we need to accept the mysterious nature of our snap judgments.

Truly successful decision making relies on a balance between deliberate and instinctive thinking.

You can learn as much - or more - from one glance at a private space as you can from hours of exposure to a public face.

In my view, good scouts can arrive at accurate judgments about players faster than statisticians can collect enough data about those players that they feel their judgments can be trusted. This is especially important in amateur scouting, of course, but also true in the minor leagues and major leagues. I am not saying that good scouts are infallible, only that they have the ability to make accurate judgments more often than not -- and the speed with which they can do so makes their methodology superior in an important way to a stats-based approach. Now, statistical advances have been helping scouts broaden their approach, but I suspect that for the best scouts it's not by as wide a margin as some would like to believe.

So, stating things generally, organizations that trust their scouts -- that trust what those guys see in what seems like a blink of an eye, compared to the "exposure time" stats guys are using -- are going to move more quickly and be more flexible than organizations that always wait until the samples are statistically significant.

It's true, IMHO, that the earlier in player development one goes, the less efficacious (less useful) sabermetrics becomes. That said, I don't know of any leading stathead (i.e. the guys who push the boundaries with their work and either consult for MLB teams or are prominent sabermetric writers) who doesn't argue for a marriage of stats and scouting. The "stats-only" approach simply isn't a majority-held stathead view. Even when this notion was being oversold by Lewis in Moneyball, the focus was on college players, i.e. risk was being managed by shortening the player development process...

It is also true that relying on the eyes (and let's not forget the eyes are connected directly to the brain, so there's more going on than just blinking in the scouting process) can result in faster decisions (or, maybe more correctly stated, earlier decisions). That said, moving very quickly is only a positive thing if the movement is in the right direction....

And that's kind of the point with sabermetrics...

Let's assume a model where the sabermetrics-only approach uses the best possible metrics, the scouting-only approach uses the best possible scouting system, and the blended approach uses the best possible version of both poles on the spectrum.

The blended approach should be expected to be right a higher percentage of the time than the approaches at either pole.

Now, the reality is that it's a lot messier than that, as life is a bell curve (a continuum of all the possible combinations; i.e. not all saber or scouting efforts are equal, so there are lots of combinations)... Reliance on lousy metrics hurts the decision making process. Reliance on lousy scouting hurts the decision making process. When the two endeavors align, it's a good sign (what are the chances lousy stats and lousy scouting are going to continually give the same answer in a competent FO?)....

The ideal is a blend because it should be expected to produce the largest amount of the highest quality data...
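A toy simulation makes the "blend beats either pole" claim concrete. Everything here is an assumption for illustration (the noise levels, the independence of the two views, the inverse-variance weighting); the point is only the direction of the result: two independent, unbiased evaluations combined carry less error than either alone.

```python
import random

random.seed(42)

TRUE_TALENT = 0.0            # the player's actual value, in runs above average
STAT_SD, SCOUT_SD = 8.0, 6.0 # assumed (made-up) noise in each evaluation method
TRIALS = 20_000

def rmse(errors):
    """Root-mean-square error of a list of estimation errors."""
    return (sum(e * e for e in errors) / len(errors)) ** 0.5

# Weight on the stats estimate: inverse-variance weighting
w_stat = (1 / STAT_SD**2) / (1 / STAT_SD**2 + 1 / SCOUT_SD**2)

stat_err, scout_err, blend_err = [], [], []
for _ in range(TRIALS):
    stat_view = random.gauss(TRUE_TALENT, STAT_SD)
    scout_view = random.gauss(TRUE_TALENT, SCOUT_SD)
    blended = w_stat * stat_view + (1 - w_stat) * scout_view
    stat_err.append(stat_view - TRUE_TALENT)
    scout_err.append(scout_view - TRUE_TALENT)
    blend_err.append(blended - TRUE_TALENT)

print(f"stats alone:  {rmse(stat_err):.2f}")
print(f"scouts alone: {rmse(scout_err):.2f}")
print(f"blended:      {rmse(blend_err):.2f}")  # smallest of the three
```

Under these made-up numbers the blend's theoretical error is sqrt(1/(1/64 + 1/36)), about 4.8 runs, versus 8 and 6 for the two methods on their own.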

Is it really the statheads who are bristling at that argument?

RedsManRick
03-11-2010, 11:04 AM
Regarding scouts and stats, I'm reminded of the Reagan quote "trust, but verify." I think this goes both ways. Both sides are trying to figure out the same things, but both have blind spots. Everybody benefits when they are open to more information....

As for sabermetrics in general, I came across this blog post through Tango's blog and found it quite relevant to this conversation. More than the metrics or conclusions, I think the biggest frustration from the sabermetric perspective is the lack of "statistical thinking" among many who aren't familiar or comfortable with sabermetric techniques.

The basic thesis the author puts forth is to keep an open mind in the quest for knowledge -- and it's good advice for everybody at the table.

http://flowingdata.com/2010/03/04/think-like-a-statistician-without-the-math/



Think like a statistician – without the math
Posted Mar 4, 2010 to Data Design Tips, Statistics / 49 comments

Think like a statistician – without the math

I call myself a statistician, because, well, I'm a statistics graduate student. However, ask me specific questions about hypothesis tests or required sampling size, and my answer probably won't be very good.

The other day I was trying to think of the last time I did an actual hypothesis test or formal analysis. I couldn't remember. I actually had to dig up old course listings to figure out when it was. It was four years ago during my first year of graduate school. I did well in those courses, and I'm confident I could do that stuff with a quick refresher, but it's a no go off the cuff. It's just not something I do regularly.

Instead, the most important things I've learned are less formal, but have proven extremely useful when working/playing with data. Here they are in no particular order.

Attention to Detail

Oftentimes it's the little things that end up being the most important. There was this one time in class when my professor put up a graph on the projector. It was a bunch of data points with a smooth fitted line. He asked what we saw. Well, there was an increase in the beginning, a leveling off in the middle, and then another increase. However, what I missed was the little blip in the curve in the first increase. That was what we were after.

The point is that trends and patterns are important, but so are outliers, missing data points, and inconsistencies.

See the Big Picture

With that said, it's important not to get too caught up with individual data points or a tiny section in a really big dataset. We saw this in the recent recovery graph. Like some pointed out, if we took a step back and looked at a larger time frame, the Obama/Bush contrast doesn't look so shocking.

No Agendas

This should go without saying, but approach data as objectively as possible. I'm not saying you shouldn't have a hunch about what you're looking for, but don't let your preconceived ideas influence the results. Because if you go to great lengths looking for some specific pattern, you're probably going to find it. It'll just be at the sacrifice of accurate results.

Look Outside the Data

Context, context, context. Sometimes this will come in the form of metadata. Other times it'll come from more data.

The more you know about how the data was collected, where it came from, when it happened, and what was going on at the time, the more informative your results and the more confident you can be about your findings.

Ask Why

Finally, and this is the most important thing I've learned, always ask why. When you see a blip in a graph, you should wonder why it's there. If you find some correlation, you should think about whether or not it makes any sense. If it does make sense, then cool, but if not, dig deeper. Numbers are great, but you have to remember that when humans are involved, errors are always a possibility.

redsmetz
03-11-2010, 11:07 AM
You know, statistically-derived knowledge is one thing, but it is not the only thing, and I think the folks who have a deep interest in statistically-derived knowledge tend to overlook that there are different paths to knowing important things about baseball players. And not only are there different paths, there are shortcuts...


Consider these quotes from Malcolm Gladwell's book Blink:

There can be as much value in the blink of an eye as in months of rational analysis.

If we are to learn to improve the quality of the decisions we make, we need to accept the mysterious nature of our snap judgments.

Truly successful decision making relies on a balance between deliberate and instinctive thinking.

You can learn as much - or more - from one glance at a private space as you can from hours of exposure to a public face.

In my view, good scouts can arrive at accurate judgments about players faster than statisticians can collect enough data about those players that they feel their judgments can be trusted. This is especially important in amateur scouting, of course, but also true in the minor leagues and major leagues. I am not saying that good scouts are infallible, only that they have the ability to make accurate judgments more often than not -- and the speed with which they can do so makes their methodology superior in an important way to a stats-based approach. Now, statistical advances have been helping scouts broaden their approach, but I suspect that for the best scouts it's not by as wide a margin as some would like to believe.

So, stating things generally, organizations that trust their scouts -- that trust what those guys see in what seems like a blink of an eye, compared to the "exposure time" stats guys are using -- are going to move more quickly and be more flexible than organizations that always wait until the samples are statistically significant.

You raise several good points. I think it's critical to remember that statistics, no matter how advanced they are, cannot be divorced from the flesh-and-blood play of the games. They are never determinative of what will actually take place and can never be. These are not robots, and because of that, humans playing the game will at times exceed their capabilities or fail to do so. And there are events that impact those capabilities: the weather, illness, injury, and sometimes just sheer will.

I don't want to dismiss advanced statistical analysis. These are great tools, but they cannot completely replace the simple fact that people have been figuring this game out for a long time without such tools. Does that diminish their ability to help make decisions? Of course not, but as you aptly note, scouts and managers and other players can all grasp what needs to be done to put forward a victorious effort.

lollipopcurve
03-11-2010, 11:07 AM
The ideal is a blend because it should be expected to produce the largest amount of the highest quality data...

Is it really the statheads who are bristling at that argument?

Yeah, a blend is good. We know that.

I don't know who's bristling, but I didn't know we were in a contest to see who's doing it less.

My point, though, is that an important advantage of trusting your scouts is that you can act more quickly. This is a point that's rarely been made, or conceded, so far as I know. Trade markets for individual players develop quickly, within larger markets that come around twice a year. The time to act is not always the time at which the performance data on those players is conclusive. For example, in the case of defensive metrics that require 3 years before you know anything conclusive about a player's "true" abilities, would you not be putting your organization at a competitive disadvantage by insisting that no player be acquired unless he is deemed acceptable per that metric? Now, I have no idea if organizations act that way, but you get the idea.

Certainly, the stat-centric approach can make certain claims to being "right" -- less speculative -- more so than an eyes based approach. However, the eyes-based approach -- or eyes-based decisions -- can claim some advantages as well. And I'm just pointing out one of them.

jojo
03-11-2010, 11:52 AM
Yeah, a blend is good. We know that.

Then I guess this is finally a solved issue on redszone.

RedsManRick
03-11-2010, 12:23 PM
Yeah, a blend is good. We know that.

I don't know who's bristling, but I didn't know we were in a contest to see who's doing it less.

My point, though, is that an important advantage of trusting your scouts is that you can act more quickly. This is a point that's rarely been made, or conceded, so far as I know. Trade markets for individual players develop quickly, within larger markets that come around twice a year. The time to act is not always the time at which the performance data on those players is conclusive. For example, in the case of defensive metrics that require 3 years before you know anything conclusive about a player's "true" abilities, would you not be putting your organization at a competitive disadvantage by insisting that no player be acquired unless he is deemed acceptable per that metric? Now, I have no idea if organizations act that way, but you get the idea.

Certainly, the stat-centric approach can make certain claims to being "right" -- less speculative -- more so than an eyes based approach. However, the eyes-based approach -- or eyes-based decisions -- can claim some advantages as well. And I'm just pointing out one of them.

It's one thing to knowingly make a decision based on limited information due to the necessity of the circumstances. It's another to make a decision with the incorrect belief that you know as much as can be reasonably known because you've discounted good information.

Yes, there's always a point at which you have to act on the available information. But I think the primary point being made here is that there should be a recognition about the quality of the information you have when you make the decision.

The question becomes, what's the balance between speed and accuracy? I would argue that there's usually more to lose by making a rash, uninformed decision than there is by missing a window of opportunity while waiting on more information. There are very few, if any, unique opportunities in the sport of baseball where the risk calculus shifts.

When you look at the organizations who fail repeatedly, it's not for a lack of effort. It's for a lack of well informed decisions. It's hard to make the argument that quick decisions are crucial when you've suffered through a decade or more of losing by making them. Slow and steady (often) wins the race.

IslandRed
03-11-2010, 01:18 PM
Over at Hardball Times, Joshua Fisher posted a great article about sabermetrics and the casual fan. In short, he makes the case that sabermetricians have failed to communicate the basic fundamentals of why the sabermetric approach is the right one when it comes to quantitative analysis (and make no mistake, citing a pitcher's wins or a batter's batting average is quantitative analysis).

Going back to the original post, I think Fisher's missing an obvious point -- these are casual fans we're talking about. They don't study baseball, they watch it. Most of them haven't rejected sabermetrics so much as they're unaware of it; but mention the words "quantitative analysis" and you've already lost them. The casual Reds fan watches the Reds, listens to the Reds, is happy when they win, sad when they lose. They go to the ballpark and have a good time. They grumble when a Red strikes out with the bases loaded. They grumble when a Red kicks a ground ball. They boo when Dusty leaves a guy in to give up a three-run homer. They'll have some "how about those Reds" and "such-and-so is a bum" and "that kid looks great" discussions with their buddies, but no serious arguments, they're just talkin' baseball. They don't play fantasy baseball. They don't spend their time secretly plotting how they'd run the team if given the chance.

In that context, sabermetrics really has nothing for those people that's going to enhance their fandom enough to make it worth the trouble to learn about it. It's not a communication problem, it's a relevance problem. Or, put another way, "the truth" has less practical application than you'd think.

And I'm not sure why the sabermetric community even gives a tinker's flip if the casual fan buys into what they know. Advanced stats, the balance between stats versus scouts, all that stuff we talk about here all the time -- yeah, it's important that the people running the ballclub know about it, and it's fun for serious fans like us to argue about it, but it's not important that the guy sitting next to me at the ballpark knows or cares. His money spends the same as mine.

lollipopcurve
03-11-2010, 01:28 PM
I would argue that there's usually more to lose by making a rash, uninformed decision than there is by missing a window of opportunity while waiting on more information.

Disagree. I don't think I'm advocating that quick decisions are always best -- only that circumstances in baseball do often invite, and demand, that decisions be made by relying on informed eye-based information at the expense of having the sample sizes your analytics department would trust. And that's not necessarily a recipe for bad decision-making. As noted, amateur scouting is one such circumstance -- you want sample sizes, you're confined to college players, and no one these days would say that's a good place to be. Take Chapman. You go by his numbers only, you don't sign him in December 2009. Take trade markets. They come and go, and the players at play in those markets tend not to revisit them. If you don't have the prescribed sample size of data on those players, does that mean you sit out the market? Seems like a recipe for stasis to me.

RedsManRick
03-11-2010, 03:57 PM
Disagree. I don't think I'm advocating that quick decisions are always best -- only that circumstances in baseball do often invite, and demand, that decisions be made by relying on informed eye-based information at the expense of having the sample sizes your analytics department would trust.

On a game-by-game basis, I agree. A player's ability on a given day is not necessarily the same as his "true talent". But we should not substitute the scout's small-sample perspective for the knowledge we've accumulated about that player or situation in general. You don't pinch hit for a good hitter who's 0-for-4 with a pinch hitter who got a hit the night before but who is a mediocre hitter overall. In that case, the small sample is just noise.

If you can truly identify a change from expectation, by all means exploit it. But it can be really hard to tell the difference between the signal and the noise. It certainly strokes the ego to think you've seen the signal though... How do you go about telling the difference?
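One way to frame the difficulty: sampling noise on a rate stat shrinks only with the square root of the number of chances. A crude sketch (the normal approximation is shaky at four at-bats, which only strengthens the point):

```python
import math

def ba_ci95(hits, at_bats):
    """Rough 95% confidence interval for a hitter's 'true' average,
    given an observed hits/at-bats sample (normal approximation)."""
    p = hits / at_bats
    se = math.sqrt(p * (1 - p) / at_bats)
    return (max(0.0, p - 1.96 * se), min(1.0, p + 1.96 * se))

# The pinch hitter who went 1-for-4 last night: the sample is nearly useless
print(ba_ci95(1, 4))
# The same calculation over a full season's 600 at-bats is far tighter
print(ba_ci95(165, 600))
```

The one-night interval spans most of the plausible range of hitting talent, while the full-season interval is roughly 70 points of batting average wide. Deviations big enough to survive the wider test are candidates for signal; the rest is noise.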



And that's not necessarily a recipe for bad decision-making. As noted, amateur scouting is one such circumstance -- you want sample sizes, you're confined to college players, and no one these days would say that's a good place to be. Take Chapman. You go by his numbers only, you don't sign him in December 2009. Take trade markets. They come and go, and the players at play in those markets tend not to revisit them. If you don't have the prescribed sample size of data on those players, does that mean you sit out the market? Seems like a recipe for stasis to me.

Trade markets? Still applies in my book. I don't trade away a known commodity for an unknown one unless the rough expected value of that unknown one is significantly greater. There's a reason you don't see prospect for prospect trades very often.

Chapman is an exception that proves the rule. One of the differences here compared to other scenarios is that NOBODY had much information. The Reds were not at an informational disadvantage. So the question simply became about money, and $30M for a potential Cy Young winner who could provide multiples of that in surplus value was a bet worth making.

I don't deny that there may be some circumstances where the snap decision needs to be made. But I think for most of those decisions, it's an issue of adjusting from a baseline of what the math tells you based on your special knowledge of the specific situation. The trick is starting with a well-informed baseline, not just using your gut and old axioms that have since been debunked.

bucksfan2
03-11-2010, 04:13 PM
When you look at the organizations who fail repeatedly, it's not for a lack of effort. It's for a lack of well informed decisions. It's hard to make the argument that quick decisions are crucial when you've suffered through a decade or more of losing by making them. Slow and steady (often) wins the race.

I just can't buy into this. It has more to do with money than anything else. The Yankees made three snap-decision signings last year and ended up getting the top three FA pitchers on the market. The Red Sox acted quickly this offseason in signing the top FA pitcher on the market, and the Phillies acted quickly in trading for arguably the best pitcher in baseball.

westofyou
03-11-2010, 04:19 PM
I just can't buy into this. It has more to do with money than anything else. The Yankees made three snap-decision signings last year and ended up getting the top three FA pitchers on the market. The Red Sox acted quickly this offseason in signing the top FA pitcher on the market, and the Phillies acted quickly in trading for arguably the best pitcher in baseball.

The Cubs have been bleeding money for decades, what's their excuse?

Resources demand examination, and it helps if that's done with a semblance of a plan; the Cubs' struggles over the years stemmed mainly from the fact that they couldn't stick to one roadmap.

bucksfan2
03-11-2010, 04:21 PM
The Cubs have been bleeding money for decades, what's their excuse?

Stupid Billy Goat!!!!

RedsManRick
03-11-2010, 04:24 PM
I just can't buy into this. It has more to do with money than anything else. The Yankees made 3 snap decision signings last year and ended up getting the top 3 FA pitchers on the market. The Red Sox acted quickly this off season in signing the top FA pitcher on the market and the Phillies acted quickly trading for arguably the best pitcher in baseball.

Money is absolutely a factor, but only a significant one at the extremes. Making smart decisions with the money you have is without a doubt the biggest factor. You can fail spending $120M and you can win spending $50M.

I'm all for increased revenue sharing and/or some sort of a cap system. But that won't help the Royals or Astros. As for what Cashman and Epstein would do given less money, that's an open question. But at the end of the day, it's about finding a way with your current resources to put the best 25 men on the field. And when you have teams paying sub-replacement talent millions of dollars, it's clear there's room for improvement.

Ron Madden
03-12-2010, 02:44 AM
Money is absolutely a factor, but only a significant one at the extremes. Making smart decisions with the money you have is without a doubt the biggest factor. You can fail spending $120M and you can win spending $50M.

I'm all for increased revenue sharing and/or some sort of a cap system. But that won't help the Royals or Astros. As for what Cashman and Epstein would do given less money, that's an open question. But at the end of the day, it's about finding a way with your current resources to put the best 25 men on the field. And when you have teams paying sub-replacement talent millions of dollars, it's clear there's room for improvement.

Agreed. The question isn't how much money you spend but how you spend it.

bucksfan2
03-12-2010, 09:42 AM
Money is absolutely a factor, but only a significant one at the extremes. Making smart decisions with the money you have is without a doubt the biggest factor. You can fail spending $120M and you can win spending $50M.

I'm all for increased revenue sharing and/or some sort of a cap system. But that won't help the Royals or Astros. As for what Cashman and Epstein would do given less money, that's an open question. But at the end of the day, it's about finding a way with your current resources to put the best 25 men on the field. And when you have teams paying sub-replacement talent millions of dollars, it's clear there's room for improvement.

IMO money is the most important factor in baseball.

The Dodgers, who signed Juan Pierre to an awful contract, were able to keep him on as a 4th OF and then hand Manny a $20+ million contract.

The Angels, who signed Gary Matthews Jr. to an awful contract, were able to go out the next year and sign the best CF on the market, Torii Hunter, to a long-term deal.

The Red Sox were able to back-load their pitching rotation with the likes of Penny and Smoltz as insurance for $5M and $5.5M respectively.

Even the so called smart teams have some serious questions if you ask me.

The A's, considered the Moneyball team, can attribute a large part of their success to PEDs.

The Mariners, considered the next wave of Moneyball, were barely a .500 team with a $100M+ payroll.

The Rays, largely considered a success story, developed a team by drafting in the top 10 year after year.

IMO the Twins have been the only successful team that has built an organization without the help of a high payroll. Now they face a dilemma in Mauer and it will be very interesting to see how it pans out.

Don't get me wrong, there is no amount of money that could make up for a poorly run organization, but the smaller teams have less margin for error. You pointed out the Royals, but if money weren't an issue, an outfield of Dye, Beltran and Damon, all in their prime, would rival any OF in baseball.

jojo
03-21-2010, 11:18 PM
The real problem dogging the stats community at the moment is that many of the leading statheads are being gobbled up by major league FO’s or by proprietary sites like BP that make fans pay to play.

ERA is the average number of arbitrarily assigned “earned” runs a team gives up per 9 innings that a pitcher is on the mound. As such, ERA combines several poison pills. It includes a significant amount of information that has little to do with how the pitcher actually pitched, and it is predictive of nothing. BTW, the number of earned runs that a pitcher “gives up” over a given number of IP can swing dramatically from year to year based upon swings in BABIP, LOB% and HR/FB%, all things that a pitcher can’t control. One needs only to look at Bronson Arroyo to see how such swings in unrepeatable factors like BABIP (a pitcher has only marginal control over the fate of a batted ball) can confuse ERA or its equally flawed cousin, ERA+. But ERA isn’t going to tell you that directly; in fact it inappropriately treats these parameters as if they are part of the pitcher’s true skillset. In other words, it’s entirely unclear what ERA is telling you concerning the effectiveness of a pitcher.
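
Concretely, ERA is just earned runs scaled to nine innings. A minimal sketch (the run totals here are hypothetical) of how batted-ball luck alone can move it while the pitcher's own peripherals stay fixed:

```python
def era(earned_runs, innings_pitched):
    """Earned runs allowed per nine innings pitched."""
    return 9 * earned_runs / innings_pitched

# Two hypothetical 200-inning seasons with the same strikeouts, walks
# and home runs, but different BABIP luck on balls in play: the earned
# run totals diverge, and ERA diverges with them.
print(era(80, 200))   # lucky BABIP year  -> 3.6
print(era(100, 200))  # unlucky BABIP year -> 4.5
```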

ERA is like performing brain surgery with a chainsaw when it comes to telling us the contribution a pitcher made to run prevention. It’s puzzling why one would default to ERA when sabermetric scalpels are readily available. For instance, FIP is a fairly blunt tool that views pitcher performance through only the things he can control (i.e. his peripherals). Thus FIP represents an easily calculated glimpse at a pitcher’s true influence on run prevention that is much more informative than ERA. By discarding the superfluous and confounding information that ERA inappropriately embraces, FIP provides a straightforward tool for assessing performance. It has blind spots, but they are easily accounted for by also considering a pitcher’s HR/FB%, LOB%, BABIP, etc. FIP represents an easy-to-understand alternative to being stuck with the fatal limitations imposed by relying on a murky, single stat like ERA.
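
For reference, the standard published FIP formula uses only the pitcher's peripherals plus a league constant (assumed here to be about 3.10) that puts the result on an ERA-like scale; the stat line below is hypothetical:

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching: only home runs, walks, hit
    batsmen and strikeouts count; balls in play are ignored."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

# Hypothetical line: 20 HR, 50 BB, 5 HBP, 180 K over 200 innings
print(round(fip(20, 50, 5, 180, 200), 2))
```

With a constant near 3.10, a pitcher with those peripherals grades out in the mid-3.00s no matter what his defense did behind him.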

There will be those who point out that FIP doesn’t predict a pitcher’s future performance perfectly, and they will be right, because there is a lot of randomness that can affect outcomes. Again, we know its blind spots and have spent many years correcting them, and there are now several more detailed versions that account for luck metrics and batted ball tendencies that one can use if they so desire. One can even compare park- and league-adjusted FIP with an unadjusted FIP if they so choose. Soon pitch f/x and hit f/x data will allow these metrics to be fine-tuned even more.

The fact is that plain, simple old FIP allows one to unravel the complexities imposed by the realization that pitchers are rarely the ideal combination of missing bats, excellent control and worm killing. FIP is not only useful looking forward, because a pitcher’s peripherals capture his true skill level and estimating that skill level is the best way to project his likely future performance; it also captures past performance very well. That is something that is impossible with ERA, because ERA just throws its hands up in the air and attributes every single influence on run prevention to the pitcher. Thus ERA actually obscures the ability to discern how a particular pitcher might represent a composite compromise of the pitching trifecta, because it hopelessly confuses the true contribution of the pitcher.

If one cares about better understanding a pitcher’s true responsibility for the runs scored on his watch, then they will gravitate toward a metric that ignores the influence of things the pitcher cannot and should not be held accountable for….

M2
03-22-2010, 10:41 AM
jojo,

For the next season, would you rather have a pitcher with a 4.00 ERA and a 4.50 FIP or a pitcher with a 4.50 ERA and 4.00 FIP?

Simple question. One you'll surely dodge.

BTW, ERA doesn't purport to describe a pitcher's skillset and you know that. It merely tracks his run prevention performance over a slice of time. It tells you what happened. And if you want to count UEs you're free to do so. No one does because no one's quite sure how much they should be assigned to the pitcher and because they mostly come out in the wash.

So ERA's not like performing brain surgery with a chainsaw. It's like cutting down a tree with a chainsaw. It's a crude element that does a crude job. Other tools will be needed if you want to turn that tree into paper or furniture, but something had to chop down the tree in the first place.

nate
03-22-2010, 10:45 AM
Right on.

ERA is one of the most bizarre team stats when one thinks about its measure. As you said.


ERA is the average number of arbitrarily assigned “earned” runs a team gives up per 9 innings that a pitcher is on the mound.

Verily and so. I mean, if one ponders the number of plays in a typical baseball game and averages those ended by the pitcher (as a pitcher, not a fielder), the very BEST guys "do it themselves" what...8 - 9 times per game? That means there are, at the very least, 18 or so outs that must be made by the defense. But ERA tells us about a pitcher's skill...huh?
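
The back-of-the-envelope version, assuming 27 outs in a nine-inning game:

```python
OUTS_PER_GAME = 27  # nine innings, three outs apiece

for k_per_9 in (6, 9):  # an average and an elite strikeout rate
    defense_outs = OUTS_PER_GAME - k_per_9
    # Share of the game's outs the fielders have to record themselves
    print(k_per_9, defense_outs, round(defense_outs / OUTS_PER_GAME, 2))
```

Even an elite strikeout pitcher leaves roughly two-thirds of the outs to his fielders.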

What do you think of SIERA?

jojo
03-22-2010, 10:54 AM
jojo,

For the next season, would you rather have a pitcher with a 4.00 ERA and a 4.50 FIP or a pitcher with a 4.50 ERA and 4.00 FIP?

Simple question. One you'll surely dodge.

Assuming all other things being equal, if I could make the decision using a crystal ball (i.e. knowing how the season would play out), I'd go with the guy who would post the 4.00 ERA/4.50 FIP.

If I had to make the decision before the season (like in real life), I'd go with the guy who had the 4.00 FIP because he's likely the better pitcher and more likely to post the lower ERA going forward.


BTW, ERA doesn't purport to describe a pitcher's skillset and you know that. It merely tracks his run prevention performance over a slice of time. It tells you what happened.

It tells you what happened at the TEAM level and basically gives the pitcher all of the credit/responsibility and you know that....

M2
03-22-2010, 11:47 AM
Assuming all other things being equal, if I could make the decision using a crystal ball (i.e. knowing how the season would play out), I'd go with the guy who would post the 4.00 ERA/4.50 FIP.

If I had to make the decision before the season (like in real life), I'd go with the guy who had the 4.00 FIP because he's likely the better pitcher and more likely to post the lower ERA going forward.

And there you've tripped across why ERA has a use. It tells you what happened in the terms you care most about. If you want to put a little more context around it, there's ERA+.

So when someone says Bronson Arroyo has a 4.00 ERA and a 112 ERA+ in a Reds uniform, that's nothing more than a basic summation of how he did. It purports to be nothing else. No matter what anyone may have thought he would do, no matter how other indicators insist it should have worked out, that IS how it worked out on the field in the broadest, crudest terms. And a 112 ERA+ is pretty decent. Nolan Ryan was 111 for his career. That doesn't mean Bronson Arroyo and Nolan Ryan are equivalent talents. It just means that for four seasons (six if you want to count his previous two seasons in Boston), Bronson Arroyo has had a fairly tasty run.
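
For anyone unfamiliar with the stat, ERA+ is just the (park-adjusted) league ERA divided by the pitcher's ERA, times 100, so 100 is league average and higher is better. A sketch that skips the park adjustment and assumes a hypothetical league ERA of 4.48:

```python
def era_plus(pitcher_era, league_era):
    """ERA+ with the park adjustment omitted; 100 = league average."""
    return round(100 * league_era / pitcher_era)

print(era_plus(4.00, 4.48))  # -> 112, the range cited above
```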

To go back to your metaphor from two posts ago, this specifically isn't brain surgery. Something's got to be the big dumb number, and at the end of the day a low ERA is a desirable outcome. A low FIP is not a desirable outcome; it is an attempt to amalgamate a desirable set of traits.

Getting to your second sentence there, you might be right, though obviously we'd want a whole lot more data than one season to make an educated guess on who to keep going forward. No sane person would base a multi-year decision on a single season of any pitching stat. In fact, no sane person should base any multi-year decision on multiple years of a single pitching stat because there's no such thing as a stat that fully sums up a pitcher's "true talent." Nothing really even comes that close.


It tells you what happened at the TEAM level and basically gives the pitcher all of the credit/responsibility and you know that....

Sure, it tells us what happened at a macro level when that pitcher took the mound. The guy starts every play that takes place while he's on the mound, so of course we care about the macro level performance while he's out there. Teams are sending pitchers to the mound specifically to influence that macro level performance.

So start at the intersection between the macro and the pitcher's time on the mound and then work your way toward the more specific dependent on the detail needed to answer the question you're asking. It's not like there's one number out there to rule them all. Plus, I keep thinking to myself that people can emerge from these imaginary bunkers they keep digging for themselves and approach numbers in a sensible, non-emotional manner.

jojo
03-22-2010, 12:50 PM
And there you've tripped across why ERA has a use. It tells you what happened in the terms you care most about. If you want to put a little more context around it, there's ERA+.

So when someone says Bronson Arroyo has a 4.00 ERA and a 112 ERA+ in a Reds uniform, that's nothing more than a basic summation of how he did.

That's a basic summation of how the Reds did. It's a subtle but very significant difference. Aside from that, I agree with the following:


It purports to be nothing else. No matter what anyone may have thought he would do, no matter how other indicators insist it should have worked out, that IS how it worked out on the field in the broadest, crudest terms.

But that said, it's also a slippery slope when determining the credit that should be attributed to the pitcher. Often those who use ERA don't draw the distinction.


And a 112 ERA+ is pretty decent. Nolan Ryan was 111 for his career. That doesn't mean Bronson Arroyo and Nolan Ryan are equivalent talents. It just means that for four seasons (six if you want to count his previous two seasons in Boston), Bronson Arroyo has had a fairly tasty run.

Given the rather large swings in ERA he's had as a Red, that 112 ERA+ comes with an SD of +/- 20... In other words, it's not really different from 100. ERA+ isn't really shedding light on the issue IMHO.
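
The year-to-year spread argument is easy to illustrate. With hypothetical single-season ERA+ marks averaging 112 but swinging widely, the sample standard deviation easily exceeds 20:

```python
from statistics import mean, stdev

# Hypothetical single-season ERA+ values with large year-to-year swings
seasons = [97, 139, 90, 122]
print(mean(seasons))             # averages out to a healthy-looking 112
print(round(stdev(seasons), 1))  # -> 22.6, so 100 sits within one SD
```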


To go back to your metaphor from two posts ago, this specifically isn't brain surgery. Something's got to be the big dumb number and at the end of the day a low ERA is a desirable outcome. A low FIP is not a desirable outcome, it is an attempt to amalgamate a desirable set of traits.

Here's the main difference between the way we approach this issue, I think. To me the whole purpose of sabermetrics is to determine the true skill level of the player as accurately as possible. That is the desirable outcome because it provides the most useful information IMO.


Getting to your second sentence there, you might be right, though obviously we'd want a whole lot more data than one season to make an educated guess on who to keep going forward. No sane person would base a multi-year decision on a single season of any pitching stat. In fact, no sane person should base any multi-year decision on multiple years of a single pitching stat because there's no such thing as a stat that fully sums up a pitcher's "true talent." Nothing really even comes that close.

We have always agreed on these points. My position has always been that the approach to viewing a pitcher is through a prism of components such as his peripherals, "luck metrics," etc. The ability to assess peripherals is getting even more informative as we can also consider velocity and break, contact rates, average run values of the pitches in his arsenal, etc., to understand how a pitcher's stuff should translate. In other words, there are stats that are even beginning to allow slivers of the scout's eye to become quantitative. Given this prism, ERA adds very little IMO.


It's not like there's one number out there to rule them all. Plus, I keep thinking to myself that people can emerge from these imaginary bunkers they keep digging for themselves and approach numbers in a sensible, non-emotional manner.

I absolutely agree.