Stats 101: What's fact and what's fiction?



westofyou
04-07-2007, 10:58 AM
http://msn.foxsports.com/mlb/story/6641066



The best-selling book Moneyball opened the eyes of many fans to the statistical revolution in baseball.

This revolution actually predates Moneyball by several decades, but it's become more of a phenomenon of late, thanks in part to the popularity of Michael Lewis' book. It's engendered (silly and petty) hostilities between the stats crowd and the traditionalists, and it's led to a great deal of mutual misunderstanding.

Still, we're certainly not here to rehash old arguments. What we are here to do is provide an introduction to the statistical movement that's now an indelible part of the game. No, we're not trying to turn you into a "stat geek," but we are trying to show that the new generation of baseball statistics is nothing to be afraid of or put off by. They're just another way to enjoy this great game and enrich your understanding of it.

So to get the ball rolling on this series we call "Stats 101," we'll take a look at five principles vital to understanding the game through a statistical lens. Let's get started ...

1. Context is everything

If there's a single thing to understand, it's probably this. Context takes many forms — the park a player toils in, his league, his era, his spot in the lineup, and the quality of his opposition, to name only a few.

Most fans grasp that Coors Field benefits hitters and that Petco Park benefits pitchers, but knowing that really isn't enough. Shea Stadium, for instance, is much, much tougher on right-handed power hitters (like David Wright) than on left-handed power hitters (like Carlos Delgado). So it's not sufficient to say a park merely helps or hurts the offense or the pitcher. We need to know who it's helping or hurting and to what extent.

There's also the matter of era. A run scored in, say, the year 2000 meant less than a run scored in 1968 (a.k.a., "The Year of the Pitcher"). That's because runs in 2000 were much easier to come by. You had many more hitter-friendly parks in the league and you had a strike zone that squarely benefited the hitter. In 1968, the pitcher's mound was higher (giving the hurler a serious advantage), the strike zone went from the knees to the bottom of the shoulders, and there was no DH in the AL.

In other words, it was a time in which the pitcher worked with a measurable advantage. Heck, Carl Yastrzemski won the AL batting title that year with a .301 average. In 2000, Carlos Lee also hit .301, but that mark ranked only 22nd in the AL (Nomar Garciaparra claimed the batting title that year with a .372 average). So, particularly when comparing players from different eras, taking the run-scoring environment into account is essential.
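One rough way to put the two .301 seasons on the same scale is to express each as a percentage of the league average for that year. The sketch below is only an illustration: the function name is made up, and the league averages (.230 for the 1968 AL, .276 for the 2000 AL) are approximate figures that do not come from the article.

```python
def relative_average(player_avg: float, league_avg: float) -> float:
    """Return a player's average as a percentage of league average (100 = average)."""
    if league_avg <= 0:
        raise ValueError("league average must be positive")
    return 100.0 * player_avg / league_avg


# Approximate AL batting averages, assumed for illustration (not from the article).
AL_AVG_1968 = 0.230
AL_AVG_2000 = 0.276

yaz_1968 = relative_average(0.301, AL_AVG_1968)
lee_2000 = relative_average(0.301, AL_AVG_2000)

print(f"Yastrzemski, 1968: {yaz_1968:.0f}")
print(f"Lee, 2000:         {lee_2000:.0f}")
```

Under those assumed league averages, the same .301 sits roughly 31 percent above the league in 1968 and roughly 9 percent above it in 2000, which is the article's point about era.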

In this, the day of the unbalanced schedule, it's also important to take strength of opposition into consideration. For instance, those denizens of the AL Central this season will be facing much hardier opponents than, say, those in the NL Central. With the unbalanced schedule, in which teams play most of their games against intra-divisional opponents, you have some serious differences in terms of strength of schedule.

As for the differences in league, they have serious bearing on the minor leagues. The High-A California League, for instance, is a great circuit for hitters, while the High-A Carolina League is fairly hostile toward the offense. So it's especially important not to take minor league numbers at face value. What kind of league did he play in? What kind of park did he play in? Was he older or younger than his peer group? These are all vital pieces of information for assessing a prospect.

2. Beware the small sample size

Here's an intuitive one that most fans understand. Every fan knows that you don't take a week's worth of games and use it to make grand and sweeping pronouncements. Through one game, for instance, Adam Dunn is on pace to hit 324 home runs this season. That's not going to happen because, as we all know, time has a way of bringing outrageous numbers and paces to heel.
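The "on pace" arithmetic is simple enough to write down. This is a minimal sketch; the function name is made up, and the two home runs are inferred from the article's figure (324 over a 162-game season) rather than stated in it.

```python
SEASON_GAMES = 162  # length of an MLB regular season


def on_pace_for(count: int, games_played: int, season_games: int = SEASON_GAMES) -> float:
    """Extrapolate a counting stat linearly over a full season."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return count / games_played * season_games


# Two home runs in one game extrapolates to the article's 324.
dunn_pace = on_pace_for(2, 1)
print(dunn_pace)
```

The extrapolation assumes the rate never changes, which is exactly why it is useless after one game.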

Sometimes, however, we don't take this principle far enough. It's not uncommon for players to put up fluke-ish numbers over the course of an entire season. Take a look at Norm Cash back in 1961. Or Brady Anderson in 1996. Or Rich Aurilia in 2001. Or Adrian Beltre in 2004. There are plenty of examples of players who drastically out-performed expectations and then never came close to doing so again. That's because they had fluke seasons.

As well, when looking at, for example, how a hitter fares against left-handed pitching, even multiple seasons of data may not be enough to allow us to draw firm conclusions. A good rule of thumb is that if a hitter or pitcher has been exhibiting a statistical trend for at least three seasons, then you're probably looking at a genuine, repeatable skill. Otherwise, be skeptical.
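To get a feel for how noisy small samples are, one can treat each at-bat as an independent trial with a fixed chance of a hit. That is a simplifying assumption (real at-bats are not independent or identically distributed), and the function name below is made up, but it gives a rough sense of scale.

```python
import math


def batting_average_standard_error(true_avg: float, at_bats: int) -> float:
    """Standard error of an observed average under a simple binomial model."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return math.sqrt(true_avg * (1.0 - true_avg) / at_bats)


# A true .300 hitter over a week (about 25 AB) versus a full season (about 500 AB).
week_se = batting_average_standard_error(0.300, 25)
season_se = batting_average_standard_error(0.300, 500)

print(f"25 AB:  +/- {week_se:.3f}")
print(f"500 AB: +/- {season_se:.3f}")
```

Under this model a .300 hitter can easily look like a .210 or .390 hitter over a week, and even a full season leaves about 20 points of noise in either direction.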

3. Beware of (some) traditional stats

We'll get into this in further depth in the coming weeks, but it's important to recognize that many of the stats you're accustomed to seeing on television broadcasts and on the backs of Topps cards aren't especially meaningful. For hitters, batting average, RBI and runs scored, for instance, aren't particularly illuminating. For pitchers, ERA and (especially) wins and losses are all highly flawed. On defense, the same goes for errors and fielding percentage. As mentioned, we'll tackle these specific problems in later editions, and we'll also point out some alternatives to traditional stats.

4. Use your resources

The Internet age is a great time to be a baseball fan, and it's a great time to delve into the ever-evolving world of statistical analysis. There's a tremendous amount of resources out there, and many of them are free.

The FOXSports.com stats page is a great place to start. If you're looking for more detailed historical information or splits (i.e., numbers versus lefties or righties, home-road stats, first half/second half splits), then Baseball-Reference.com is hard to beat. Minor league numbers? Pay a visit to TheBaseballCube.com or MinorLeagueSplits.com. Looking for more advanced metrics? Pay a visit to the stats pages at Baseball Prospectus (subscription only) and The Hardball Times. Want customized searches and queries? Try the MLB stats search engine at Enth.com. Want more sortable goodness? Give Baseball Direct a try.

5. Know that numbers aren't everything

Stats are nothing to be feared or scorned, but they certainly don't provide a complete picture of the game. Scouting information is also indispensable. Is a minor-league hurler dominating despite the lack of a reliable third pitch? We need scouts and the powers of observation for that. Is a hitting prospect putting up power numbers with an uppercut swing that will be exploited at the highest level? We need scouts and the powers of observation for that, too. Ditto for evaluating a player's defensive skills and a pitcher's mechanics.

Stats and the sensible analysis of them enrich the game, but they certainly don't make the game. It's important to be mindful of that always.

Next time out, we'll take a look at what's wrong with some of the offensive stats to which you've become accustomed.

Dayn Perry is a frequent contributor to FOXSports.com and author of the new book, "Winners: How Good Baseball Teams Become Great Ones" (Available now at Amazon.com).

Always Red
04-07-2007, 12:18 PM
Stats and the sensible analysis of them enrich the game, but they certainly don't make the game. It's important to be mindful of that always.


Very interesting article. Understanding the "modern" stats has certainly increased my enjoyment of the game in many ways. But the beauty and the pace of the game just cannot be captured by any statistic, and never will be quantified. That part of the game is something many of us here enjoy very, very much!

I try to keep something very simple in mind when looking at team statistics, especially. Runs scored and runs given up really do tell you more about team play than anything else, as far as wins and losses go. It sounds simplistic, but the best offense is the one that scores the most runs in the end (at the end of the season), no matter how they did it. Likewise, the best "defense" (which includes pitching and glovework) is the one that gives up the fewest.

I know- "duh!" But when I get in over my head looking at all the different stats available (and they are legion!), sometimes I need to retreat and just keep it really simple. :laugh:

Johnny Footstool
04-07-2007, 06:38 PM
Baseball is like magic. You can enjoy it by trying to figure out how each trick works, and you can enjoy it simply for the performance.

flyer85
04-07-2007, 06:57 PM
Baseball is like magic. "see dee ball, hit dee ball". :D

Redsland
04-07-2007, 07:09 PM
Baseball is like magic.
And Dick Pole is the wand.

Baseball is like magic.
And Bronson Arroyo is the beautiful assistant.

Baseball is like magic.
And our uniforms finally have sleeves for nothing to be up.

:)

RedsManRick
04-07-2007, 07:11 PM
Not a bad primer. The one point I always make to my non-stat friends is this:

Stats are just a measurement of something that happened. The key is understanding which stats tell us about what is likely to happen in the future, and which tell us only about the past. Some stats only tell us what a guy did. Others can give us some real insight into what a guy is likely to do in the future. The difference is often hard to discern.

For example:
FACT: Freddy Sanchez had the best batting average in the National League in 2006.
FICTION: Freddy Sanchez is quite likely to be among the league leaders in batting in 2007.

In reality, the things which most accurately predict future batting average suggest Freddy Sanchez will hit around .300 in 2007. Good, but a big step down from his .344 2006 BA.

The problem with the "traditional" stats is that, by and large, they are very good at recording some very specific events which occurred, and not so good at predicting what is likely to happen in the future or providing a full and balanced valuation of a player's contributions to his team's success (or failure).

If we could just get everybody to agree on this simple idea, we'd be making some real progress.

Redsland
04-07-2007, 07:26 PM
The first big lesson for me was the first one listed above: context. Making sure comparisons were apples to apples. 25-year-olds to 25-year-olds. Catchers to catchers.

Fail to do that, and you're going to make a lot of faulty assumptions.

dabvu2498
04-07-2007, 07:27 PM
And Dick Pole is the wand.

And Bronson Arroyo is the beautiful assistant.

And our uniforms finally have sleeves for nothing to be up.

:)

Best ever.

redsmetz
04-08-2007, 07:49 PM
Not a bad primer. The one point I always make to my non-stat friends is this:

Stats are just a measurement of something that happened. The key is understanding which stats tell us about what is likely to happen in the future, and which tell us only about the past. Some stats only tell us what a guy did. Others can give us some real insight into what a guy is likely to do in the future. The difference is often hard to discern.

For example:
FACT: Freddy Sanchez had the best batting average in the National League in 2006.
FICTION: Freddy Sanchez is quite likely to be among the league leaders in batting in 2007.

In reality, the things which most accurately predict future batting average suggest Freddy Sanchez will hit around .300 in 2007. Good, but a big step down from his .344 2006 BA.

The problem with the "traditional" stats is that, by and large, they are very good at recording some very specific events which occurred, and not so good at predicting what is likely to happen in the future or providing a full and balanced valuation of a player's contributions to his team's success (or failure).

If we could just get everybody to agree on this simple idea, we'd be making some real progress.

Not meaning to pick a fight, but why does it correlate that Sanchez will only hit .300 this year? It may well be correct, but I don't understand why it follows that he will fall off.

RedsManRick
04-08-2007, 08:25 PM
Not meaning to pick a fight, but why does it correlate that Sanchez will only hit .300 this year? It may well be correct, but I don't understand why it follows that he will fall off.

Because Sanchez hit .344 due to a ridiculously high BABIP. The types of balls he puts into play do not usually turn into hits at the rate at which they did for him last year. That is, the things a player does which are predictive of future batting average suggest that he is not going to hit .344 again. Could they be wrong? Of course. And Adrian Beltre could hit 45 homers.

Basically, there are certain things which are not very predictive of themselves. For example, a pitcher's K/9 ratio is a better predictor of the next season's ERA than is his past season's ERA. ERA is not a good predictor of ERA. It's counterintuitive, but it's true. Again, it doesn't mean he can't or won't hit for that average. But he's not likely to.

mbgrayson
04-09-2007, 09:59 AM
Good points.... Sanchez hit .370 on balls hit in play. The major league average is about .300. I agree that he will almost certainly regress.

Sanchez's detailed stats are HERE... (http://www.fangraphs.com/statss.aspx?playerid=1624&position=2B)

Steve4192
04-10-2007, 11:06 AM
It sounds simplistic, but the best offense is the one who scores the most runs in the end (at the end of the season), no matter how they did it. Likewise, the best "defense" (which includes pitching and glovework) is the one who gives up the least.

For the most part, that is true.

However, to REALLY be accurate, you would have to include some measure of run distribution, variability, and environment. The truly great offenses are the ones that put up big numbers every night, not the ones that mix in a few 20 run outbursts to offset a couple of nights of goose eggs.
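To see what "distribution" means here, compare two made-up offenses that score the same total runs over six games. The game logs below are invented purely for illustration.

```python
import statistics

# Hypothetical six-game logs; both lineups score 30 runs in total.
steady = [5, 5, 5, 5, 5, 5]
streaky = [20, 0, 0, 9, 1, 0]

for name, runs in (("steady", steady), ("streaky", streaky)):
    print(
        f"{name:8s} total={sum(runs):3d} "
        f"mean={statistics.mean(runs):.1f} "
        f"spread={statistics.pstdev(runs):.2f}"
    )
```

Season totals and per-game averages cannot tell these two apart; the spread (population standard deviation) can.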

edabbs44
04-11-2007, 08:06 AM
Good points.... Sanchez hit .370 on balls hit in play. The major league average is about .300. I agree that he will almost certainly regress.

Sanchez's detailed stats are HERE... (http://www.fangraphs.com/statss.aspx?playerid=1624&position=2B)

I find it funny that, when you look at those stat projections from the "experts", they all predict a significantly higher than average BABIP for Sanchez in 2007.

Bill James: .341
CHONE: .326
Marcel: .338
ZiPS: .329

How can that be explained? How do you predict luck, if it is truly only luck?

jojo
04-11-2007, 08:42 AM
I find it funny that, when you look at those stat projections from the "experts", they all predict a significantly higher than average BABIP for Sanchez in 2007.

Bill James: .341
CHONE: .326
Marcel: .338
ZiPS: .329

How can that be explained? How do you predict luck, if it is truly only luck?

Well, it's not predicting luck in this context.

He is projected to have a significantly higher than average BABIP because he is also projected to have a significantly higher than average BA.

BABIP as a metric is determined by a defined formula. The projection systems predict Freddy's number of hits, HR, AB and K's... the projected BABIP is then calculated from those projected totals. So basically if he performs to James' projection of 179 hits, 6 HR, 562 AB and 49 K's, then Freddy's BABIP should be .341.
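For anyone who wants to check that arithmetic, here is a small sketch using the commonly published BABIP formula, (H - HR) / (AB - K - HR + SF). The function name is made up, and sacrifice flies are assumed to be zero because the projection quoted above does not list them.

```python
def babip(hits: int, home_runs: int, at_bats: int, strikeouts: int, sac_flies: int = 0) -> float:
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# The Bill James projection quoted above: 179 H, 6 HR, 562 AB, 49 K (SF assumed 0).
projected = babip(hits=179, home_runs=6, at_bats=562, strikeouts=49)
print(f"{projected:.3f}")
```

That works out to 173 hits on 507 balls in play, which matches the .341 figure.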

BABIP in this instance is really just a baseline value so you have to be careful how you interpret it. As with all stats, context is important to consider. It's probably less informative to compare a hitter's BABIP to the league norm than to compare it to his own norm so comparing a projected BABIP to the league really isn't meaningful to me. In the case of a guy like Matthews Jr, it's reasonable to look at his career BABIP and conclude that his spike in performance in '06 was more a matter of being lucky than a product of his skill set.

RedsManRick
04-11-2007, 08:48 AM
Not sure where you got those from Edabbs44, but here are those projections from fangraphs.

2007 Bill James .319
2007 CHONE .299
2007 Marcel .313
2007 ZiPS .306

edabbs44
04-11-2007, 09:03 AM
Not sure where you got those from Edabbs44, but here are those projections from fangraphs.

2007 Bill James .319
2007 CHONE .299
2007 Marcel .313
2007 ZiPS .306

That's actual BA prediction...I was stating BABIP prediction.

2001MUgrad
04-11-2007, 09:07 AM
Basically, there are certain things which are not very predictive of themselves. For example, a pitcher's K/9 ratio is a better predictor of the next season's ERA than is his past season's ERA. ERA is not a good predictor of ERA. It's counterintuitive, but it's true. Again, it doesn't mean he can't or won't hit for that average. But he's not likely to.

Help me understand how you came to this conclusion.

I was just looking at 3 Reds Pitchers ERA's from both last season and their career as a whole.

I was looking at Harang, Lohse, and Arroyo. All 3 of which have a decent sample size to look at I would imagine.

Both Lohse and Harang's ERA's last year are very much in line with their career averages. Arroyo's was quite a bit lower than his career average and maybe part of that had to do with pitching 240 innings last year when only 1 time in his career had he made it to 200 before. Some of it could be because most NL hitters hadn't seen him before. There could be any number of reasons.

What I would take from those 3 guys ERA's and predict, guess, whatever. I would put Arroyo's ERA this year somewhere between 3.90 and 4.10. I would put Lohse's ERA somewhere in the 4.50 range. I would guess Harang would be somewhere around 3.80 or 3.90.

Another factor you need to consider is that a player improves or declines from season to season.

Indulge me a little bit. What would their k/9 tell you about those 3 pitchers ERA that past ERA doesn't??

Not starting a fight, purely interested in the differences.

RedsManRick
04-11-2007, 10:14 AM
Help me understand how you came to this conclusion.

I was just looking at 3 Reds Pitchers ERA's

There's the first problem. In any sort of stats analysis, a sample of 3 is grossly insufficient.

Secondly, it's about year to year variation. ERA itself moves around quite a bit and is influenced by luck. K/9 is fairly constant, at least in comparison.

If I have some time, I'll try to dig up an article that gives a full explanation.

jojo
04-11-2007, 10:14 AM
Help me understand how you came to this conclusion.

I was just looking at 3 Reds Pitchers ERA's from both last season and their career as a whole.

I was looking at Harang, Lohse, and Arroyo. All 3 of which have a decent sample size to look at I would imagine.

Both Lohse and Harang's ERA's last year are very much in line with their career averages. Arroyo's was quite a bit lower than his career average and maybe part of that had to do with pitching 240 innings last year when only 1 time in his career had he made it to 200 before. Some of it could be because most NL hitters hadn't seen him before. There could be any number of reasons.

What I would take from those 3 guys ERA's and predict, guess, whatever. I would put Arroyo's ERA this year somewhere between 3.90 and 4.10. I would put Lohse's ERA somewhere in the 4.50 range. I would guess Harang would be somewhere around 3.80 or 3.90.

Another factor you need to consider is that a player improves or declines from season to season.

Indulge me a little bit. What would their k/9 tell you about those 3 pitchers ERA that past ERA doesn't??

Not starting a fight, purely interested in the differences.

Intuitively, the point is that K/9 is a measure of a skill specific to the pitcher and ERA depends upon a host of things that are out of the pitcher's control. This intuition is borne out when doing an analysis of ERA, K/9, etc. of all pitchers. A pitcher's previous K/9 rate correlates more strongly with his next season's ERA than his previous ERA does. Therefore K/9 has more predictive power than ERA... It is for this reason that FIP or xFIP are far superior metrics for evaluating pitchers than ERA...
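If you want to run that comparison yourself, the check is just two correlations against next season's ERA. The sketch below shows the mechanics only: the function is a plain Pearson correlation, and the five "pitchers" are placeholder numbers invented so the script runs. They are not real data and prove nothing either way; a real test needs every pitcher over many seasons.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values, NOT real pitchers; swap in real season pairs to test the claim.
k9_year1 = [9.1, 6.2, 7.5, 5.0, 8.3]
era_year1 = [3.9, 4.1, 3.2, 4.8, 4.4]
era_year2 = [3.3, 4.6, 4.0, 5.1, 3.6]

# K/9 should correlate negatively with ERA, so compare absolute values.
r_k9 = abs(pearson_r(k9_year1, era_year2))
r_era = abs(pearson_r(era_year1, era_year2))
print(f"|r| K/9 -> next ERA: {r_k9:.2f}")
print(f"|r| ERA -> next ERA: {r_era:.2f}")
```

Whichever stat shows the larger absolute correlation on a proper sample is the better single predictor of next season's ERA.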

2001MUgrad
04-11-2007, 10:31 AM
Intuitively, the point is that K/9 is a measure of a skill specific to the pitcher and ERA depends upon a host of things that are out of the pitcher's control. This intuition is borne out when doing an analysis of ERA, K/9, etc. of all pitchers. A pitcher's previous K/9 rate correlates more strongly with his next season's ERA than his previous ERA does. Therefore K/9 has more predictive power than ERA... It is for this reason that FIP or xFIP are far superior metrics for evaluating pitchers than ERA...

With a stat such as FIP or xFIP aren't you basically taking 5 or 6 different stats already used and combining them into 1 means of measurement?? Same as with OPS, you are taking stats used throughout the history of baseball and combining into 1 statistical unit of measure.

jojo
04-11-2007, 11:46 AM
With a stat such as FIP or xFIP aren't you basically taking 5 or 6 different stats already used and combining them into 1 means of measurement?? Same as with OPS, you are taking stats used throughout the history of baseball and combining into 1 statistical unit of measure.

Yes, FIP summarizes several counting stats. I'm not sure why that would be undesirable though. ERA basically just tells you how many runs were scored while a pitcher was on the mound; there are a lot of reasons why those runs may have scored that are completely unrelated to how effective the pitcher actually was. Here's (http://www.redszone.com/forums/showpost.php?p=1229305&postcount=21) a post that illustrates how misleading ERA can be (this is especially true for relievers for obvious reasons).

FIP basically evaluates a pitcher based upon the counting stats that are only influenced by himself (i.e. walk, strikeout, and home run rates). It has its flaws, namely that it assumes a pitcher can control whether a flyball is a homerun or not and it assumes a constant LOB% (neither of which is true, as pitchers who deviate will regress back to the mean, except of course in the case where a pitcher is toast; hence these are sometimes considered luck factors). xFIP basically corrects FIP for the HR rate. The great thing about FIP is that since it is based upon things that are related to the pitcher's actual skillset, it's a much better predictor of future performance than a blunt metric like ERA, which is influenced by luck, the defense behind him, etc.
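For reference, here is a sketch of FIP in its commonly published form: (13*HR + 3*(BB + HBP) - 2*K) / IP plus a constant that puts it on the ERA scale. The constant changes from season to season (usually somewhere around 3.10 to 3.20), so the default below is an assumption, and the pitcher line is hypothetical.

```python
def fip(home_runs: int, walks: int, hit_by_pitch: int, strikeouts: int,
        innings_pitched: float, constant: float = 3.10) -> float:
    """Fielding Independent Pitching, commonly published form.

    `constant` rescales the result to look like ERA; 3.10 is an assumed
    placeholder, since the real value is recomputed for each season.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    raw = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return raw / innings_pitched + constant


# Hypothetical starter: 20 HR, 50 BB, 5 HBP, 200 K in 200 innings.
example = fip(home_runs=20, walks=50, hit_by_pitch=5, strikeouts=200, innings_pitched=200.0)
print(f"{example:.2f}")
```

Note that nothing in the formula depends on hits allowed on balls in play, which is the whole idea: the defense and the bounces are left out.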

If you look at a pitcher's peripherals (K%, BB%, K/BB, GB%, HR/FB%, LOB%, and BABIP) in conjunction with a superior metric like FIP you really can get a very complete picture of why a pitcher performed the way he did and a really good feel for what you should expect he'll do in the future.

RedsManRick
04-11-2007, 02:38 PM
Good points JoJo. As I mentioned earlier, differentiating between those things which merely record what happened and those things which predict what is likely to happen in the future is complicated and often counterintuitive. However, it's crucial when discussing statistics to always make sure we're making logical inferences.

That particular discussion you linked is timely given the numbers White put up in Philly after we ditched him and that he's now a setup man in Houston. Particularly with relievers, given a small sample of appearances, using ERA as the measurement can lead to some pretty poor conclusions. Not that Rick White is a great pitcher. But rather that he's not much different from David Weathers -- at least as far as the numbers go.

westofyou
04-17-2007, 10:39 PM
http://msn.foxsports.com/mlb/story/6673752

Why some stats are overrated

Dayn Perry



Last time out, we took a look at some of the basics of statistical analysis as it applies to baseball. This time, the focus will be a couple of the traditional offensive statistics and what's wrong with them.

When we say "traditional offensive statistics," we're talking about two of the most familiar and widely used measures: batting average and RBI. In almost every quarter — be it game telecasts, sports radio pow-wows or barroom debates — these two stats are invoked regularly to prove or disprove the worth of hitters. That's not such a good idea. Batting Average and (especially) RBI don't adequately assess whether a hitter is doing his job, at least by themselves. Let's explore why that's the case ...

Batting Average (AVG) is one of the hoariest of offensive metrics. It's useful, and it's certainly worth considering when evaluating a player. By itself, however, AVG isn't usually good enough. Why is this so? Think about what AVG tells you: how often a hitter reaches via a hit. It doesn't tell you what kind of hit (single? double? triple? home run?), and it doesn't tell you how often a hitter reaches base by other means (e.g., a walk, a hit-by-pitch). Those are necessary pieces of information. If a hitter has an AVG of .325, then you can be pretty sure he's doing his job. But what about a .275 hitter? If he has little raw power and doesn't take walks, then he's probably a stiff. Of course, a .275 hitter can also be an MVP candidate. Carlos Beltran, for instance, hit .275 last season, but he also drew 95 walks and mashed 80 extra-base hits.

As well, AVG is what we call a rate stat. That is, it's a percentage measure, and, like all rate stats, it has no built-in indicator of playing time. If a hitter is batting .333, he might be 1-for-3 on Opening Day, or he might be 147-for-442, as Brian McCann was in 2006. So when looking at AVG or any other rate stat, it's wise to have some idea of how many plate appearances are involved.

In summary, AVG is good to know but generally sub-optimal when not in the company of On-Base Percentage (OBP), which takes into account walks and HBPs, and Slugging Percentage (SLG), which takes extra-base hits into account, and some indication of playing time. Many times in this space you'll see hitters' "triple-slash" stats. This is their AVG/OBP/SLG (presented in that slash format), and they're much more illuminating than just AVG.
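The three pieces of the triple-slash line can be computed from an ordinary stat line using the standard definitions: AVG = H/AB, OBP = (H + BB + HBP) / (AB + BB + HBP + SF), SLG = total bases / AB. This is a minimal sketch; the function name and the sample hitter are hypothetical.

```python
def triple_slash(ab: int, h: int, doubles: int, triples: int, hr: int,
                 bb: int, hbp: int, sf: int) -> tuple:
    """Return (AVG, OBP, SLG) from basic counting stats."""
    if ab <= 0:
        raise ValueError("at-bats must be positive")
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = h / ab
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)
    slg = total_bases / ab
    return avg, obp, slg


# Hypothetical hitter: 150 hits in 500 AB with 30 2B, 5 3B, 20 HR, 50 BB, 5 HBP, 5 SF.
avg, obp, slg = triple_slash(ab=500, h=150, doubles=30, triples=5, hr=20, bb=50, hbp=5, sf=5)
line = f"{avg:.3f}/{obp:.3f}/{slg:.3f}"
print(line)

# Brian McCann's 147-for-442 from the article, AVG only.
mccann_avg = 147 / 442
print(f"{mccann_avg:.3f}")
```

The sample size travels with the inputs (500 AB here, 442 for McCann), which is the playing-time context the rate stats themselves do not carry.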

If there's any offensive stat that should be uniformly ignored, then it's our old friend RBI. This deeply flawed measure has led to innumerable bad contracts, MVP-balloting mistakes and even character judgments passed on hitters far and wide.

The major problem with RBI is that it's highly, highly context dependent. It's as much a reflection of batting order, playing environment and the quality of surrounding hitters as it is an indication of skill on the part of the batter. So when you eyeball a hitter's RBI total, you're really not learning much.

Sure, it's useful at the margins. That is, it's difficult to rack up 130 RBI in a season and somehow be a lousy hitter. On the other hand, it is possible to reach the vaunted 100-RBI mark and still be a liability. Let's take, for example, Ruben Sierra in 1993. That year he tallied 101 RBI, but he did so while posting an awful batting line of .233 AVG/.288 OBP/.390 SLG. Sierra may have notched a triple-digit ribbie total, but he was hurting his team all the while. If you put a player in the middle of a lineup, behind some potent OBP threats and have him play roughly a full season, it's difficult not to reach the 100-RBI mark, at least in the current run-scoring environment.

Accordingly, it's not wise to compare players based on their RBI totals. Last season, for example, Ryan Howard led the senior circuit in RBI with 149. However, it's also worth noting that Howard, among NL qualifiers, ranked only 15th in terms of the percentage of base runners driven in. Howard's a great hitter, but he led the league in RBI mostly because he led the league in RBI opportunities. Considering that the Phillies' number-two and number-three hitters last season had OBPs of .396 and .395, respectively, it's not surprising Howard had so many chances to plate runners.

As well, there's an (understandable) tendency in the minds of many fans to equate an RBI with the production of a run. Unless it's a home run, this simply isn't the case. The man who got on base in front of the man who drove him in also owns a piece of that run on the scoreboard. Many of us tend to forget that. There's reason to believe that hitters have varying skills when it comes to driving runners in, but RBI don't measure that ability. Wash your hands of it.

So with the limitations of AVG laid out and RBI thoroughly assailed, we'll move along. Next time we'll take a gander at some of the more advanced offensive statistics and how they provide you with better information.

jojo
04-17-2007, 10:41 PM
Dayn Perry... you big recycler you.... :cool:

westofyou
04-26-2007, 09:47 AM
http://msn.foxsports.com/mlb/story/6700964

Stats 101: Give VORP a chance


Last week in Stats 101, we examined some of the problems with traditional offensive statistics.

This week, we're here to build a better mousetrap. That means delving into the world of emergent statistical analysis. Some of the new-fangled stats you encounter — those with the scary-sounding acronyms — can be a little off-putting. They're hard to understand, and it's not entirely clear what they're measuring or how they're doing it.

Mostly, this is a problem of presentation. You don't need a slide rule, abacus or surpassing mastery of calculus to gain a working knowledge of these metrics; you just need a little patience.

Since we just finished bagging on traditional offensive stats, we may as well focus on how to improve upon them. As previously indicated, you can get a pretty good idea of how a player is faring at the plate by looking at his AVG/OBP/SLG "triple-slash" stats in tandem with some indication of playing time and on-the-fly adjustments for playing environment.

However, if you want to pry more deeply into how productive a player is, then a Baseball Prospectus stat called "Value Over Replacement Player" (VORP) is a great tool.

VORP measures, in runs, what a hitter is contributing over and above what his team could get from a "replacement" player. In this context, replacement has nothing to do with labor stoppages, strikebreakers or scabs; rather, it refers to the level of talent a team can dig up on the fly. Think of it this way: what happens when a team sustains an injury to one of its regular players? Barring a trade for someone of equal caliber, they'll plug in a bench player, call up a "not quite ready" prospect or minor-league veteran or maybe pick up a plug-in guy from the waiver wire. That's the baseline we're talking about. (Generally speaking, a replacement-level player provides around 80 percent of what a league-average player provides.)

So VORP, then, measures how much better a player is at the plate than the "break glass in case of emergency"-type player whom we just discussed.

There's another important thing to understand about VORP — it's all about position. When you look at, for instance, a player's AVG/OBP/SLG, you're not taking into account which position he's playing.

First basemen have much higher offensive standards than shortstops or catchers. That's because a first baseman's top priority is knocking the stuffing out of the ball, while a shortstop's top priority, in most cases, is fielding his position. To put a finer point on this, let's take a look at the cumulative batting line for each position from the 2000 season through the present:


Position            AVG/OBP/SLG
Pitcher             .143/.177/.186
Catcher             .259/.322/.401
First Baseman       .278/.361/.477
Second Baseman      .273/.335/.406
Third Baseman       .268/.336/.439
Shortstop           .269/.326/.401
Left Field          .277/.353/.464
Center Field        .270/.335/.430
Right Field         .276/.350/.463
Designated Hitter   .263/.346/.448


As you can see from the above numbers, positions like first base and the flank outfield spots place a premium on offense. Those premium, up-the-middle positions, however, don't emphasize the bat to such a degree.

This undeniable fact of the game is reflected in VORP. When you see a player's VORP listed, you're seeing how many runs he creates beyond what a theoretical "replacement" player would provide at that specific position. So a first baseman is working from a higher VORP baseline than a catcher is. This is how the game works: it's easier to dig up a first baseman with, say, 25-homer power than it is to find a catcher who can put up those numbers and provide major-league-caliber defensive skills at such a demanding position. That's one of the things that make VORP so much better than the traditional stats that aren't adjusted to reflect position.

VORP also accounts for differences in league environment and ballpark environment. As covered previously, those are two elements that can't be ignored when making serious evaluations of players. VORP doesn't measure a player's defense, but it does account for what he does at the plate and his base-stealing abilities. As well, since VORP is a cumulative statistic (i.e., the more you play, the more it affects your total VORP), it has a built-in mechanism for playing time, which distinguishes it from the rate stats we've covered previously.
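
To make the idea concrete, here is a minimal Python sketch. It is not Baseball Prospectus's actual VORP formula; it only illustrates the three ingredients described above: a position-specific baseline, a replacement level set at roughly 80 percent of average (the figure quoted earlier), and accumulation over playing time. The function name, the runs-per-plate-appearance inputs and the sample numbers are all hypothetical.

```python
def runs_over_replacement(player_runs_per_pa, position_avg_runs_per_pa,
                          plate_appearances, replacement_share=0.80):
    """Illustrative only: runs a hitter adds beyond a replacement-level
    player at the same position. Not the real VORP calculation."""
    # Replacement level is modeled as a fixed share of the positional average.
    replacement_rate = replacement_share * position_avg_runs_per_pa
    # Cumulative stat: the per-PA edge is multiplied by playing time.
    return (player_runs_per_pa - replacement_rate) * plate_appearances


# Hypothetical inputs: the same bat at two positions with different baselines.
same_bat = 0.150          # made-up runs created per plate appearance
first_base_avg = 0.135    # made-up positional averages
catcher_avg = 0.110

as_first_baseman = runs_over_replacement(same_bat, first_base_avg, 600)
as_catcher = runs_over_replacement(same_bat, catcher_avg, 600)
print(round(as_first_baseman, 1), round(as_catcher, 1))
```

With these made-up inputs the identical batting line is worth more runs over replacement at catcher than at first base, which is the positional effect the article describes.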

Now let's take a look at the top 10 VORPs in baseball from 2006:


Rank/Player VORP
1. Albert Pujols, 1B 85.4
2. Ryan Howard, 1B 81.5
3. Derek Jeter, SS 80.5
4. Travis Hafner, DH 79.7
5. Miguel Cabrera, 3B 78.7
6. David Ortiz, DH 76.8
7. Lance Berkman, 1B 70.1
8. Grady Sizemore, CF 69.1
9. Carlos Beltran, CF 68.5
10. Joe Mauer, C 66.9


So even though Pujols and Howard, as first basemen, worked from a higher VORP baseline, they still wound up as the most productive hitters in the game last season, each producing more than 80 runs above replacement level. Hafner certainly had better raw numbers than Jeter, but because Jeter is a shortstop he wound up with a better VORP. What about AL MVP Justin Morneau? He ranked only 26th in MLB in VORP last season.

Overall, there's nothing wrong with sticking to traditional measures, provided you make sensible adjustments to them and know which ones are useful. However, if you want a more sophisticated grasp of offensive performance, then give VORP a try. It's not as scary as it sounds.

westofyou
04-26-2007, 09:48 AM
http://msn.foxsports.com/mlb/story/6740056

Stats 101: Don't trust the W

Dayn Perry



In the previous edition of Stats 101, we took a look at how to improve upon traditional offensive statistics, and this time we'll shift our focus to the run-prevention side of things. To begin exploring the world of pitching and defense, we'll first take a look at what's wrong with the statistics we usually depend upon to measure prowess on the mound or with the glove.

Let's begin with pitcher wins and losses. This is perhaps the most useless, least informative of traditional statistics. Sometimes, whether a pitcher wins or loses a ballgame depends upon the quality of his pitching. More often than not, however, it depends upon run support, the defense playing behind him, his teammates in the bullpen and the quality of the opposing starting pitcher. We've all seen a guy take the mound, slog through five innings, give up four runs and "earn" the W thanks to shutdown relief work and healthy run support. We've also seen a guy pitch a near gem, holding the opposition to a single run for almost the entire game, but then take the loss solely because his counterpart was just a bit better. So therefore he "earns" the loss.

A notable — and extended — example of the latter phenomenon is Nolan Ryan in 1987. That year, he won the NL ERA title with a 2.76 mark and allowed only 3.19 runs-per-game, but he finished a paltry 8-16, mostly because of the miserable run support his Houston Astros teammates gave him. Was Ryan a bad pitcher that season? Of course not. He did what a pitcher is supposed to do — keep runs off the board. That his offense didn't hit for him was utterly beyond his control.

Too often, we make value judgments based on pitcher win-loss records. A starter may have a low runs-per-game, but since he's not getting the wins we assume he lacks some inner fortitude necessary to get the job done. That may be true in isolated instances, but most often it's nonsense.

On the other side of things, take White Sox right-hander Stan Bahnsen in 1972. That season, Bahnsen won 21 games and posted a runs-per-game of 3.82. It sounds like a fine season until you learn that the average American League hurler in 1972 gave up only 3.47. So Bahnsen won 21 games, but in terms of preventing runs he was comfortably worse than the average pitcher in his league. If you gave Bahnsen league-average run support in his starts that season, then he would've finished with a 17-20 record, which would've been more indicative of the quality of his pitching.

Or you could look at Randy Johnson in 2006. The Big Unit's 17 wins ranked fourth in the AL last season, but his runs-per-game (5.49) was more than a half-run worse than the AL average (4.97). Johnson was able to win 17 games because his team, the Yankees, boasted the best offense in baseball.
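
The comparison in the Bahnsen and Johnson examples is simple subtraction, and a short Python sketch makes it explicit. The runs-per-game and league-average figures are the ones quoted above; the function name and the sign convention are assumptions made for illustration.

```python
# (pitcher, season, runs allowed per game, league average) as quoted above
seasons = [
    ("Stan Bahnsen", 1972, 3.82, 3.47),
    ("Randy Johnson", 2006, 5.49, 4.97),
]


def runs_vs_league(ra, league_ra):
    """Positive result = allowed more runs per game than the league average."""
    return round(ra - league_ra, 2)


for name, year, ra, league_ra in seasons:
    gap = runs_vs_league(ra, league_ra)
    print(f"{name} {year}: {gap:+.2f} runs per game vs. league")
```

Both pitchers come out on the wrong side of zero despite their win totals, which is the point of the two examples.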

Over the course of an entire career, wins and losses sometimes (but not always) tend to balance out in terms of run support, bullpen support and the other factors listed above. In the span of a single season, however, they absolutely do not. It's a shame that Cy Young voters pay so much attention to win-loss records, but that doesn't mean you have to make the same mistake.

Relievers have the save and the hold. The usefulness of the save statistic is undermined by the fact that a closer can notch one by protecting a three-run lead for a single inning. Suffice it to say, that's not a high-leverage situation, and it shouldn't be considered a save. An improvement is the "tough save," which entails saving the game with the tying run on base, but it's not often you hear about tough saves. In any event, the save rule should be amended to eliminate the "one inning, three-run lead" variant.

As for the hold, it's a well-meaning attempt to quantify the contributions of middle relievers, but it misses the mark badly. A pitcher garners a hold whenever he enters the game in a save situation, records at least one out, and leaves the game without ever having given up the lead. He can also earn a hold if he pitches at least three innings without giving up a lead. There's nothing particularly wrong with that first set of parameters (the second allowance leaves too much room for lousy pitching), but the problem is that it's not the only hold rule.

The aforementioned hold rule is the one developed by STATS, Inc. The other, newer, and inferior one was developed by Sports Ticker and USA Today. It stipulates that all a pitcher needs to do to log a hold is enter the game in a save situation and leave the game with the save situation intact.

Canny readers will observe that the pitcher need not record an out. So it's possible for a reliever to start an inning with a one-run lead, load the bases, exit the game with no one out and adorn his stat sheet with a hold. Sports Ticker, to its credit, doesn't assign holds to pitchers who protect a big lead for three or more innings, but the utter silliness of earning a hold without retiring a batter renders their version useless. In summary, both versions of the hold rule have weaknesses, but the Sports Ticker version is the less desirable of the two. Moving on …
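
The difference between the two hold rules is easier to see side by side. The Python sketch below models only the conditions described in the article, not the full official scoring criteria; the `ReliefOuting` class and its field names are hypothetical, and whether a save situation is "intact" on exit is taken as a given input rather than computed.

```python
from dataclasses import dataclass


@dataclass
class ReliefOuting:
    entered_in_save_situation: bool
    outs_recorded: int
    gave_up_lead: bool
    save_situation_intact_on_exit: bool


def stats_inc_hold(o: ReliefOuting) -> bool:
    # Variant 1: save situation, at least one out, lead never surrendered.
    standard = (o.entered_in_save_situation
                and o.outs_recorded >= 1
                and not o.gave_up_lead)
    # Variant 2: at least three innings (nine outs) without losing the lead.
    long_relief = o.outs_recorded >= 9 and not o.gave_up_lead
    return standard or long_relief


def sports_ticker_hold(o: ReliefOuting) -> bool:
    # No out required: enter in a save situation, leave with it intact.
    return o.entered_in_save_situation and o.save_situation_intact_on_exit


# The article's scenario: one-run lead, bases loaded, nobody retired.
loaded_and_lifted = ReliefOuting(True, 0, False, True)
print(stats_inc_hold(loaded_and_lifted), sports_ticker_hold(loaded_and_lifted))
```

Under these assumptions the reliever who loads the bases without recording an out gets no hold from the first rule but does get one from the second.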

You may have noticed above that we generally relied on Runs-Per-Game or Run Average (RA) rather than the far more familiar Earned Run Average (ERA). There's a reason for this: ERA is also a highly flawed measure.

This can be said for two reasons: one, ERA is too forgiving toward the pitcher following the commission of a fielding error, and, two, the scoring of errors is an imperfect process that yields imperfect information. The matter of ERA and errors is the point at which pitching and defense, insofar as this discussion is concerned, overlap.

On the first point, the earned-run rule assumes that the pitcher's job ends as soon as a fielding error is committed. Why else would the rule let the pitcher off the hook for whatever follows a botched out? If a major-league hurler can't shake off a fielding miscue, buckle down and perform as he should, then the numbers should reflect that failing. ERA doesn't.

On the second point, the scoring of errors — i.e., determining when a play can be made with "normal effort" — is a highly subjective process. What's an error to the eyes of one official scorer in one park might not be to the eyes of another one in another park. Moreover, errors, while they assess the ability of a player to make the routine plays, have no mechanism to evaluate fielding range, which is just as important. Put another way, a slow-of-foot fielder with a bad first step can't make an error on a ball he never got to in the first place; the fielder with better range can make an error on that same ball. In essence, the error/earned-run rule penalizes pitchers who toil in front of fielders with poor range because those fielders aren't getting to as many balls. Again: No reach-y ball, no make-y error on play.
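
For reference, Run Average and ERA are computed the same way and differ only in which runs they count. The Python sketch below uses the standard per-nine-innings scaling; the season line in it is hypothetical, chosen only to show how unearned runs open a gap between the two numbers.

```python
def per_nine(runs, innings_pitched):
    """Runs allowed per nine innings."""
    return 9 * runs / innings_pitched


# Hypothetical season line: 200 innings, 90 runs allowed, 75 of them earned.
innings, runs, earned_runs = 200.0, 90, 75

ra = per_nine(runs, innings)           # Run Average counts every run
era = per_nine(earned_runs, innings)   # ERA drops runs scored as unearned
print(f"RA {ra:.2f}, ERA {era:.2f}, gap {ra - era:.2f}")
```

The more unearned runs a pitcher allows after errors, the more flattering his ERA looks next to his RA.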

Fortunately, we have better tools at our disposal than pitcher wins and losses, ERA and errors. Next time, we'll take a look at them.

Redsland
04-26-2007, 10:34 AM
The aforementioned hold rule is the one developed by STATS, Inc. The other, newer, and inferior one was developed by Sports Ticker and USA Today. It stipulates that all a pitcher needs to do to log a hold is enter the game in a save situation and leave the game with the save situation intact.

Canny readers will observe that the pitcher need not record an out. So it's possible for a reliever to start an inning with a one-run lead, load the bases, exit the game with no one out and adorn his stat sheet with a hold. Sports Ticker, to its credit, doesn't assign holds to pitchers who protect a big lead for three or more innings, but the utter silliness of earning a hold without retiring a batter renders their version useless.
This answers our question of a week or two ago about how it was that Mike Stanton earned a hold in a game in which he failed to record an out.

Eric_Davis
04-26-2007, 05:21 PM
http://msn.foxsports.com/mlb/story/6740056

Over the course of an entire career, wins and losses sometimes (but not always) tend to balance out in terms of run support, bullpen support and the other factors listed above. In the span of a single season, however, they absolutely do not. It's a shame that Cy Young voters pay so much attention to win-loss records, but that doesn't mean you have to make the same mistake.

Dayn Perry

But, here he has made a mistake. The writer here is trying to skew with an opinionated word here and there his personal views on a subject that he knows is proven to be true. The word he chooses to change from the original text (he probably got it from Bill James, who wrote the exact same paragraph in the early 80's) is the word "sometimes", when the words that were there before were akin to "most of the time" followed by, "but not always" because nothing is ever always true, and in the case of the Bill James study on the subject, the smaller the sample size, i.e., short careers, the "but not always" was a more frequent occurrence.

The timing of this follows the game-thread discussion last night of Ben Sheets and his poor win-loss record.

Everyone (but not everyone) believes Ben Sheets is always a victim of poor run support. Well, here's the facts:

Covering his rookie year 2001 through 2006 he had the following number of starts (he's never appeared in relief):

25, 34, 34, 34, 22, 17

That's enough starts to use his win-loss record as a measure of his quality as a pitcher provided he was given a fair amount of run support.

The following shows the run support for: (A) him as a starter, (B) All Milwaukee starters, which include his starts, (C) Average number of runs scored per game in the NL

YEAR......2001.....2002.....2003.....2004.....2005.....2006
NLAvg.....4.75.....4.49.....4.65.....4.67.....4.50.....4.80
MilStar...5.26.....4.08.....4.75.....4.39.....4.89.....4.96
Sheets....5.83.....4.28.....4.69.....3.53.....3.73.....4.33


So, in his 25 starts in 2001 he got over a run per game more than the average NL team scored and more than half a run more than the average Brewer pitcher.

The next year in his 34 starts, he got about a fifth of a run more than the average Brewer pitcher (take his 34 starts out and it would be a quarter more), and 1/5th of a run less than the average NL team.

The next year in his 34 starts he's right at league average and average for a Brewer pitcher.

So, through 93 starts, even though he's received more run support than the average Brewer or the average National League would score, his W-L record is an abysmal 33-39.

There's a reason for this that I'll get to at the end and it's because of what's in his head....what's between the ears.

So, the next year, 2004, in his 34 starts he had poor run support, more than a run less than the NL average, and almost a run less than the average Brewer starter.

Again, over a career, these things balance themselves out.

The next year in 22 starts (2005), he was 3/4th's of a run less than the NL average and more than a run less than the average Brewer pitcher.

Last year in 17 starts he was half a run less than the NL average and over half a run less than the average Brewer starter.

In this stretch of 73 starts (missing 25 starts doesn't make you a good pitcher either as it becomes part of who you are, AKA Mark Prior and Kerry Wood) he had poor run support at about 3/4's of a run less than the average NL team would score. His win-loss record for these 73 starts was 28-30, which was pretty good for the run support that he got, but remarkably consistent for his career, as it doesn't seem to matter how many runs he gets.
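
The year-by-year comparisons above can be rolled up with a start-weighted average. This Python sketch uses the starts, NL averages and Sheets figures from the table in this post (the Milwaukee-starters row is left out for brevity); the function name and the choice to weight by starts are assumptions, not something stated in the post.

```python
starts = {2001: 25, 2002: 34, 2003: 34, 2004: 34, 2005: 22, 2006: 17}
nl_avg = {2001: 4.75, 2002: 4.49, 2003: 4.65, 2004: 4.67, 2005: 4.50, 2006: 4.80}
sheets = {2001: 5.83, 2002: 4.28, 2003: 4.69, 2004: 3.53, 2005: 3.73, 2006: 4.33}


def weighted_gap(years):
    """Start-weighted average of (Sheets' run support - NL average)."""
    total_starts = sum(starts[y] for y in years)
    weighted = sum(starts[y] * (sheets[y] - nl_avg[y]) for y in years)
    return weighted / total_starts


early = weighted_gap([2001, 2002, 2003])
late = weighted_gap([2004, 2005, 2006])
print(f"2001-03: {early:+.2f} runs per start vs. NL average")
print(f"2004-06: {late:+.2f} runs per start vs. NL average")
```

The sign of each result shows whether the stretch was one of above-average or below-average support.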

Here's why Sheets is what he is:

In his 62 wins, his ERA is 2.13 with an OPS of .597.

Those are Hall-of-Fame numbers. He can and will pitch as well as anyone in the Majors......36% of the time.

42% of the time he'll pitch like crap. His 71 losses have an ERA of 5.77 with an OPS of .834. That's utterly abysmal. He would be hung out to dry in REDSZONE. It's as if he was facing a lineup of 9 Carl Yastrzemskis (lifetime OPS of .841).

His other 38 starts, 22%, that were no-decisions had a respectable ERA of 3.52. Let's say he somehow could have gotten a run or two more scored for him while he was out on the mound for every one of those games and everything would have bounced right and his relievers were perfect the rest of the way, then he might have won 14 of those starts (with no more losses). That would still only give him a record every year where he wins just one more game than he loses.

Here's Sheets' main problem:

Once Sheets walks a batter, or hits a batter, or gives up a single, or a guy reaches first on an error, he changes. Throughout his entire career when there's a runner on first, and no one else is on base, regardless of how many outs there are, he chokes. He will consistently give up the extra-base hit in this situation.

His line with a runner on first regardless of how many outs:

.283 AVG; .513 SLG

If First and Third are occupied, he really chokes:

.353 AVG; .510 SLG

Sheets doesn't walk people and certainly not free-swingers like the REDS have usually been, so he tends to dominate us.

Here's an interesting stat:

Sheets' favorite catcher?

Chad Moeller

He's caught 52 of his 171 starts going into this year and opponents had a .649 OPS when he and Sheets worked together. When Sheets works with other catchers his OPS against is .770.

Moeller has caught 45% of all batters Sheets has faced in his career.

reds1869
04-26-2007, 05:28 PM
All stats are simply "snapshot" tools that give a mathematical look at one or several aspects of a player's/team's performance. None of them are perfect but they can all be useful.

RedsManRick
04-26-2007, 06:09 PM
Great post ED. I posted something in a similar vein a few weeks back (http://www.redszone.com/forums/showthread.php?t=56487). That is, looking at pitch stats in the aggregate often tells a misleading story. Not all 3.00 ERAs are created equal. 30 starts of 3.00 ERA will tend to produce more team wins than will 15 starts of 0.00 ERA and 15 starts of 6.00 ERA.

I've also noticed anecdotally with Sheets that he's less consistent than many pitchers of his aggregate quality. A CG shutout one night, a 5 IP, 4 ER performance the next. Yes, it averages out quite well, but it doesn't mean he put his team in a position to win every game.

I realize you've taken things to an even more granular level. I didn't realize how poorly Sheets fared with men on base, though I am curious how those numbers compare with other pitchers in those situations. Does he go from Cy Young to average or from Cy Young to Joey Hamilton?

jojo
04-26-2007, 07:37 PM
There's a reason for this that I'll get to at the end and it's because of what's in his head....what's between the ears.



Here's Sheets' main problem:

Once Sheets walks a batter, or hits a batter, or gives up a single, or a guy reaches first on an error, he changes. Throughout his entire career when there's a runner on first, and no one else is on base, regardless of how many outs there are, he chokes. He will consistently give up the extra-base hit in this situation.

His line with a runner on first regardless of how many outs:

.283 AVG; .513 SLG

If First and Third are occupied, he really chokes:

.353 AVG; .510 SLG

Sheets doesn't walk people and certainly not free-swingers like the REDS have usually been, so he tends to dominate us.

I'm not sure I understand your complete argument, but the part of it that argued Sheets isn't mentally tough wasn't very compelling. For instance, it was argued that Sheets chokes with a runner on first or with runners on first and third. But looking at all of his splits, he apparently regains his composure when a runner is on second, or when one is on third, or when runners are on first and second, or when runners are on second and third, and he even improves with the bases loaded.

Basically you're mistaking the effects of sample size for a meaningful trend. This is illustrated by the regression to the mean seen when viewing Sheets' "clutchiness" with larger samples. For instance consider this:

Bases empty: PA: 2787; .252/.291/.408; OPS: .699;
Runners on: PA: 1855; .268/.314/.434; OPS: .748;
RISP: PA: 1080; .257/.313/.374; OPS: .687;

Compare those PA totals with the 102 PA associated with runners on first and third. Nothing in the above splits supports the argument that Sheets isn't mentally tough. In fact, his lowest OPS is with RISP. That argument is further refuted by his "clutch" splits. For instance:

2 outs, RISP: .246/.319/.357; OPS: .676;
late & close: .251/.276/.428; OPS: .704;

The rest of his clutch stats pretty much are consistent with the above. In fact, these splits argue Sheets is a mentally tough competitor since they indicate the situation does not affect his performance...
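
For anyone checking the arithmetic, OPS is just on-base percentage plus slugging, so the OPS figures above can be reproduced from the slash lines. A small Python sketch follows; the split values are the ones listed in this post, and the helper name is hypothetical.

```python
def ops(slash_line):
    """On-base plus slugging from an 'AVG/OBP/SLG' string."""
    _avg, obp, slg = (float(part) for part in slash_line.split("/"))
    return round(obp + slg, 3)


splits = {
    "Bases empty": ".252/.291/.408",
    "Runners on": ".268/.314/.434",
    "RISP": ".257/.313/.374",
    "2 outs, RISP": ".246/.319/.357",
    "Late & close": ".251/.276/.428",
}
for situation, line in splits.items():
    print(f"{situation}: OPS {ops(line):.3f}")
```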

westofyou
04-26-2007, 08:37 PM
For instance, it was argued that Sheets chokes with a runner on first or with a runner on first and third.

Is that "choking" or getting a different result pitching from the Stretch as opposed to from a Windup?

jojo
04-26-2007, 09:03 PM
Is that "choking" or getting a different result pitching from the Stretch as opposed to from a Windup?

I first thought it could've been a stretch issue but he doesn't have the absurd splits in all circumstances where runners were on (for instance look at his numbers with runners on first and second).

Eric_Davis
04-27-2007, 01:45 AM
Jojo and WestOfYou. I'm doing this through a tinted window. I haven't seen Ben Sheets personally enough to make any judgements about him. All I have to work with are his stats.

And seeing someone on TV is nowhere near the same as seeing him in person repeatedly every day, also.

But, Ben Sheets certainly earned his 71 losses with that 5.77 ERA. I don't need to see him to know that there's some reason for this that lies in the makeup of Ben Sheets.

After I wrote the above Post I know I left a heck of a lot out. I wonder myself if it's from the stretch vs. the windup. I saw the other numbers, too, but when there's someone on first with 2nd base open (this eliminates the bases-loaded and 1st & 2nd scenarios), the numbers are alarming. With over 901 plate appearances from this situation, it's not a coincidence or a small sample.

There's something that Ben Sheets does differently when trying to hold the runner on first that affects his pitching.

Now, that aside, Sheets is in his 7th year. He has to be a better pitcher now than he was 4, 5, and 6 years ago. The quality of all of his pitches should be better while he has probably added one or two pitches, too.

I didn't check it out as it would have taken hours upon hours further, but I'd venture to guess that the last three years, when he's been healthy, have been producing better numbers when there's a runner on 1st with 2nd base open. Well, I'm going to go check it out real quick, now.

Then there's the issue of the 27 missed starts the last two years. Now that's affecting his consistency in a different way. Don't know what it is, but it's got to be doing something.

I'd like Chad Moeller's opinion on this subject. He knows the answer.

Anyway, overall his career has been an average one. Right now, I'd say he's above average. If he can just stay healthy, then right now, I'd say he's in the group of the 2nd best 10 starters in the league.

Of course, he is on my fantasy team. :)

Eric_Davis
04-27-2007, 02:23 AM
After checking his numbers for the latter years when there's a man on 1st with 2nd base open, it does reflect what I said above about his injuries possibly affecting him and that he is better than he was during his first three years in the league in this area.

In 2004 he had his 3rd consecutive 34-start season (and a brilliant 2.70 ERA) just as a reminder. 161 plate appearances he faced a runner on 1st with 2nd base open. He walked 1 frickin' batter out of all of them. (It was probably the ump's fault.). Though not at Ben Sheets' best, his numbers were better than his career average in this spot. Batters hit .268 and slugged .425 against him.

In 2005, he continued to get better, but he finished with only 22 starts. He had his first health issues. But before all that, he solved his problem with a runner on 1st and 2nd base open. Again, always around the plate, he walked only 4 batters in 104 plate appearances. Batters hit only .240 against him, and slugged a Mendoza-like .365.

In 2006, he injured himself the first day of Spring Training and struggled through 17 starts while compiling 66 plate appearances with a runner on 1st and 2nd base open. Remarkably, he only walked one batter, but he regressed back to his early days allowing hitters to bat .310 and slug .603 against him.

Despite that tremendous Opening Day Complete Game earlier this month, he's not done so well with a runner on 1st and 2nd base open this year. With already 26 plate appearances in those situations, he's walked 3 batters. Unusual for him. Batters have hit .286 and slugged .523 against him.

I'll be watching him through the internet all year as I have him on my fantasy team, but if he's injured still, then that should help the REDS catch the Brewers.

RedsManRick.....I don't know what the norm is for National League Starters in that category.

Eric_Davis
04-27-2007, 02:50 AM
Rick, I really can't find the stat for league average there, but I tried to come up with a similar pitcher.

I picked Aaron Harang since he started games one year after Sheets did. He only has 3/4th's the number of Plate Appearances as Sheets with a runner on 1st and 2nd base open in his career (665 plate appearances), but it's just a point of reference.

Harang's numbers in that situation:

With a runner on 1st only: .287 Avg and .443 Slg

With runners on 1st & 3rd: .242 Avg and .348 Slg

Harang's career win-loss going into this year is 47-43. (6-7 was with the A's)

Hey, Jojo. Guess who's 2nd in the NL in run support so far this year? Yes, our own Aaron Harang.....and he's 3-0. You gotta love that.

Give me Harang over Sheets right now for the rest of their careers.

jojo
04-27-2007, 11:02 AM
Dayn Perry basically argued that a pitcher’s W-L record is a useless metric for evaluating the pitcher’s performance largely because a pitcher’s record depends upon a huge number of variables that are out of the pitcher’s control. Perry's reasoning was challenged by arguing that Sheets’ W-L record is representative of his ability because Sheets’ record is a product of his mental makeup. That argument was largely based upon Sheets' W-L splits and his situational splits with runners on first and with runners on first and third. I agree with Perry that W-L records are useless and don't agree with the conclusion that Sheets has a poor mental makeup. Sheets’ splits simply don't support the notion that he's not a gamer.

Here’s the first part of the argument again where Sheets’ W-L splits are the focus:


Here's why Sheets is what he is:

In his 62 wins, his ERA is 2.13 with an OPS of .597.

Those are Hall-of-Fame numbers. He can and will pitch as well as anyone in the Majors......36% of the time.

42% of the time he'll pitch like crap. His 71 losses have an ERA of 5.77 with an OPS of .834. That's utterly abysmal. He would be hung to dry in REDSZONE. It's as if he was facing a lineup of 9 Carl Yastrzemski's (lifetime OPS of .841).

His other 38 starts, 22%, that were no-decisions had a respectable ERA of 3.52. Let's say he somehow could have gotten a run or two more scored for him while he was out on the mound for every one of those games and everything would have bounced right and his relievers were perfect the rest of the way, then he might have won 14 of those starts (with no more losses). That would still only give him a record every year where he wins just one more game than he loses.

Here's Sheets' main problem:

Once Sheets walks a batter, or hits a batter, or gives up a single, or a guy reaches first on an error, he changes. Throughout his entire career when there's a runner on first, and no one else is on base, regardless of how many outs there are, he chokes. He will consistently give up the extra-base hit in this situation.

First, given the numbers above, Sheets has ranged from superhuman (wins splits) to merely awesome (about a run better than a league average starter based upon ERA in his ND splits) in almost 60% of his starts. Yet somehow he has a losing record as a starter????? To me Sheets is the poster boy for Dayn Perry’s argument. Isn’t that the point? W-L records often don’t reflect the pitcher’s true performance in the past and they have no ability to predict a pitcher’s future performance. It’s a flawed metric.

Let’s address the argument that Sheets’ W-L splits indicate he’s a choker.

Sheets:
Wins: G: 62; ERA: 2.13;
Losses: G: 71; ERA: 5.77;
ND: G:38; ERA: 3.52;

Harang:
Wins: G: 50; ERA: 2.19;
Losses: G: 43; ERA: 6.27;
ND: G:40; ERA: 5.45;

R. Clemens:
Wins: G:348; ERA: 1.72;
Losses: G: 178; ERA: 5.78;
ND: G:165; ERA: 3.73;

R. Johnson:
Wins: G: 280; ERA: 1.86;
Losses: G: 148; ERA: 5.89;
ND: G:127; ERA: 3.73;

P. Martinez:
Wins: G: 206; ERA: 1.49;
Losses: G: 92; ERA: 5.46;
ND: G:141; ERA: 3.56;

It appears that Sheets is in good company (these are just the first four guys I looked at) since three of the list are future first ballot HOFers. I don’t think anyone would suggest the pitchers on this list lack mental toughness.

Nobody has argued that pitchers don’t generally pitch better in games they win versus ones they lose. The issue is that W-L record very often doesn’t reflect a pitcher’s true performance. A pitcher can pitch well and lose. He can pitch awful and win. Finally, with the exception of Harang, every guy on the list has been significantly better than league average in their NDs yet W-L records completely ignore this significant portion of their careers.


After I wrote the above Post I know I left a heck of a lot out. I wonder myself if it's from the stretch vs. the windup. I saw the other numbers, too, but when there's someone on first with 2nd base open (this eliminates the bases-loaded and 1st & 2nd scenarios), the numbers are alarming. With over 901 plate appearances from this situation, it's not a coincidence or a small sample.

Sheets had 775 PA with a runner only on first in his career. I'm not sure where the 901 comes from. There are a couple of things I'm struggling with concerning the logic above. First, it seems to be arguing that Sheets only pitches from the stretch with runners on first. If so, that’s not accurate. For instance, Sheets pitches from the stretch with runners on first and second too (check out mlb.tv to verify his use of the stretch with runners on the bases). If Sheets has issues with pitching out of the stretch, all splits associated with pitching out of the stretch should be similarly affected. The second issue I'm struggling with has already been addressed: sample size and variability. I think you’re misinterpreting the effects of sample size as a true trait:


There's something that Ben Sheets does differently when trying to hold the runner on first that affects his pitching.

But here are his *runner on first* splits from his last 4 seasons:

'03: .265/.295/.560; PA: 182;
'04: .250/.261/.409; PA: 138;
'05: .226/.261/.369; PA: 90;
'06: .362/.388/.723; PA: 53;

There is nothing consistent there. Given the variability (who knows the underlying reasons be they simple randomness, injury-related or mixtures of lots of things), 775 PA may not be a large enough sample size to reliably make a conclusion. In any event, this split certainly isn't sufficient to back the claim that Sheets has issues with mental toughness (especially when the majority of his *clutch* splits collectively refute the claim).
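
To put rough numbers on the sample-size point, the Python sketch below computes a binomial standard error for an observed rate. It is a simplification under stated assumptions: each plate appearance is treated as an independent trial, plate appearances stand in for at-bats as the denominator, and .270 is an assumed underlying rate picked for illustration. The sample sizes are the ones quoted in this thread.

```python
import math


def standard_error(rate, trials):
    """Binomial standard error of an observed rate over `trials` chances."""
    return math.sqrt(rate * (1 - rate) / trials)


# Sample sizes quoted in the thread; .270 is an assumed "true" rate.
for label, n in [("2006 runner-on-first", 53), ("first and third", 102),
                 ("career runner-on-first", 775), ("career RISP", 1080)]:
    se = standard_error(0.270, n)
    print(f"{label}: n={n}, roughly +/- {2 * se:.3f} at two standard errors")
```

The band narrows only with the square root of the sample size, so a single-season split of a few dozen plate appearances can swing by a hundred points or more through chance alone.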


Anyway, in summary: The argument that Sheets’ mediocre record is a product of a lack of mental toughness can’t be supported by his splits. One of the first statements by Dayn Perry in this series alluded to the importance of context. The argument for the importance of W-L records based upon Ben Sheets suffers greatly because of a lack of context IMHO. I suspect Ben Sheets’ real problem regarding his W-L record is that he’s played his career in Milwaukee…

Eric_Davis
04-27-2007, 02:11 PM
Jojo, you and Dayn must use the same editors. Perry clearly made his case that W-L is not a valid measure for pitching performance under "small samples only". He repeatedly referenced single years as examples. Which, as I've also repeatedly said, I agree with him there 100%. He tried to slip in that discussion the word "sometimes" for "most of the time" when referring to career numbers that have enough time to balance out run support.

I know where Dayn Perry got his information. Bill James was the first one to discuss the facts regarding this issue at length, way back in the early 80's.

Perry abused his journalism responsibility for stating the truth and flat out "made up in his own mind" that he was going to change what is proven to be true, the "most of the time", to what is false, the "sometimes".

Career Win-Loss records, if it's a long enough career, do measure a pitcher's abilities to win games.

Eric_Davis
04-27-2007, 02:17 PM
Sheets had 775 PA with a runner only on first in his career. I'm not sure where the 901 comes from.




The other 126 plate appearances are with runners on 1st and 3rd. That's still a situation where there's a runner on 1st with 2nd base open. For all the examples where I said "runner on 1st with 2nd base open" I included all the data from 1st and 3rd, too, as it's a similar scenario where he's still trying to keep the double-play in check and keep the runner from 1st from getting a good jump.

Eric_Davis
04-27-2007, 02:28 PM
I think you’re misinterpreting the effects of sample size as a true trait:



But here are his *runner on first* splits from his last 4 seasons:

'03: .265/.295/.560; PA: 182;
'04: .250/.261/.409; PA: 138;
'05: .226/.261/.369; PA: 90;
'06: .362/.388/.723; PA: 53;



You used the wrong numbers. You were talking about pitching from the stretch, yet the numbers above aren't the numbers where there's a "runner on 1st with 2nd base open". You have to add in the numbers for 1st and 3rd, too. That's another 126 plate appearances. For example, in 2003 with runners on 1st and 3rd, he had 34 plate appearances and batters hit .414 and slugged .690. In 2004, batters hit .381 and slugged .524 with runners on 1st and 3rd in 23 plate appearances.
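
To show what adding the two splits together does, here is a quick sketch using the 2003 figures above. Weighting batting average by PA is an approximation assumed for illustration (average is really per at-bat, and the AB counts aren't listed here), and the function name is made up for the example.

```python
def combine_splits(splits):
    """Pool (average, plate_appearances) splits, weighting each by its PA.

    Assumption: PA is used as the weight because AB counts are not available.
    """
    total_pa = sum(pa for _, pa in splits)
    weighted_avg = sum(avg * pa for avg, pa in splits) / total_pa
    return weighted_avg, total_pa


# 2003: runner on 1st only (.265 in 182 PA) plus runners on 1st and 3rd (.414 in 34 PA)
combined_avg, combined_pa = combine_splits([(0.265, 182), (0.414, 34)])
print(f"{combined_avg:.3f} over {combined_pa} PA")
```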

Eric_Davis
04-27-2007, 02:33 PM
Given the variability (who knows the underlying reasons, be they simple randomness, injuries, or a mixture of lots of things), 775 PA may not be a large enough sample size to reliably draw a conclusion. In any event, this split certainly isn't sufficient to back the claim that Sheets has issues with mental toughness (especially when the majority of his *clutch* splits collectively refute the claim).




But, it's 901 plate appearances, and that's a huge sample size.

We'll know what type of hitter Josh Hamilton is going to be after he has 901 plate appearances....about July of next year. Or is that not enough of a sample size to determine that?

jojo
04-27-2007, 03:16 PM
Jojo, you and Dayn must use the same editors.

There's no need to make this personal. This has been an interesting discussion IMHO.

Why is Perry being made out to be such a pariah? He's an accomplished baseball writer who cut his teeth at BP, among other places.


Perry clearly made his case that W-L is not a valid measure for pitching performance under "small samples only". He repeatedly referenced single years as examples. As I've also repeatedly said, I agree with him there 100%. But he tried to slip the word "sometimes" into that discussion in place of "most of the time" when referring to career numbers that have enough time to balance out run support.

It seems to me that Perry is pretty clear about what he means without qualifying or softening his opinion with modifiers as he wrote this: "Let's begin with pitcher wins and losses. This is perhaps the most useless, least informative of traditional statistics."

Basically the title of the piece says it all: "Don't trust the W"...

Here's the comment you're specifically alluding to:

"Over the course of an entire career, wins and losses sometimes (but not always) tend to balance out in terms of run support, bullpen support and the other factors listed above."

I just don't see the conspiracy or the plagiarism here. Perry is basically saying W-L records are worthless because they are suspect both on a season-to-season basis AND even over the long haul of a pitcher's career.


I know where Dayn Perry got his information. Bill James was the first one to discuss this issue and go into the facts at length, way back in the early '80s.

I'm not sure why this invalidates Perry's argument.


Perry abused his journalistic responsibility to state the truth and flat out "made up in his own mind" that he was going to change what is proven to be true, the "most of the time", to what is false, the "sometimes".

You're really losing me here. He made a valid argument whose conclusion was supported by its premises. I guess he is guilty of flat out "making up in his own mind", but it wasn't before considering the evidence.


Career Win-Loss records, if it's a long enough career, do measure a pitcher's abilities to win games.

I'm not trying to be snarky or cause a fuss, but let's pause for a minute and consider the big picture. Here are Sheets' career totals:

ERA: 3.85; FIP: 3.71; K/9: 7.70; BB/9: 1.88; K/BB: 4.08;

Those numbers are ALL significantly better than the major league average for starters and easily place him within the top 10 starters in the National League each year.
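
For anyone who hasn't seen these computed, here is a minimal sketch. The FIP formula is the commonly published one; the league constant (3.20 here) changes every season and is an assumption, and the counting totals in the usage line are made-up round numbers, not Sheets' actual career line.

```python
def pitcher_rates(ip, k, bb, hbp, hr, fip_constant=3.20):
    """Return K/9, BB/9, K/BB and FIP from counting stats.

    ip must be true innings (e.g. 200 and 1/3 innings is 200.333, not 200.1).
    fip_constant is a season-specific league constant; 3.20 is an assumed value.
    """
    k9 = 9 * k / ip
    bb9 = 9 * bb / ip
    k_bb = k / bb
    # FIP only uses outcomes the defense has no hand in: HR, BB, HBP and K.
    fip = (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + fip_constant
    return {"K/9": k9, "BB/9": bb9, "K/BB": k_bb, "FIP": fip}


# Hypothetical season: 200 IP, 180 K, 40 BB, 5 HBP, 20 HR
rates = pitcher_rates(ip=200, k=180, bb=40, hbp=5, hr=20)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

Note that wins, losses, and run support appear nowhere in the inputs.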

Do you really mean to argue that Sheets' 62-71 record as a starter is indicative of the type of pitcher he's been?

It's astonishing that Sheets was offered as an example to refute the idea that W-L records are useless measures of pitcher performance. You couldn't pick a better example to illustrate Perry's point IMHO.

jojo
04-27-2007, 03:27 PM
You used the wrong numbers. You were talking about pitching from the stretch, yet the numbers above aren't the numbers where there's a "runner on 1st with 2nd base open". You have to add in the numbers for 1st and 3rd, too. That's another 126 plate appearances. For example, in 2003 with runners on 1st and 3rd, he had 34 plate appearances and batters hit .414 and slugged .690. In 2004, batters hit .381 and slugged .524 with runners on 1st and 3rd in 23 plate appearances.

I didn't use the wrong numbers. I broke down his situational split with a runner on first by year to illustrate the extreme variation in his performance during that situation. I didn't break down his first and third split because, with so few PA a year, it would be meaningless.

Also, believe it or not, pitchers actually pitch from the stretch even when a runner is on second... Generally, if a runner is on base, a pitcher will pitch from the stretch in order to prevent a stolen base AND to limit the ability of a baserunner to take large leads. I'm not sure why you've focused in on just situations where there is a "runner on 1st with 2nd base open". Actually, I'm fairly certain why you did with Sheets...

jojo
04-27-2007, 03:32 PM
But, it's 901 plate appearances, and that's a huge sample size.

We'll know what type of hitter Josh Hamilton is going to be after he has 901 plate appearances....about July of next year. Or is that not enough of a sample size to determine that?

An appropriate sample size is largely determined by the variation in the sample...

In general more is better.
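
To put a rough number on that: if each plate appearance is treated as an independent trial with a fixed chance of a hit (a simplifying assumption, and using PA where AB would be more exact), the spread expected from chance alone is sqrt(p(1-p)/n). The .265 "true" average below is an arbitrary choice for illustration; the PA counts are single-season and career figures already mentioned in this thread.

```python
import math


def avg_standard_error(true_avg, n):
    """Binomial standard error of an observed average over n independent trials."""
    return math.sqrt(true_avg * (1 - true_avg) / n)


# 53 and 182 are single-season PA counts from the splits; 901 is the career total.
errors = {n: avg_standard_error(0.265, n) for n in (53, 182, 901)}
for n, se in errors.items():
    print(f"n={n}: chance alone moves the average by about +/- {se:.3f}")
```

Under those assumptions the noise shrinks as PA grows, but that only covers chance; it says nothing about year-to-year changes from injury or anything else.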

Look, you basically cherry-picked Sheets' situational splits and ignored the context of his remaining splits.

I don't know why Sheets has had some funky numbers with runners on first base. I can tell you this, though: it hasn't been a consistent phenomenon across his career, it has nothing to do with his mental toughness, and it certainly doesn't prove W-L records are good metrics for evaluating pitcher performance.

RedsManRick
04-27-2007, 05:53 PM
ERA: 3.85; FIP: 3.71; K/9: 7.70; BB/9: 1.88; K/BB: 4.08;


You could put those numbers up over 2000 IP and if you only went 4 IP per start you'd never win a game. Though perhaps not the exact point ED is making, the point I would make is that you can be a good pitcher with a poor W-L record and it's not just due to poor run support. There are a variety of reasons why. Maybe you've played in front of a horrible defense. Maybe you pitch a few great games and a lot of mediocre ones.

Wins are assigned based on a series of appearances, the way in which actual performances occur - not an aggregate of incongruous outings. Rate stats are useful, but they don't tell the whole story either. Should you use Wins and Losses to judge how "good" a pitcher is? Maybe not. But if you want to get picky, you probably shouldn't be asking such a horribly generic question in the first place.
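
The 4 IP scenario comes straight from the scoring rule: a starter generally has to complete five innings to be credited with the win. A simplified sketch (it ignores the shortened-game exception and the scorer's discretion in handing the win to a reliever; the function and parameter names are made up for illustration):

```python
def starter_qualifies_for_win(innings_pitched, left_with_lead, lead_held_to_end):
    """Simplified check: five or more innings, left ahead, lead never lost.

    innings_pitched is in true innings (4 and 2/3 is about 4.667, not 4.2).
    """
    return innings_pitched >= 5 and left_with_lead and lead_held_to_end


# Same quality of pitching, different length of outing
print(starter_qualifies_for_win(4.0, True, True))
print(starter_qualifies_for_win(6.0, True, True))
```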

jojo
04-27-2007, 06:47 PM
You could put those numbers up over 2000 IP and if you only went 4 IP per start you'd never win a game. Though perhaps not the exact point ED is making, the point I would make is that you can be a good pitcher with a poor W-L record and it's not just due to poor run support.

First, that's a hypothetical that is virtually certain never to happen.

Second, ED was actually making the opposite point, in essence arguing that Sheets' W-L record correctly indicates Sheets is not a good pitcher since he in fact had good run support. ED argued Sheets' W-L record is what it is because Sheets is a choker.


There are a variety of reasons why. Maybe you've played in front of a horrible defense. Maybe you pitch a few great games and a lot of mediocre ones.

Wins are assigned based on a series of appearances, the way in which actual performances occur - not an aggregate of incongruous outings. Rate stats are useful, but they don't tell the whole story either. Should you use Wins and Losses to judge how "good" a pitcher is? Maybe not. But if you want to get picky, you probably shouldn't be asking such a horribly generic question in the first place.

A couple of points...first it sounds like we pretty much agree that w-l record is pretty flawed...

Second, how best to evaluate a pitcher isn't a generic question IMHO.... this is probably illustrated in this thread.....

Third, it is really this straightforward: evaluating a pitcher based upon things he actually controls (e.g., rate stats, FIP, etc.) is a superior approach to evaluating him based upon metrics that are so dramatically influenced by factors out of the pitcher's control that they are incapable of accurately predicting a pitcher's future performance (e.g., ERA or W-L record).

I stand by my earlier statement 100%:


If you look at a pitcher's peripherals (K%, BB%, K/BB, GB%, HR/FB%, LOB%, and BABIP) in conjunction with a superior metric like FIP you really can get a very complete picture of why a pitcher performed the way he did and a really good feel for what you should expect he'll do in the future.

If you look at ERA and W-L record you'll get fooled a lot...

Anyway, that's my two cents....

Eric_Davis
04-27-2007, 07:57 PM
Here's the comment you're specifically alluding to:

"Over the course of an entire career, wins and losses sometimes (but not always) tend to balance out in terms of run support, bullpen support and the other factors listed above."

I just don't see the conspiracy or the plagiarism here.


Because he doesn't provide any evidence in this article for his assertion that, over the course of a pitcher's career, wins and losses are not indicative of his ability as a pitcher.

I've stated this already. Don't know why I have to state it again.

He doesn't provide any evidence in this article for his assertion that wins and losses are not indicative of a pitcher's ability over the length of a career.

Let me again state that I agree with all of the rest of his article.

Eric_Davis
04-27-2007, 08:02 PM
He made a valid argument whose conclusion was supported by its premises. I guess he is guilty of flat out "making up in his own mind", but it wasn't before considering the evidence.


How can you say this? Give me one example in his article where he used evidence to support the notion that someone's career win-loss numbers were not indicative of the quality of his pitching.

The fact is... he didn't. He just slipped the comment in, while the evidence he actually used only supports the point that a single year's win-loss record is not reflective of a pitcher's performance, and on that point I agree with him 100%.

Does anybody have an old Bill James book from the early '80s on this subject? I don't have mine anymore. I know they're rare.

Eric_Davis
04-27-2007, 08:07 PM
I just hope Sheets does well in his next start. He's on my fantasy team.

But, I hope the Brewers lose all the other games.