I sure hope the Red Sox don’t start the season with a wicked slump. If so, people might blame this blog.
When we solicited your questions for Bill James, the Sox’s data wizard, we didn’t know there’d be so many questions and that Bill would answer just about all of them. I hope he found some time over the past few days to do his actual work. (He was also featured this weekend on a little TV program called 60 Minutes.)
So if you are a baseball fan, feast on James’s answers, below. And if you are not a baseball fan, you may find that you have become one by the end of this Q&A. James’s answers are valuable on many levels, perhaps above all for his reality-based view of the world. For instance, when asked if his field of sabermetrics has “pretty much squeezed the last drop of new insights” out of baseball, here’s what James said:
We haven’t figured out anything yet. A hundred years from now, we won’t have begun to have the game figured out.
Thanks to Bill James and all of you for participating. Enjoy.
Q: Using various statistics over a player’s lifetime, and comparing them to “league norms,” is it possible to determine which players may have used steroids?
A: Absolutely not, no. The problem is that many different causes can have the same effects. If a player used steroids, this could cause his home run total to explode at an advanced age, but so could weight training, LASIK surgery, better bats, playing in a different park, a great hitting coach, or a good divorce. It is almost always impossible to infer specific causes from general effects.
Q: Can you tell us about a time when you thought numbers were misleading and why?
A: I would say generally that baseball statistics are always trying to mislead you, and that it is a constant battle not to be misled by them. If you want something specific — pitchers’ won-lost records. And if you want a specific pitcher, Storm Davis, 1989.
Q: Bill, I love your work and I am a longtime reader. I am also a Yankees fan and fear that your work with the Sox has tilted the rivalry in the Sawx’s favor. Could you please take Rob Neyer and go run the Royals? Please keep up the outstanding work and I hope Hank and Hal could entice you to leave the Sawx like Johnny Damon and come to work for the Evil Empire.
A: I appreciate the kind words. Are you trying to get me killed, or just fired?
Q: Why can’t the Chicago Cubs get into the World Series? Is it the small park? Low salaries? The curse of the billy goat? Does sabermetrics provide any insights?
A: Talking about the origins of it — the Cubs fell into a trench in history in the late 1930’s, when almost all baseball teams built farm systems, but the Cubs for several years refused to do so. This put them behind the curve, crippled them for the 1950’s, and really the organization did not fully overcome that until about 1980.
Since 1980 they have had several teams that could have wandered into a World Series, with better luck. They haven’t had any one overpowering team — like the 1984 Tigers, or the 1992 Blue Jays, or the 1998 Yankees — that was so good that it demanded a seat at the Last Banquet of Fall. And, unless you have a team that good, you’re at the mercy of the fates.
Q: Based on your statistical analysis, how do you feel about the Yankees’ young prospects, namely Chamberlain, Kennedy, and Hughes, making a huge (positive) difference for the Yankee pitching staff?
A: The same as I feel about our young pitching prospects with the Red Sox, really — Buchholz and Lester and Masterson. When you’re depending on young pitching, you’re vulnerable. Some of these guys are going to be very good, but probably not all of them, and there are going to be bumps in the road that will rattle your teeth.
Q: Has sabermetrics pretty much squeezed the last drop of new insights out of traditional counting statistics? If so, what data ought to be collected to improve our understanding of the game? If not, where can the boundaries be pushed?
A: We haven’t figured out anything yet. A hundred years from now, we won’t have begun to have the game figured out.
Q: On average, how many runs does Manny Ramirez’s defense cost the Red Sox? Is there some special adjustment that must be made to evaluate left fielder defensive stats in Fenway?
A: Defensive stats for a left fielder in Fenway are misleading, yes. The small area always makes our left fielder’s range appear more limited than it is.
Q: Generally, who should have a larger role in evaluating college and minor league players: scouts or stat guys?
A: Ninety-five percent scouts, five percent stats. The thing is that — with the exception of a very few players like Ryan Braun — college players are so far away from the major leagues that even the best of them will have to improve tremendously in order to survive as major league players — thus, the knowledge of who will improve is vastly more important than the knowledge of who is good. Stats can tell you who is good, but they’re almost 100 percent useless when it comes to who will improve.
In addition to that, college baseball is substantially different from pro baseball, because of the non-wooden bats and because of the scheduling of games. So … you have to pretty much let the scouts do that.
Q: Are there any baseball rules either in the game itself or for the leagues that you think ought to be changed, removed, or added to increase the entertainment value of the sport?
A: Many, actually. My pet project is a rule to limit pitching changes in the late innings. My rule, specifically, would be this:
1) Each team is entitled to one unrestricted pitching change per game.
2) With the exception of that one unrestricted change, no pitcher may be removed from the game in mid-inning unless he has been charged with allowing a run in that inning. With an exception for injuries, of course.
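For readers who like to see a rule written as logic, here is a minimal Python sketch of the two-part proposal above. The function name and arguments are hypothetical, and it assumes that changes made between innings are always allowed and do not use up the unrestricted change, a point the rule as stated leaves open.

```python
def review_pitching_change(mid_inning, charged_with_run, injured, free_change_left):
    """Return (allowed, free_change_left_afterwards) for one proposed change."""
    # Assumption (not in the rule as stated): changes between innings
    # are always allowed and cost nothing.
    if not mid_inning:
        return True, free_change_left
    # Part 2: a run charged to the pitcher this inning, or an injury,
    # permits a mid-inning removal.
    if charged_with_run or injured:
        return True, free_change_left
    # Part 1: otherwise the team must spend its one unrestricted change.
    if free_change_left:
        return True, False
    return False, False


# A mid-inning matchup change with no run charged spends the free change,
first = review_pitching_change(True, False, False, True)
# so a second change of the same kind is refused.
second = review_pitching_change(True, False, False, first[1])
print(first, second)
```

Under these assumptions a manager gets exactly one purely tactical mid-inning change per game; every other mid-inning change needs a run on the board or an injury.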
When you propose a rules change like that, people say, “Oh, you’re changing the way the game has always been.” That’s nonsense. In 1970 major league teams used 1.75 relievers per game. In 1990 they used 2.02 relievers per game, and in 2007 they used 2.97 per game — and the rate of increase in this area is still accelerating.
I’m not trying to change the game with this rule; I’m trying to stop a change in the game that is running amok. There are actually many rule changes like that which I would favor — rules designed to control changes in the game that are occurring, uncontrolled, at a breakneck pace.
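The claim that the increase is accelerating can be checked with simple arithmetic on the three figures quoted above. The sketch below is only an illustration: it uses those three data points and nothing else, and compares the average yearly change across the two intervals.

```python
# Relievers used per team per game, as quoted in the answer above.
relievers_per_game = {1970: 1.75, 1990: 2.02, 2007: 2.97}


def average_yearly_change(start_year, end_year):
    """Average change in relievers per game, per year, between two seasons."""
    delta = relievers_per_game[end_year] - relievers_per_game[start_year]
    return delta / (end_year - start_year)


early = average_yearly_change(1970, 1990)  # 0.27 over 20 years
late = average_yearly_change(1990, 2007)   # 0.95 over 17 years
print(f"1970-1990: {early:.4f} per year")
print(f"1990-2007: {late:.4f} per year")
print(f"ratio: {late / early:.1f}x")
```

On these numbers the average yearly increase after 1990 is roughly four times what it was in the two decades before, which is consistent with the point about acceleration.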
Q: Is sabermetrics the Freakonomic analysis of baseball?
A: There are parallels. What I do was heavily influenced by the University of Chicago economists of the 1960’s. I think Freakonomics comes from the same tradition.
Q: Are there certain trends over time in the game as a whole that would indicate some sort of logical “starting point” of the steroids era?
And with increased scrutiny coming upon the clubs and the players, have any of those statistical trends begun to flatten out or perhaps even reverse?
A: On the first issue, no; you can’t find the back border of steroid use in baseball. N.F.L. teams were heavily into steroid use by the mid-1960’s. It’s not reasonable to think that it didn’t hit baseball until 1993, and I think there are indications of steroid use in baseball by the early 1980’s. But there is no moment when it begins.
On the other issue, I was very skeptical how effective the efforts to get rid of the steroids would be — but there is absolutely no doubt now that we have made tremendous strides over the last two years. Just look at the players. They’ve gone back to being normal sized.
Q: Have you personally explored or contributed to any of the current statistical work being done for other sports? Do you think it is possible to quantify the close interaction between players that makes statistical analysis of these sports more difficult, or is it only baseball that provides the “perfect” setting of numerous isolated events?
A: Other sports are capable of analysis. Extraordinarily good work is being done in basketball. Football, like baseball, is a “static” game — a game that moves from pause to pause, from stop to stop. There is tremendous work being done in these sports, and much more will follow.
I personally have never discovered anything about any other sport that I did not later learn had already been discovered by 22 other people. But that’s just me.
Q: Fielding has long been the most difficult element of the game to quantify. What do you believe are the most reliable metrics to measure fielding ability and its effect on the outcome of a game? Are there any new up and coming measurements that particularly excite you?
How high do you think Andruw Jones ranks among the all-time great center fielders, looking only at defense?
A: As to Andruw — I don’t honestly know, but he’s certainly somewhere in the top ten. Mays, Flood, DiMaggio, Speaker, Garry Maddox — he’s up there with those guys.
For many years fielding was difficult to quantify. In the last five years, several different fielding analysis systems have converged on a common point in a way that leaves us with a good deal of confidence in our defensive analysis. John Dewan’s Fielding Bible is perhaps the easiest to understand of the sophisticated efforts to evaluate fielders.
Q: Do you think quantitative analysis in individual sports (like boxing or tennis) will ever reach the level it has in baseball? It seems to me that there is as much baloney involved in coverage and discussions of those sports as there is in baseball, but fans are not as equipped to see through it. I ask because I’ve always thought your primary mission is exposing that kind of thing in baseball (you eat baloney for breakfast).
A: Oh, we do horrible analysis sometimes. There will never be a shortage of B.S. What we do, essentially, is to pick up things that people say and ask “Is that true?” This can be done with regard to almost anything — any sport, including politics. The people who analyze politics on television say absolutely ridiculous things with a frequency that would make the laziest baseball announcer look like Socrates by comparison.
Q: Can you tell us a time when you did an analysis and expected one thing, but the numbers told you something radically different?
A: Well, it happens every day. My “debunking” of the importance of stolen bases came from extended efforts to prove the importance of stolen bases, all of which failed. I remember I used to think that players from California were over-scouted and over-drafted, because the amateurs out there play baseball year-round and mature early. It’s not true; the state fully justifies, and more than justifies, the draft picks invested out there.
Q: It seems like investing an additional $5 million or so in the draft each year is a wiser investment than spending it in the F.A. market most of the time. Why do so few teams do this?
A: Well, I remember once having this conversation with my father:
“Dad, do you know what your problem is?”
“No, son, what is my problem?”
“You’re just too poor to get rich.”
Some baseball teams are just operating too close to the margin to have the freedom to make long-term investments. They are, in a sense, too poor to get rich.
Q: I understand that baseball lends itself very well to statistical analysis. But why is there such a lack of objective statistical evaluation in other sports? I would think that N.F.L. teams would be beating down statisticians’ doors to help take the guesswork out of sinking millions into draft players that go nowhere.
A: There actually is a lot of sophisticated analysis of football, and has been for many years, but it has a different tradition than baseball analysis. Football analysis grew within the organizations, out of the film study done by coaches. Thus, the best analysis done in football has usually been proprietary to the teams, and outside the view of the public.
Q: Most major professional sports have a history of experimenting with changes of rules and technology in an effort to find some type of optimum of both entertainment and competitive balance. I’d argue baseball has largely resisted this trend, certain historical influences (the dead ball, steroids, etc.) notwithstanding.
A: That’s exactly right. Most successful sports tend to trim and snip their rules to keep the game interesting. Baseball people like to think their game is perfect, so we drag our flaws forward from generation to generation.
Q: Do you think we will ever see another 300-innings-pitched season from a starter? How could they do it in the past, but not now? Given a Phil Hughes-type pitcher, what is the best regimen he could be given now that could prepare him for 300 I.P. in the future?
A: There is absolutely no way you could train Phil Hughes to throw 300 innings in modern major league baseball. “Ever” is a long time, but I don’t see it. Many different changes in the game are working against that happening — for example, the length of the games, in minutes and hours, and the fact that there is more emphasis now on getting strikeouts.
Unexpected changes occur because the system breaks down at some point. But until it breaks, there are 30 different trends in motion which all have the effect of driving innings by top starting pitchers downward.
Q: Will we see a woman player in the majors in my lifetime? (I’m in my 30’s. And when I say woman player I’m thinking regular contributor as opposed to a one-time gimmick.)
A: Well, there is nothing happening now at lower levels that would tend to cause that as a higher-level outcome. You will certainly see many women General Managers in baseball (and basketball) within a few decades, because there are large numbers of capable women filling lower-level baseball operations positions. You will see women scouts and probably umpires.
But colleges don’t have women’s baseball teams, and high schools don’t. Ninety-nine percent of girls who like to play baseball have been driven to other sports by age 12. It’s hard to see how a woman can wind up in the major leagues under those conditions.
Q: Do you feel, given the right personnel, that some teams should try a four-man rotation? If not, why not? If so, which team do you think is best suited and why?
A: I think it is plausible that that could happen and could succeed. I would explain my feelings about it this way: that between 1975 and 1990, two changes were made to reduce the workload of starting pitchers in an effort to reduce injuries. First, we switched from a four-man to a five-man rotation. Second, we imposed pitch-count limits on starting pitchers, starting at about 140 and then gradually reducing that to about 110.
I think it is clear that at least one of those changes was unnecessary, and accomplished nothing. It is possible that both of them were unnecessary and accomplished nothing, but the better evidence is on the side of the pitch limits. I think it is possible, based on what I know, that the starting rotations could go back to four pitchers with no negative consequences.
Q: Is clutch hitting a repeatable/retain-able skill?
A: I don’t know.
Q: Shouldn’t in-game strategic decisions be made by a computer? Or, more to the point, isn’t there always a correct choice?
A: It is totally impossible to isolate the correct strategic choice in almost all real-life situations, for the simple reason that all real-life strategic situations involve dozens of variables, many of which have not been thoroughly tested by trial. People who think that they know when a manager should bunt and when a manager should pitch out and when a manager should make a pitching change are amateurs. People who have actually studied these issues know that the answer disappears in a cloud of untested variables.
Q: I enjoyed your article in Slate last week about judging when a basketball game is “over.” Surely you must have a similar metric for baseball?
My friends and I attend baseball games often, and our rule is this: when the lead is greater than the number of half-innings left, it’s “over” and there is no shame in leaving. I think this rule is less fail-safe than your basketball one, but at least it’s easy to remember and calculate. How do you decide when a lead in baseball is insurmountable?
A: I do not have a similar trick for baseball games, no, but thanks for your kind words about the Slate article. I remember one time, about 1982, the Kansas City Royals were getting pounded senseless in the early innings, but they scored two runs in the fourth inning. Our announcer, Fred White, began a sentence “Well, if you want to dream a little bit, if we can just score two runs an inning … well, no, wait a minute. We’d have to have some threes and fours in there somewhere.” When you get into that position, the game is pretty much over.
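The questioner’s rule of thumb, and the arithmetic behind Fred White’s mid-sentence correction, are both easy to write down. The following is a minimal sketch: the function names are hypothetical, and the 12-run deficit with five innings left is a made-up example, not the score of the game in the story.

```python
def game_is_over(lead, half_innings_left):
    """The questioner's rule: 'over' once the lead exceeds the half-innings left."""
    return lead > half_innings_left


def runs_per_inning_to_tie(deficit, innings_left):
    """Average runs the trailing team needs in each remaining inning to tie."""
    if innings_left <= 0:
        raise ValueError("no innings left to bat")
    return deficit / innings_left


# Hypothetical example: down 12 with five innings left to bat.
needed = runs_per_inning_to_tie(12, 5)
print(game_is_over(6, 4), game_is_over(3, 6), needed)
```

In the hypothetical case the trailing team needs 2.4 runs an inning, so two an inning is not enough and there would have to be “some threes and fours in there somewhere.”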
Q: What new statistic are M.L.B. clubs using now with regularity that they didn’t use two years ago? What will be your answer in two years?
A: The pitch-by-pitch data (the PITCHf/x and similar data from Baseball Info Solutions) gives us dramatically better detail about what pitches pitchers are throwing, how often, and how effectively. It will take us twenty years to figure out what some of this stuff means, but it is clearly generating a lot of excitement.
Q: I know that a large issue in baseball is determining the quality of defense, especially at the individual level. Concerning basketball, do you have any insight in determining both the quality of team and individual defense?
A: The interesting question is why defense is so much more difficult to quantify than offense in all sports. Perhaps defense by its nature involves more interaction between individuals than individual actions, and perhaps the way to get past that is to embrace the concept and measure combinations of players.
Q: I’m a life-long Cub fan. As a kid in the sixties, I really disliked Ron Santo because it seemed to me that when he was at bat in an important game situation, he struck out. When the game wasn’t on the line at all, that’s when he got his home runs (I was too stupid to appreciate his defensive prowess). Is there a stat that measures the clutch hitting of a player? Was my perception of Santo correct, or did he actually hit well in the clutch?
A: At Bill James Online we have a definition of a clutch at bat, based on a series of indicators — the score, the inning, the number of men on base, the number of outs, the opponent, where you are in the pennant race, etc. We try to add up all of those factors and identify the “most clutch” at bats.
Since play-by-play data from Santo’s era is now available (due to the work of Retrosheet volunteers), we will soon be in position to give a meaningful and objective answer to your question.
Until then, what I can tell you is that Santo hit .287 in his career with men on base, as opposed to .269 with the bases empty. He homered as often with men on base as with the bases empty. He did tend to fade late in the season, perhaps because he was playing every day, or perhaps related to his diabetes.
Q: What statistical software do you use?
A: Just Excel.
Q: I’ve played a fair amount of baseball in my day, but I’m more avid as a golfer. And I know that no matter how much I practice, there are some days when I have “it” and some when I don’t. Which is why I’m frustrated/confused when a manager replaces a pitcher who has been doing well in mid-inning just to get the “correct” right-hand/left-hand pitcher-batter match up. Are there any stats that prove right/left handedness of a pitcher-batter match up to be significant?
A: Over time, every hitter will hit better when he has the platoon advantage than when he does not. There may be an exception, maybe two exceptions. You see a lot of reverse splits or backwards splits in one-year data — lefties hitting better against lefties, etc. Over time, at least 99 percent of hitters are going to hit better when they have the edge, and certainly the difference is significant.
Q: Billy Beane, G.M. for the Oakland A’s, has made sabermetric stats a major part of his “value” philosophy when building a baseball team. He’s frequently said that his method will build regular season winners but it doesn’t seem to work in the playoffs. Do you think that this is simply a result of a small sample size or the wrong statistics being used, or is it something more fundamental about “unmeasurable” statistics, like the ability to perform under pressure and “heart?”
A: Oh, I thought people had stopped asking that. Blast from the past there. Look, there’s a lot of luck in winning in post-season. You’re up against a really good team, by definition, and you’ve only got a few days to get it right. It takes some luck.
Are there also types of players and factors that are helpful in that situation? Of course. It’s like asking a physics professor whether there is a God. Scientists don’t know anything more about whether there is a God than morons do, because it’s not a scientific issue. This isn’t something I can measure. It’s a matter of faith.
Q: It seems that in the field of sabermetrics, there is much more focus on their use in evaluating offensive rather than defensive production; is the reason behind this the fact that the measurement of defensive valuation is less complete in your field? One example is the Seattle Mariners’ continued use of Raul Ibanez in left field. He is a defensive liability.
A: In the 1870’s/1880’s, when the scoring system for baseball games was developed, the statistics invented for batters were well designed and specific, and as such they naturally evolved toward better and better results. The statistics invented for fielders were so awkward and sketchy that they weren’t really very useful, and therefore they never advanced. The official fielding stats today are basically the same as they were in 1885.
Since the beginning of sabermetrics in the 1970’s, vastly more effort has been put into studying fielding than was ever put into studying hitting. Until three or four years ago, not too much came out of that.
Three or four years ago, all of a sudden, a series of different sabermetric methods for evaluating fielders all began to converge on a common set of answers. If it was a basketball game between hitting stats and fielding stats, fielding stats used to be behind like 61-13, and now they’re behind like 64-47. It may be that not everybody has figured that out yet. But it’s no longer true that our ability to evaluate hitters is dramatically better than our ability to evaluate fielders, at least at the major league level.
Q: In baseball, and maybe in life, real change and real innovation comes only as a result of crisis or flux driven by external pressures. But baseball is awash in money, both players and owners seem relatively happy, and fans and their governments are heavily invested in M.L.B. as it currently exists through taxpayer funding of stadiums. Attendance is at or near historic highs.
I believe you’ve said (and I’m paraphrasing) that a sport that never changes quickly becomes boring and irrelevant. Do you see opportunities for the game to grow and change in the near future given its current state?
A: It’s not true that real change occurs only as a result of crisis or external pressure. Change occurs for those reasons, and for others. Innovation occurs in leisure at least as often as it does in panic.
Baseball changes enormously from decade to decade. This has always been true. There is something about the game that enables it to absorb changes and yet remain the same on some level.
It seems to me, intuitively, that the pace of change in the last ten to fifteen years is rather a rapid one, perhaps more rapid than is in our best interest. So much is different now — the inter-league play, the pitching strategies, the power in the game (home run power), the way that fans root (for players rather than for teams) — it’s just very different. Change is good. I hope we’re not changing so fast that we leave too many fans behind.
Q: Will the Indians ever win the World Series?
A: Absolutely. In my lifetime. They will win because they are worthy of victory.
Q: I have been reading The Bill James Goldmine, and I keep finding myself going back to the same question. By your own admission, much of the data is raw, and you indicate no real sense of its application; yet you’ve published it. In the past, your legions of readers have taken your published materials and built off of it — expanding the application of the data and discovering new useful insights. With the publication of The Goldmine, were you hoping that the same thing would happen?
A: I am certain that it will happen. People take information and build knowledge. When you give them new information they will create new knowledge, absolutely and without question.
Q: How important are good-hitting pitchers to the success of an offense in the N.L.?
A: Exactly as important as good-fitting underwear on a long drive.
Q: Has looking at the numbers prevented you from actually just enjoying a summer day at the ballpark? Have we all forgotten the randomness of human ballplayers? By reducing players to just their numbers can we lose sight of the intangibles such as teamwork, friendships, and desire?
A: Does looking at pretty women prevent one from experiencing love? Life is complicated. Your efforts to compartmentalize it are lame and useless.
Q: I’ve heard it said that “there is no such thing as a pitching prospect,” mainly because of the unpredictable nature of injuries in young pitchers. Can statistical analysis be applied to prevent (or at least minimize the chances of) injuries to these players?
A: We hope.
Q: Bill, how are you going to spend your time if/when you get tired of baseball analysis? Or asked another way … what’s the next challenge for you?
A: I’m pretty sure that if I was going to get tired of baseball it would have happened by now. But I am writing a book about famous crimes, if that answers your other question.
Q: Having been able to share your knowledge and opinion freely for so many years, how do you feel about restrictions that must now exist on what you say by working for the Sox? Do you feel part of the Red Sox family?
A: There are times when I wish I could speak a little more freely, but honestly, the same is true for most everybody, isn’t it? Peter Gammons says (I hope I’m not misquoting him) that 80 or 90 percent of what he hears is off the record for one reason or another. One must suppose that he wishes he could tell on us. But working for the Red Sox is very rewarding, and I hope the trade-off is worthwhile to my readers as well as to me. It’s certainly worthwhile to me.
Q: As advanced statistical analysis becomes more of a norm, will a general consensus emerge on the correct value of a player, or will teams have unique measures and calculations to find undervalued qualities (like O.B.P. has been)? How will player scouting change to fit this progression?
A: There will always be people who are ahead of the curve and people who are behind the curve. There will never be a shortage of stupidity. There will always be an advantage to being on or near the leading edge of the research.
Q: What do you feel is the biggest threat to the future of the popularity of baseball? And what do you see as the best opportunity for growth for the popularity of the game that M.L.B., the owners, etc. are not taking advantage of or pursuing?
A: The biggest problem (or threat) that we face is the poor state of amateur baseball for very young kids. Somehow, we’ve allowed highly competitive attitudes to seep down to six-year-olds and seven-year-olds, so that kids at very young ages are being taught to play the game “right” before they learn to love the game. It makes baseball seem like school — “I’ve got to do this right to please the coach.” We’re turning off millions of kids in a failing and misguided effort to accelerate the development of skills. Somehow, we’ve got to flip that back the other way, so that kids can learn to love playing the game.
Q: What unanswered questions (either baseball-related or not) are you thinking about right now?
A: Why does American society always perceive itself as becoming constantly more and more dangerous — and thus devote ever more and more effort to increasing security — even though almost all measurable dangers, including crime rates, have been falling throughout most of my lifetime? And … is this a good thing?
Q: Who is playing you in the movie version of Moneyball that’s in the works?
A: Meryl Streep. She’s having a little trouble with the accent.
Q: Do you play fantasy baseball?
A: Not at the moment. I have, though. I think the Commissioner’s office frowns on front office guys having fantasy teams. It creates the appearance of a conflict of interest, and, even though it’s a trivial conflict, one still has to respect that somebody might get the wrong idea.
Q: Does it typically make more sense for a team to draft the best available player they can, or to try and fill an organizational need?
A: Well, it’s sort of like — do you drive on the best road, or do you drive on the road that goes directly where you want to go? You drive on the best road that goes somewhere near to where you want to go for as long as you can, and then in the last five rounds you do what you have to do.
Q: Big fan since I started playing fantasy baseball in the 1980’s. 1) Do you attend the actual games and if so, are you able to separate all the numbers from just enjoying the “scene” (the crowd, the noise, the moment). 2) Did you play baseball as a kid and if so, were you into the numbers then as much as now or did your interest in the numbers come later? 3) Who was your favorite athlete when you were growing up?
A: 1) I attend many, many games. I enjoy attending games, but I don’t cease to be who I am when I take my seat at the game. I see the game based on what I know, as you do and as everyone else does.
2) I did play baseball, but I wasn’t good. And honestly … I’m not interested in the numbers. Never was. That’s your perception of what I do; it’s not mine.
3) I was growing up for a long time. Wilt Chamberlain. Minnie Minoso. Jim Kaat. Gale Sayers. Len Dawson. JoJo White. Norm Siebern. Catfish Hunter.
Q: Should M.L.B. subsidize the use of wood bats in college baseball?
A: I’ll vote for it. I think the biggest problem is: where do you draw the line? There are 900-some college baseball teams, let’s say about 20,000 college baseball players. How many bats per player are we going to subsidize? Do we subsidize UC-Riverside and Egbert State the same? Also, I’m not sure there are that many trees.
Q: What would you do if you were named Commissioner of M.L.B.? What would you do now, if you were able to run the Hall of Fame?
A: On the first question, I’d shorten the schedule to get the season over with before it turns so bitterly cold in the North. On the second … I’d thank the writers for their service, and concentrate on developing a system that has checks and balances.
Q: Who are ten players in the Hall Of Fame that do not deserve to be there?
A: Fred Lindstrom, Jesse Haines, Tommy McCarthy, Lloyd Waner, George Kelly, Ross Youngs, Roger Bresnahan, Earle Combs, Jim Bottomley, and Chick Hafey.