Found this article from back in the spring about lineup construction:
by Dave Studeman
March 02, 2006
It's the latest craze on the Internet: constructing lineups (at least it was, until Barry Bonds dressed up as Paula Abdul). It all started a couple of weeks ago when Cyril Morong posted a regression analysis of how much weight to give On-Base Percentage (OBP) and Slugging Average (SLG) for each lineup position. Pretty quickly, everyone started using Cyril's analysis to construct lineups for their favorite teams.
The Pastime did it for Oakland, which inspired Ken Arneson to write a script to implement Cyril's findings, which inspired David Pinto to incorporate a lineup optimization tool on his blog, which further inspired many other blogs to apply the logic to their teams. Dan Scotto wrote a nice summary of the insights gained from Cyril's analysis.
Meanwhile, I was going around like a curmudgeon, telling people that a static regression model really shouldn't be used to construct something as dynamic as a lineup. I referenced Tom Ruane's excellent article on Retrosheet, which uses something called Markov Chains to evaluate the ideal lineup and concludes that lineup composition just doesn't much matter.
That seems hard to believe, doesn't it? It really doesn't matter if the pitcher bats first or ninth? I don't buy it, either. Plus, it's fun to talk about lineups. They give us baseball fans something to tinker with, and they provide important clues regarding how the manager thinks. I'd guess that many baseball fans love to debate their favorite team's lineup.
Luckily, I received a package Monday that contained a book called The Book. I've been waiting for The Book for about two years, ever since I first heard that Tangotiger (or Tom M. Tango) and MGL (Mickey, Mitchel, UZR-guy) were planning to write a book together. They pulled Andy Dolphin into the effort, and the three of them took their sweet time writing it. The wait was worth it. They have written a book that every baseball manager and general manager should read, perhaps the best book of its kind since The Hidden Game of Baseball. And they included a chapter on lineup construction.
I'm not quite done reading The Book, and I'll have to re-read several sections a few times. But I paid particular attention to lineup construction, and I thought I'd share some of The Book with you. The Book is filled with concise, logical analyses that culminate in strategic guidelines, called "The Book Says." Here's the most important strategic guideline for lineup construction:
The Book Says:
Your three best hitters should bat somewhere in the #1, #2 and #4 slots. Your fourth- and fifth-best hitters should occupy the #3 and #5 slots. The #1 and #2 slots will have players with more walks than those in the #4 and #5 slots. From slot #6 through #9, put the players in descending order of quality.
I'm only scratching the surface of the lineup chapter with this quote, but there's obviously enough meat here to fill 10 articles. Don't worry; I'm only going to write one today. For this article, let's apply The Book's guideline to some real teams, using Baseball Prospectus' statistical projections for next year.
I'll start with Oakland, since they started this whole thing. Just to make things easy, I'll assume that Jay Payton bats instead of Frank Thomas. Here's what I came up with:
Name      BA    OBP   SLG   Bats
Bradley   .279  .355  .447  S
Johnson   .272  .353  .462  L
Crosby    .269  .346  .451  R
Chavez    .271  .354  .479  L
Swisher   .252  .347  .453  S
Ellis     .283  .351  .426  R
Kotsay    .277  .332  .414  L
Payton    .267  .312  .415  R
Kendall   .270  .333  .338  R
Dan Johnson in the second position doesn't compute, does it? Perhaps the most important thing The Book tells us is that we should put our stereotypes of leadoff and #2 hitters aside.
First, the guys in the first two slots bat most often during the year; why waste those appearances on below-average hitters, or even average ones?
Secondly, The Book's key analysis was an assessment of the potential run value of each batting event in a lineup. They found that hits by the leadoff and second batters will typically generate more runs than hits from any other lineup position (other than cleanup). Hard to believe? I think most fans underappreciate the importance of power in these first two positions. These guys are only guaranteed to start an inning once, the first inning. Many other times, particularly in the American League, they will bat with runners on base.
In a nutshell, the first two positions bat most often and their hits create more runs than those in most other positions. This is why The Book recommends that you place two of your three best hitters in the first two lineup positions. Through a simple OPS rating, Chavez, Bradley and Johnson are projected to be Oakland's three best hitters.
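To make the guideline concrete, here is a minimal Python sketch that ranks the nine Oakland hitters by OPS and drops them into slots. It is only one possible reading of the quoted rule, not The Book's method. The function names and the two tie-breaks are my assumptions: among the top three, the highest SLG bats cleanup, and the higher OBP of the other two leads off (OBP stands in for walks, which the projections above don't list). The fifth-best hitter goes third and the fourth-best goes fifth.

```python
# Projected lines from the Oakland table, listed alphabetically so the
# input order gives nothing away: (name, BA, OBP, SLG).
ATHLETICS = [
    ("Bradley", .279, .355, .447),
    ("Chavez",  .271, .354, .479),
    ("Crosby",  .269, .346, .451),
    ("Ellis",   .283, .351, .426),
    ("Johnson", .272, .353, .462),
    ("Kendall", .270, .333, .338),
    ("Kotsay",  .277, .332, .414),
    ("Payton",  .267, .312, .415),
    ("Swisher", .252, .347, .453),
]


def ops(player):
    """On-base plus slugging, the simple rating used in the article."""
    _, _, obp, slg = player
    return round(obp + slg, 3)


def book_lineup(players):
    """Order nine hitters by one reading of the quoted guideline.

    Assumed tie-breaks (mine, not The Book's): of the three best hitters,
    the top SLG bats fourth and the higher OBP of the other two bats first.
    """
    ranked = sorted(players, key=ops, reverse=True)
    top3, fourth_best, fifth_best, rest = ranked[:3], ranked[3], ranked[4], ranked[5:]

    cleanup = max(top3, key=lambda p: p[3])          # most power bats fourth
    setters = sorted((p for p in top3 if p is not cleanup),
                     key=lambda p: p[2], reverse=True)  # higher OBP leads off

    # Slots 1-5, then slots 6-9 in descending order of OPS.
    return [setters[0], setters[1], fifth_best, cleanup, fourth_best] + rest


for slot, player in enumerate(book_lineup(ATHLETICS), start=1):
    print(f"#{slot} {player[0]:<8} OPS {ops(player):.3f}")
```

With these assumptions the sketch returns Bradley, Johnson, Crosby, Chavez, Swisher, Ellis, Kotsay, Payton, Kendall, the same order as the table. A different tie-break among the top three would shuffle the first, second and fourth slots.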
If you're an A's fan, you probably think Mark Ellis is going to hit better than .283/.351/.426. If so, putting him in the leadoff position might work. In fact, the A's have such a balanced lineup that it's very hard to construct a "wrong" lineup. Perhaps the most important thing manager Ken Macha should do is make sure that no two left-handed batters, and no two right-handed batters, occupy back-to-back lineup slots. By avoiding consecutive players batting from the same side, he will have a strategic advantage late in the game against opposing relief specialists.
The Los Angeles Angels of Anaheim have a very unbalanced attack, featuring Vladimir Guerrero and a bunch of hopefuls. Last year, Vlad batted third about two-thirds of the time and fourth the rest of the time. Here's what The Book suggests:
Name      BA    OBP   SLG   Bats
Rivera    .277  .328  .432  R
Anderson  .283  .314  .450  L
Kennedy   .271  .332  .389  L
Guerrero  .314  .376  .546  R
Kotchman  .270  .328  .398  L
Figgins   .274  .334  .383  S
Erstad    .264  .314  .364  L
Cabrera   .262  .307  .368  R
Molina    .227  .273  .339  R
According to The Book, Vlad should bat cleanup. In fact, on the surface, The Book suggests batting Adam Kennedy third! I may be applying The Book too literally here, but there is some method in the madness.
In the 1988 Baseball Abstract, Bill James found that teams score the most runs in the first inning and the fewest runs in the second. This makes sense when you think about it, because lineups are structured to score the most when the leadoff batter bats first. But he also found that the overall average of the two innings was less than the average of every other inning. In other words, the typical lineup was overemphasizing the first inning at the expense of the second inning.
One of the problems is that teams often put their highest OBP batter in the third position, but the #3 spot is the one LEAST likely to lead off the second inning. James said it, others agreed, and The Book confirms it. In addition, The Book found that the #3 hitter has more plate appearances with two out and nobody on. So the run value of every hit (except the home run) is lower in the third position than in any other of the top five positions. That's why they recommend putting your fifth-best hitter in the three spot. Whether or not you believe that, the Angels should bat Vlad fourth.
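The point about the second inning comes down to simple counting, which a few lines of Python can show. This is my own illustration, not a calculation from The Book or from James, and the function name is made up. It assumes only that the leadoff man opens the first inning and that an inning sends at least three batters to the plate.

```python
def second_inning_leadoff(first_inning_batters):
    """Lineup slot (1-9) due up first in the second inning."""
    if first_inning_batters < 3:
        raise ValueError("an inning needs at least three batters")
    # After n batters, the next man up is slot n + 1, wrapping past #9.
    return first_inning_batters % 9 + 1


# Fewest first-inning batters that bring each slot up to start the second.
fewest_needed = {}
for batters in range(3, 12):
    fewest_needed[second_inning_leadoff(batters)] = batters

for slot in range(1, 10):
    print(f"#{slot} leads off the second after "
          f"{fewest_needed[slot]} first-inning batters")
```

The #4 hitter starts the second inning after a three-batter first, while the #3 hitter only does so after the team sends 11 men to the plate. Since an 11-batter first inning is far rarer than a 3- or 4-batter one, the #3 slot comes out last on this count.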
You may have also noticed that I have Chone Figgins batting sixth instead of leading off. Given the way the Angels approach offense (singles, baserunning and Vlad), this might not make sense. In fact, the Angels have such a skewed distribution of talent that several of The Book's guidelines should probably be adjusted for them. But batting your top basestealer in the #6 spot makes sense in a lot of cases. Consider the New York Mets.
The Mets will probably have four great hitters in their lineup this year (Delgado, Wright, Beltran and Floyd), if they stay healthy, and these guys should obviously be placed in four of the top five slots. Their fifth-best hitter will probably be the Nady/Diaz platoon in right field. Their next-best hitters are projected to be Reyes, Lo Duca and Matsui (or whoever plays second). As you can see, that leaves Reyes, who led the NL in stolen bases last year, in the sixth position, which is probably the best place for him.
First of all, Reyes will almost certainly have a lousy OBP this year, no matter how much he works at it. Secondly, a basestealer for the Mets will have more value batting sixth instead of first. Why? According to The Book, there are a couple of reasons:
A stolen base has the most value when it's done in front of singles hitters who don't strike out too much. Lo Duca and Matsui may be the two most prolific singles hitters the Mets have in 2006, and Lo Duca doesn't whiff very often.
A caught stealing does much more damage with a Carlos Delgado or David Wright at the plate than a Paul Lo Duca or Kaz Matsui.
The logic seems overwhelming to me. Bat Reyes sixth.
Let's return to the American League West one more time and look at one more contending team, the Texas Rangers:
Name       BA    OBP   SLG   Bats
Dellucci   .261  .363  .495  L
Blalock    .282  .348  .510  L
Young      .306  .355  .471  R
Teixeira   .289  .371  .561  S
Wilkerson  .263  .362  .473  L
Mench      .278  .341  .480  R
Nevin      .270  .325  .456  R
Kinsler    .270  .328  .451  R
Barajas    .249  .293  .434  R
Mark Teixeira should bat fourth, not third, for the same reason Vlad should. Brad Wilkerson, projected to bat leadoff, fits best into the fifth slot, though he's not out of place as a leadoff hitter either. This lineup makes sense except for one thing: there isn't enough balance between left-handed and right-handed batters. When you have David Dellucci and Hank Blalock batting back-to-back, you leave yourself open to a two-batter LOOGY. LOOGY stands for Left-handed One Out GuY, which is a misnomer when he can stay in the game to face two batters.
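The balance check is easy to automate. The sketch below is my own, with a made-up function name, and uses the Bats column from the Rangers table. It lists every pair of neighbors who hit from the same side. I am assuming that switch hitters never count against balance and that the order wraps from #9 back to #1.

```python
# (name, bats) in the batting order from the Rangers table.
RANGERS = [
    ("Dellucci", "L"), ("Blalock", "L"), ("Young", "R"),
    ("Teixeira", "S"), ("Wilkerson", "L"), ("Mench", "R"),
    ("Nevin", "R"), ("Kinsler", "R"), ("Barajas", "R"),
]


def same_side_pairs(lineup):
    """Return (batter, next batter, side) for same-handed neighbors.

    Assumptions: switch hitters ("S") never form a pair, and the
    #9 hitter is treated as batting directly ahead of the #1 hitter.
    """
    pairs = []
    for i, (name, side) in enumerate(lineup):
        next_name, next_side = lineup[(i + 1) % len(lineup)]
        if side == next_side and side != "S":
            pairs.append((name, next_name, side))
    return pairs


for first, second, side in same_side_pairs(RANGERS):
    print(f"{first} and {second} both bat {side}")
```

On this lineup it flags Dellucci and Blalock, the pair discussed above, and also the run of right-handers from Mench through Barajas at the bottom of the order.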
One other thing of note: the #3 hitter typically has the most plate appearances with a runner on first and the hole between the second baseman and the first baseman open, so you would like a left-handed batter in the three spot if at all possible. As a result, you might want to switch Blalock and Young in the Rangers' order, grudgingly.
The Book endorses another off-beat strategy: the second leadoff hitter. Here is what The Book Says:
The second leadoff hitter theory exists. You can put your pitcher in the eighth slot and gain a couple of extra runs per year.
You gain more by having a good hitter bat directly before your top hitters than you lose by giving your pitcher a few more plate appearances each year. I'm not talking about Jason Marquis or Dontrelle Willis. I'm talking about your bad-hitting pitchers. Move them up a spot. In fact, this strategic guideline argues AGAINST moving Marquis and Willis up in the order.
A couple of extra runs doesn't sound like a lot, but if you follow these guidelines, you could gain 10-15 runs over a full season. About a win a year. And it wouldn't cost you anything except grief from your local media.
At the beginning of this article, I mentioned Cyril Morong's analysis and all the subsequent attempts at lineup construction. Did you read Dan Scotto's review of Morong's analysis? If not, you may want to. Dan's summary and The Book are actually very much in sync. They both emphasize the importance of good hitters up front, and they deemphasize the strategy of batting your best hitter third. What's more, they both endorse the "second leadoff hitter" strategy. I'll admit that I was skeptical, but The Book validates many of Dan's points.
When two entirely separate approaches arrive at similar conclusions, people should listen. Think they will?