The first sabermetrician

11-03-2006, 04:48 PM
I came across this article from '04 that I had saved and thought it might be interesting to anyone who missed it.

Looking Beyond Batting Average

Published: August 1, 2004


Branch Rickey, a baseball executive just eight years removed from signing Jackie Robinson, called it "the most constructive thing to come into baseball in my memory."

Fifty years ago this week, a 10-page spread in Life magazine, then the nation's most widely read periodical, introduced America to the science of baseball statistics. Readers opened their Aug. 2, 1954, issue to a sprawling feature titled "Goodby to Some Old Baseball Ideas," with Rickey standing professorially at a faux blackboard, pointing at his heretofore secret equation, which, he wrote, "reveals some new and startling truths about the nature of the game."

That vertiginous equation was as alarming as it was befuddling to an audience quite content with batting average, thank you. But as preposterous as it still appears on its face, the formula, and the long article Rickey wrote to explain it, actually used concepts that some modern major league clubs are still learning to appreciate.

Getting his first look at the equation last month, Yankees General Manager Brian Cashman said: "Wow! The guy was generations ahead of his time."

The formula is easier to parse than it looks at first glance. It assesses a team's overall strength by noting the difference between its offense (the first half of the equation) and its pitching and defense (the second half). The first group of terms, upon closer inspection, are shockingly similar to methods used today.

The first term is what we now call on-base percentage. The second, which Rickey called isolated power, is a modification of slugging percentage. The third measures how often runners score per time they reach base. Batting average is nowhere to be found. So, as baseball traditionalists cringe at today's most popular "new" metric among the stat-inclined - on-base plus slugging percentage (OPS) - its roots stretch back to Branch Rickey.
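
The three offensive terms described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article's text does not reproduce Rickey's actual equation, so the functions below use simplified textbook definitions (on-base percentage without sacrifice flies, isolated power as extra bases per at-bat with no special weighting), and the team totals are invented for the example.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats):
    """Times on base divided by at-bats plus walks plus hit-by-pitch."""
    reached = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch
    return reached / chances


def isolated_power(total_bases, hits, at_bats):
    """Extra bases per at-bat (slugging percentage minus batting average)."""
    return (total_bases - hits) / at_bats


def scoring_rate(runs, hits, walks, hit_by_pitch):
    """Runs scored per time a batter reached base."""
    return runs / (hits + walks + hit_by_pitch)


# Hypothetical season totals, made up purely for illustration.
team = {
    "at_bats": 5000,
    "hits": 1300,
    "walks": 500,
    "hit_by_pitch": 30,
    "total_bases": 2000,
    "runs": 700,
}

obp = on_base_percentage(team["hits"], team["walks"], team["hit_by_pitch"], team["at_bats"])
iso = isolated_power(team["total_bases"], team["hits"], team["at_bats"])
rate = scoring_rate(team["runs"], team["hits"], team["walks"], team["hit_by_pitch"])

print(f"on-base: {obp:.3f}  isolated power: {iso:.3f}  scoring rate: {rate:.3f}")
```

Note that batting average never appears as an input on its own, which is the point the article makes about the offensive half of the formula.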

The second half of the equation evaluated pitching by opponents' batting average, walk percentage and more esoteric devices. But Rickey spent most of his (undoubtedly ghost-written) article trying to wean readers off rating hitters by tried-and-true batting averages and runs batted in.

"If the baseball world is to accept this new system of analyzing the game - and eventually it will - it must first give up preconceived ideas," wrote Rickey, at the time the top executive of the Pittsburgh Pirates. He continued, "Two measurable factors - on-base average and power - gauge the overall offensive worth of an individual."

Chances are that the formula did not originate with Rickey but with his right-hand statistics man, Allan Roth. (Yes, although traditionalists shudder at how modern major league teams are hiring stat gurus, Rickey did so as early as the 1930's while running the St. Louis Browns.) Roth, hired in 1947 by Rickey while he ran the Brooklyn Dodgers, sat in the stands and kept his own specialized statistics: performance against left-handed and right-handed pitchers, with runners on base, by ball-strike count and more - all to discover any edge Brooklyn could exploit.

"Baseball is a game of percentages," said Roth, a Montreal native who went on to a long career figuring numbers for NBC's "Game of the Week." "I try to find the right percentage."

As for the formula that later appeared in Life, it did little within the baseball industry but underscore Rickey's considerable ego. (He was known for his many maxims - including "Luck is the residue of design" - but another might as well have been, "Don't let anyone guess how smart you are when you can simply inform them.")

The longtime journalist Leonard Koppett, then a young writer for The New York Herald Tribune, told me not long before his death last summer: "Ah, the famous equation. There was no response. No one followed it."

But for some other readers of Life, the 10-page spread introduced them to a new way of looking at baseball. Most of them were young, mathematically inclined boys who had played in the sandbox of baseball statistics and welcomed the idea of something more sophisticated.

One of those was an incoming Duke University senior named Tal Smith. He had suspected that batting average was overrated and that other contributions, like walks and extra-base hits, were more valuable. Reading Rickey was a revelation.

"I'd never seen anyone refer to the game in this way; it had always been romanticized, not analyzed," Smith recalled. "I devoured it. It was an advanced course in a subject I was already interested in."

Smith later began his career in baseball front offices and soon became a pioneer in the use of statistics, particularly in salary arbitration cases. He is now president of the Houston Astros. As with so many revolutions, the open-minded youth of one generation became the decision makers of the next.

Rickey suspected as much as he preached from his pulpit in Life. He didn't expect his stodgy industry to follow his lead, but he foresaw the day when what we now call OPS would move from cult to currency.

"They will accept this new interpretation of baseball statistics eventually," Rickey concluded. "They are bound to."

11-03-2006, 07:31 PM
I vote for Henry Chadwick (http://en.wikipedia.org/wiki/Henry_Chadwick).

11-03-2006, 07:43 PM
F.C. Lane.

11-03-2006, 07:52 PM
F.C. Lane.


The Base on Balls

Why Should the Records Ignore This Powerful Factor in Brainy Baseball?

By F. C. Lane

(This was originally published in Baseball Magazine, March 1917. Posted by Cyril Morong. Special thanks to Pete Palmer for getting a good copy of the original from The Baseball Hall of Fame and sending it to me.)


In 1877[1] those potent intellects which govern baseball records decided to do something handsome for that orphan skill of the dope sheets, the base on balls. Swayed by the noble impulse of generosity they gave their benevolent feelings full sway and decided that henceforth a base on balls should be a hit, fully equal to a vicious triple or[2] a fence crashing home run. There followed a frolicsome period for the batting kings, a period where the man who couldn’t bat three hundred was a chump, the four hundred hitter merely good, and the chief swatter of the bunch, Tip O’Neill, approached the fabulous mark of .500, a record which has never been equalled.

Alarmed by rocketing averages and finding that the dope was becoming fairly glutted the same potent intellects who had been responsible for this wild orgy of batting reversed their august decision and declared that a base on balls was of no account, generally worthless and henceforth even forever should not redound to the credit of the batter who was responsible for such free transportation to first base.

The magnates of that far distant date evidently had never heard of such a thing as a happy medium. The fact having gradually penetrated their well constructed skulls that a base on balls was worth something they immediately rushed in where angels feared to tread, and decreed that it should take rank with all the other hits, including the extra base wallop. Having discovered, in the course of time, that this act was rather rash they at once scrambled back to the zone of safety and refused to give the unfortunate base on balls any notice whatever. “Whole hog or none” was the noble slogan of the magnates of ’87. Having tried the “whole” they decreed the “none” and “none” it has been ever since.

The base on balls is indeed an outcast and a stranger in the records. The most the scorers do for the homeless wanderer is to ignore it utterly. The batter gets no credit for getting a base on balls either through his wits or through respect for his batting powers. But magnanimously, the fact that he is given a pass doesn’t react against him. He isn’t fined or anything like that. His voyage to first base merely doesn’t appear at all, isn’t called a time at bat, plays no part whatever either for or against his batting average.

“The easiest way” might be adopted as the motto in baseball. It was simpler to say a base on balls was valueless than to find out what its value was. The latter process involved some thought and work and usually those who have had the matter in charge have been unable to do the one and unwilling to do the other.

There may have been a time when wildness on the part of the pitcher was the main cause of a base on balls. But that date, if it ever existed, has gone forever. The requirements of the modern game demand almost perfect control. When Larry Cheney of the pennant winning Brooklyns was shunted from the Cubs he complained bitterly of this very policy. “It makes no difference how much stuff a pitcher has,” said Larry, “or how hard it is for the batters to hit him. If he gives a base on balls, yank him out of the box, and if he gives a number, fire him.”

“The strain of pitching nowadays is much greater than it was years ago,” said Dutch Leonard of the Champion Red Sox, “and it gets worse every year. I never saw the old timers pitch but I have looked into the records, and I know. Nowadays if a pitcher weakens to the extent of giving a base on balls the manager is right on his toes and if he pitches a few extra balls it is curtains for him. Pitchers don’t get knocked out of the box anymore. They don’t get a chance.”

In short, the whole progress of modern baseball has tended to eliminate wildness on the part of the pitcher. Wildness, to be sure, hasn’t entirely disappeared and never will. But it is a vanishing fraction, it grows less every year in the long run, and it has ceased to be the main factor in the base on balls.

In 1916, Grover Alexander, a pitcher who has good control, gave 50 bases on balls. He took part in 48 games. How often is it good policy for a pitcher to pass a dangerous batter with men on bases? Alexander, redoubtable twirler that he is, would fear batsmen less than most but Alex is crafty and takes no unnecessary chances. No one knows how many times he passed a batter intentionally, but it wouldn’t be beyond reason to account for a large number of his fifty passes in this manner. Certainly if they could not thus be accounted for, the ability which certain batters possess of outguessing the pitcher or waiting him out, would well account for the balance. Alexander is human. Perhaps a few of his fifty passes were given through sheer wildness. But if so they were very, very, few.

Rudolph of Boston allowed 38 bases on balls in 41 contests. How many of them were due to wildness? No one knows, but certainly not many.

Turning to the batting lists we find that Tris Speaker received 82 bases on balls. If you want to know the reason look at his batting average. Ty Cobb received 78. Every one knows why. Eddie Collins got 86. Collins has a double toe hold on the base on balls column. He is known to be a good batter and he is also a past master in the art of waiting them out. A grand exponent of the wait ‘em out policy is Hooper, as every pitcher knows. Bert Shotton of the Browns got 111 passes. Doesn’t he deserve any credit for this?

Some time ago John Evers, one of the brainiest stars who ever sat upon a players’ bench, had a heated discussion on the general subject of batting averages. “I pay no attention to batting averages,” said Evers, “and no other sensible person pays much attention to them. They tell little of a player’s ability. Take my own case, for instance. I will talk freely about that for I know what I am talking about. If I were talking about someone else perhaps I would be guessing. In my own case I will say that I am convinced that I could usually have hit thirty points higher than I did hit, if I had made a specialty of hitting. Some lumbering bone head who does make a specialty of hitting and nothing else may forge well across the .300 line and everybody says ‘what a great batter!’ The facts of the case are the bone head may have been playing rotten baseball when he got that average and someone else who didn’t look to be in his class, might be the better hitter of the two.

“Jimmy Sheckard didn’t use to hit so very high, according to averages. But if you remember he used to get to first an awful lot of the time. He did this because he made a habit of waiting them out. He didn’t try to hit except when he was in a hole and was forced to do so. His whole system of play was based on another policy. He believed that a good share of the time he would be doing his club a better service by trying to wear down the opposing pitcher and get him in the hole all the time than he would be doing by hitting the ball. Of course, there are plenty of times when there is nothing like the solid single. But there are plenty of other times when the player at the plate should focus his attention on trying to fool the pitcher and shouldn’t even try to hit unless he is in the hole. In my own case I have frequently faced the pitcher when I had no desire whatever to hit. I wanted to get a base on balls. That was what I was working for. If I didn’t get it my average suffered and if I did get it my average wasn’t benefited in the least. That is why I say the averages mean nothing. They don’t give a player credit for playing brainy ball. They put a premium on pure slugging.”

Evers’ indictment is a just one. The batting averages give scant justice to some of the brainiest players who ever lived. Evers himself was not a three hundred hitter in any true sense of the word. Occasionally, in his long career, he reached that mark. Once he soared away above it. But in the main he hit for many points less. Had he devoted his entire attention to hitting in so far as the manager would allow him to do so, there is not a question in the world that he would materially have bettered his mark. His own estimate of thirty points would seem a conservative one.

And yet, Evers, when he neglected to hit as well as he was able, was sacrificing his own personal record to the good of his club, was playing a far brainier brand of ball, in short, was batting in better form than he would have done had he constantly hit for .300. Is there not something wrong with a system which permits errors as grave as this?

Long associated with Evers was Edward Reulbach, a pitcher who the Trojan often claimed possessed a brain as shrewd and crafty as ever a pitcher owned. Reulbach, viewing the subject from another angle, the pitcher’s angle, said, “As a pitcher I would say that I would rather have a batter hit my offerings safely than to work me for a pass. I believe this would be the opinion of all other pitchers.” If a fast ball or curve is hit why it is only the fortunes of war. The pitcher grits his teeth and says, “I will bet they won’t hit the next one.” And he buckles down to work. But if after he has exhausted all his craft and skill, the pitcher is finally worked for a base on balls he experiences an entirely different feeling. He has been outguessed or he has temporarily lost control. If he has been “worked” his confidence to outguess the batter suffers a shock. If he has lost control he is strictly up against it. For control is everything to the pitcher. In any case he faces the next batter with much less confidence after he has passed a player than he would do had that man gained first base by a hit. The effect of a base on balls, in short, is more damaging to the pitcher’s nerve than is an ordinary hit.

Here are two players, one of the wisest pitchers who ever lived and one of the greatest all around stars who both agree as to the extreme desirability of the base on balls. Is there, then, no credit in the records for the man who is usually proficient at the act of getting passed?

We have seen how the base on balls was long ago recognized to the extent that a wholly exaggerated and fictitious value was given it. We have also seen that the value was later withdrawn and the pass thenceforth, utterly neglected. The cause of this neglect lay in the inherent difficulty in figuring the value of the pass. But is an attempt to discover this value foredoomed to failure?

If perfection is sought for the answer must be, “Yes, it is impossible to determine the exact value of the base on balls.” But if the aim be merely an approximate value, (such things as form the ground work of all statistics) the answer is emphatically, “No, it is not impossible to approximate the value of the base on balls.”

In the two preceding articles of this series the values of extra base hits were thoroughly discussed. Records of 1,000 hits made last summer indicated that a certain relationship exists between the hits which might be expressed by the following formula:

Suppose, for instance, that a home run should grade as 100%. Obviously, the greatest possible achievement the batter can attain, it should be so graded. On such a basis, then the triple would rank as 74.1%, the double as 50.6% and the single as 29.4%. In other words, the single would be worth 29.4% of a home run, or about three-tenths as valuable.

Now, while keeping tabs on 1,000 hits, records were also kept on bases on balls given during those games. They number 283.

Let us assume for the moment that these bases on balls were all earned precisely as hits are earned, either through the ingenuity of the batter or the pitcher’s respect for his batting powers. We might, then, readily apply precisely the same course which found for us the value of hits. The base on balls has three values. First, its value for the player who receives it. The pass enables him to begin his journey around the hassocks and advances him one-fourth of the distance to the required haven, home plate. The pass has also a secondary value in the influence it exerts towards advancing base runners already on the bases. And it has a third value through the instrumentality of the fielder’s choice. The player who receives the pass may be forced by the next man at second base but the batter who forced him may reach first safely on the play. He is, then, clearly indebted to the original occupant of that sack for his start in business.

It was these factors which determined the comparative values of singles, doubles, and the other extra base hits, in our previous articles. The basis of computation in every case, was the run. In other words, the comparative value of the hit depended upon its comparative influence in scoring runs. It was discovered through the process outlined above that a single was worth a little less than half a run, that a double was worth more than three-quarters of a run, a triple rather more than a run, and a homer at least a run and a half.

Pursuing the same course, we find that the three inherent values of the base on balls size up as follows:

Of the 283 persons who received free transportation in our statistics, 142 or just about half advanced to second base; 92 or rather less than one-third reached third base safely, while 64 or slightly less than one-quarter finally scored.

We find, on examining the statistics, that twenty players reach first on fielder’s choices, at the expense of a previous occupant who had received a free pass. Of these twenty dead heads only two finally scored, a rather low average but one which might have been expected, owing to the fact that at least one man, and very often two, were out when such a dead head got to first. The number of runs actually driven by the base on balls was relatively larger than we had supposed, though not in itself very great. Six runs were, according to our statistics, forced across the rubber by a base on balls.

Adding our three values together, we find that 283 passes netted the side which made them a total of 72 runs. Dividing 72 by 283 we find that the base on balls was, on the average, worth 25.4% of a run.
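
Lane's arithmetic here is easy to check. The sketch below simply re-adds the three run sources he lists and divides by the number of passes; the variable and function names are mine, not Lane's, and the counts are the ones quoted in the article.

```python
# Counts quoted from Lane's sample of games.
PASSES = 283
REACHED_SECOND = 142
REACHED_THIRD = 92
RUNS_BY_PASSED_BATTERS = 64    # passed batters who came around to score
RUNS_FORCED_IN = 6             # runs forced home by a base on balls
RUNS_VIA_FIELDERS_CHOICE = 2   # runners who replaced a passed batter on a force and later scored


def share(count, total=PASSES):
    """Fraction of all passes represented by `count`."""
    return count / total


def walk_run_value(passes, *run_sources):
    """Average runs produced per base on balls, summing every listed source."""
    return sum(run_sources) / passes


value = walk_run_value(
    PASSES,
    RUNS_BY_PASSED_BATTERS,
    RUNS_FORCED_IN,
    RUNS_VIA_FIELDERS_CHOICE,
)

print(f"reached second: {share(REACHED_SECOND):.1%}")
print(f"reached third:  {share(REACHED_THIRD):.1%}")
print(f"scored:         {share(RUNS_BY_PASSED_BATTERS):.1%}")
print(f"run value of a pass: {value:.1%} of a run")
```

The total and the 25.4% figure both check out. One small discrepancy: 92 out of 283 works out to roughly 32.5%, so the 32.1% printed in the summary that follows may be a slip in the original or in the transcription.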


Of the 283 bases on balls examined

142 or 50.2% reached second base,
92 or 32.1% reached third base,
64 or 22.6% scored,
6 runs were forced in by passes,
2 runs were scored by players who reached first on a fielder’s choice at the expense of a passed player;
283 bases on balls netted 72 runs.

Dividing 72 by 283 we find that the average value of a base on balls in terms of runs is 25.4% of a run.

Employing a similar method we discovered in previous articles in this series the comparative values of all hits from singles to home runs. Grouping these values in order, allowing to the home run as the most important of all, the standard value of 100%, we find the following general comparison:

Home run ........ 100.0%
Triple ..........  74.1%
Double ..........  50.6%
Single ..........  29.4%
Base on balls ...  16.4%

We had previously discovered that a single was worth 45.7% of a run, its greater value resulting from the vastly larger number of runs which were driven home by such a hit. Reverting to our former table and allowing to a home run the standard 100%, we find the following comparative values.[3]
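
The step from "25.4% of a run" to "16.4% of a home run" is not spelled out in the article. A rough reconstruction, assuming the home-run-scale percentages are simple ratios of run values (my assumption, not something Lane states), looks like this:

```python
# Figures quoted in the article.
SINGLE_RUN_VALUE = 0.457          # a single is worth 45.7% of a run
SINGLE_SHARE_OF_HOME_RUN = 0.294  # a single grades at 29.4% of a home run
WALK_RUN_VALUE = 72 / 283         # 72 runs from 283 bases on balls


def implied_home_run_value(single_runs, single_share):
    """Run value of a home run implied by the single's two ratings."""
    return single_runs / single_share


def share_of_home_run(run_value, home_run_value):
    """Express a run value on the scale where a home run is 100%."""
    return run_value / home_run_value


home_run_runs = implied_home_run_value(SINGLE_RUN_VALUE, SINGLE_SHARE_OF_HOME_RUN)
walk_share = share_of_home_run(WALK_RUN_VALUE, home_run_runs)
walk_vs_single = WALK_RUN_VALUE / SINGLE_RUN_VALUE

print(f"implied home run value: {home_run_runs:.2f} runs")
print(f"base on balls on the home run scale: {walk_share:.1%}")
print(f"base on balls relative to a single: {walk_vs_single:.2f}")
```

Under that assumption the implied home run is worth a little over a run and a half, which matches Lane's "at least a run and a half," and the walk lands at 16.4% of a home run, or a bit more than half a single.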

The one defect in the above method of determining the comparative value of the base on balls is the assumption that it is always earned. Such an assumption is admittedly erroneous. Three causes and three alone contribute to a pass. It is given voluntarily to the batter as a tribute to his known ability as a slugger. It is earned by the batter’s ability to outguess the pitcher or to wait him out. Or it is the result of plain wildness on the pitcher’s part. In two of those three cases the batter has earned the pass. In one he has merely been lucky in being favored by the pitcher’s wildness.

To just what extent each one of these three contributing causes account for the total number of bases on balls issued in the major leagues it is impossible to say. But it is safe to assume that by far the most important of the three is the ability of the batter to outguess the pitcher or wait him out. Neither voluntary passes nor wildness account for so many passes. Obviously, more than half the passes given in the course of the year are earned and should be credited to the batter.

Perhaps it would be the part of wisdom to err on the side of generosity. The pass has long been neglected altogether. If we can not determine its value exactly let us at least not set that value at a lower mark than it deserves. Would it not be just, in the long run, to suppose that passes which are issued through sheer wildness are likely to affect the average of all batters in the same general degree?

Look at the matter as you will, the present system of ignoring the base on balls puts a decided premium on sheer blind slugging and discourages brainy inside baseball of the Evers type. Such a player as Jimmy Sheckard was playing the highest type of ball at the expense of his own personal record. For the policy of wait them out is most destructive to a batting average.

The base on balls should logically rank with the hit, though obviously on a much lower plane. The old scorers were wrong to call it the equal of a hit just as the present scorers are wrong in calling a single the equal of a home run. But the base on balls is worth something, and is usually earned by all round[4] batting ability of a relatively high order. In light of present researches would it be far amiss to claim for the base on balls an offensive value half as great as we should allow a single?


1. It was actually 1887, but 1877 was in the original article. Lane later refers to the year correctly as 1887.

2. The original had “of” not “or.” I made the change.

3. The table was shown twice in the original article. Since it is directly above, I did not put it in a second time.

4. The original does read “round” and not “around.”

11-06-2006, 10:27 PM
I couldn't stand to read through the extensiveness of some of the replies, but I'd just like to point out that one of my favorite shows, Numb3rs, on CBS has an episode dealing with Pythagorean expectation, which uses sabermetrics to estimate a team's winning percentage over a series of games based on its runs scored and runs allowed. The formula used is

W = S^2 / (S^2 + A^2)
where W is winning percentage, S is runs scored, and A is runs allowed
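
For anyone who wants to play with it, here is a minimal Python sketch of that formula. The function name and the sample run totals are made up for illustration; the exponent is left as a parameter since 2 is only the classic choice.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimated winning percentage from runs scored and runs allowed."""
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals cannot be negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("need at least one run scored or allowed")
    scored_term = runs_scored ** exponent
    allowed_term = runs_allowed ** exponent
    return scored_term / (scored_term + allowed_term)


# Hypothetical team: 800 runs scored, 700 allowed over a 162-game season.
estimate = pythagorean_win_pct(800, 700)
print(f"expected winning percentage: {estimate:.3f}")
print(f"expected wins over 162 games: {estimate * 162:.0f}")
```

A team that scores exactly as many runs as it allows comes out at .500, as you would expect.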

The episode "Hardball" premieres this Friday at 10 on CBS.

11-07-2006, 09:56 AM
For anybody interested in the history of baseball stats, I highly recommend The Numbers Game, by Alan Schwarz.

11-07-2006, 10:05 AM
For anybody interested in the history of baseball stats, I highly recommend The Numbers Game, by Alan Schwarz.

I second this recommendation. As someone who is a bit of a numbers guy and only knew a bit about the history of stat analysis in baseball, I found this book very interesting. And not dry at all, for you number-phobes out there.

11-07-2006, 10:06 AM
I second this recommendation. As someone who is a bit of a numbers guy and only knew a bit about the history of stat analysis in baseball, I found this book very interesting. And not dry at all, for you number-phobes out there.

That's because Alan actually knows how to turn a phrase.

11-07-2006, 12:35 PM
Slightly off topic but thought you would like it.

Fun with Bill James numbers
posted: Monday, November 6, 2006

The first best thing that comes after the World Series is "The Bill James Handbook" from ACTA Sports. Not only is there nothing close to it in terms of briefcase usefulness, but there are so many things that are pure fun: number of pitches over 95 mph, a chapter defining "the manufactured run," ballpark factors, managerial tendencies, career projections and assessments. And since Alex Rodriguez has at least a 25 percent chance of breaking the home run, RBI and runs scored records, that might put the reins on your desire to trade him for Mike Maroth.
The first thing is to track career trend lines. Now, one can take the American League's best player in 2006, Grady Sizemore ...

2004 43 6 2 4 56 21 14 .739
2005 158 37 11 22 310 101 52 .832
2006 162 53 11 28 349 121 78 .909

Or value the consistency and reliability of a human metronome such as Greg Maddux, who in 11 seasons since turning 30, never has started fewer than 33 games, or more than 35.

Some names and numbers worth noting from the handbook of the man who changed the way we all judge baseball. Note Barry Bonds, Jermaine Dye, Aramis Ramirez, Hank Blalock, Billy Hall ...

PLAYER 2004 2005 2006
M. Anderson .648 .707 .867
R. Aurilia .667 .712 .867
Baldelli (2003, '04, '06) .742 .762 .872
Bartlett .247 .651 .760
Betemit .401 .794 .795
E. Brown .583 .804 .805
C. Burke .259 .677 .765
Cameron .798 .819 .837
Crawford .781 .800 .830
Crede .717 .757 .829
DeJesus .662 .805 .810
Dellucci .583 .804 .805
DeRosa .613 .764 .813
Dye .793 .845 1.027
Hall .650 .837 .898
Hawpe .722 .853 .898
Hinske .722 .753 .840
Helms .692 .814 .965
Jeter .823 .839 .900
N. Johnson .757 .887 .948
R. Johnson .700 .744 .860
C. Jones .847 .968 1.005
Kearns .740 .785 .830
Nady .717 .760 .790
Ortiz .983 1.001 1.049
Redmond .656 .742 .778
Reyes .644 .686 .841
F. Sanchez .316 .736 .851
Sizemore .739 .832 .909
Soriano .808 .821 .911
Weeks .536 .727 .767

PLAYER 2004 2005 2006
Abreu .972 .879 .886
Adams .887 .708 .601
Berroa .693 .680 .592
Bigbie .768 .742 .601
Blalock .855 .749 .726
Bonds 1.421 1.071 .999
Burnitz .815 .757 .711
Burroughs .713 .617 .558
Casey .915 .793 .727
Castilla .867 .722 .578
Chavez .898 .795 .786
Dunn .958 .927 .855
Edmonds 1.061 .918 .821
Erstad .746 .706 .605
A. Everett .702 .654 .642
L. Ford .827 .715 .599
L. Gonzalez .866 .825 .796
J. Guillen .849 .817 .694
Hairston .773 .704 .523
Helton 1.081 .979 .880
J. Lopez .873 .780 .683
Y. Molina .685 .653 .595
Mora .981 .822 .734
Nixon .887 .803 .767
A. Ramirez .951 .926 .913
Uribe .833 .713 .668
Varitek .872 .855 .725
Wilkerson .872 .756 .728

Innings Pitched
• Rich Harden, 189 2/3, 128, 46 2/3; Josh Beckett, 107 2/3, 142, 156 2/3, 178 2/3, 204 2/3; Gil Meche, 127 2/3, 143 1/3, 186 2/3; Kevin Millwood, 141, 192, 215; Ben Sheets, 237, 156 2/3, 106 (ERA 2.70, 3.33, 3.82).

• Matt Clement, 3.68, 4.57, 6.61; Randy Johnson, 2.60, 3.79, 5.00; Geoff Geary, 5.44, 3.72, 2.96; LaTroy Hawkins, 1.86, 2.63, 3.83, 4.48; Odalis Perez, 3.25, 4.56, 6.20; Oliver Perez, 2.98, 5.85, 6.66; Scott Kazmir, 5.67, 3.77, 3.24


Career Assessments
• 756 home runs: Bonds -- 97 percent chance of reaching the mark, Rodriguez 31 percent, Albert Pujols 22 percent, Andruw Jones 16 percent

• 2,298 RBI: Rodriguez 27 percent, Pujols 16 percent, A. Jones 12 percent, Manny Ramirez 12 percent

• 2,296 runs scored: Rodriguez 25 percent, Pujols 14 percent, Bonds 14 percent

• 793 doubles: Miguel Cabrera 15 percent

• 4,000 hits: Derek Jeter 6 percent, Cabrera 5 percent, Rodriguez 5 percent

• Projected career homers: Bonds 884, Adam Dunn 618, Ryan Howard 711, Vladimir Guerrero 633, Troy Glaus 539, A. Jones 677, Pujols 867, A. Rodriguez 772, Jim Thome 619, Jason Tyner 0.


Pitches 95 mph+
• NL: Brad Penny 817, Brad Lidge 574, Billy Wagner 541, Jonathan Broxton 442, Jorge Julio 437. Tells you something about NL starters.

• AL: Beckett 1,072, Justin Verlander 992, Felix Hernandez 950, Joel Zumaya 884, Daniel Cabrera 834