What's enough of a sample size for you?
Hitter starting their career?
Innings for a pitcher?
Head to head pitching/hitting match ups?
How good a team is?
What's enough of a sample size for you?
Hitter starting their career? 1000 PA's
Innings for a pitcher? 400
Head to head pitching/hitting match ups? 40
How good a team is? Pythag after 80 games
But that's just my 2 cents.
Follow me on twitter: http://twitter.com/ShaneHorning
It all depends.
Starting their career.... what did their minor league career and scouting report say? How are the peripherals?
Innings for a pitcher.... depends on what question I am trying to answer.
Head to head.... depends. If a hitter is 10-for-10 off of a pitcher, it is enough of a sample. If he is 3-for-10, it isn't.
A team.... that depends. Do the team stats look like they are right or is the team sporting a .340 BABIP? Is the team relying on pitching with a team BABIP against of .250?
Hitter starting their career? 1200 plate appearances
Innings for a pitcher? 300 innings
Head to head pitching/hitting match ups? 1200 plate appearances (pitcher/hitter matchup stats are totally worthless)
How good a team is? 162 games of pythag
The larger the sample size the more certainty you have. You can still develop an estimate of a player's value with a smaller sample size but you would have a very large margin of error which shrinks very slowly as the sample size grows.
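To put rough numbers on how slowly that margin of error shrinks, here is a minimal sketch. It assumes each at-bat is an independent trial and uses the normal approximation for a proportion; the .270 average and the function name `margin_of_error` are just illustrations, not anything from the posts above.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a rate p observed over n trials.

    Assumption: trials are independent and n is large enough for the
    normal approximation to be reasonable.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical .270 hitter at several sample sizes.
for n in (50, 200, 600, 1200, 2400):
    moe = margin_of_error(0.270, n)
    print(f"{n:5d} AB: .270 +/- {moe:.3f}")
```

Because the margin scales with 1/sqrt(n), you need four times the at-bats to cut it in half, which is why it shrinks so slowly.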
Last edited by AtomicDumpling; 08-26-2013 at 04:02 AM.
Even when the sample sizes are small, pitcher/hitter matchup stats are not totally worthless.
Below are Choo's batting lines vs Reds pitchers when he was an Indian.
Against Arroyo, 8 (3 doubles, 4 homers) for 14.
Code:
Pitcher            PA  AB   H  2B  3B  HR  RBI  BB  SO    BA   OBP    SLG    OPS
Bronson Arroyo     15  14   8   3   0   4    7   1   0  .571  .600  1.643  2.243
Homer Bailey       11   8   3   0   1   0    0   3   2  .375  .545   .625  1.170
Mat Latos           7   7   3   1   0   1    2   0   2  .429  .429  1.000  1.429
Mike Leake          6   5   3   1   0   2    2   1   1  .600  .667  2.000  2.667
Sam LeCure          5   2   0   0   0   0    0   3   0  .000  .600   .000   .600
Alfredo Simon       3   3   1   0   0   0    1   0   2  .333  .333   .333   .667
Jonathan Broxton    2   1   0   0   0   0    0   1   0  .000  .500   .000   .500
Aroldis Chapman     2   2   1   0   0   0    0   0   0  .500  .500   .500  1.000
J.J. Hoover         2   2   0   0   0   0    0   0   2  .000  .000   .000   .000
Logan Ondrusek      1   1   0   0   0   0    0   0   0  .000  .000   .000   .000
Total              54  45  19   5   1   7   12   9   9  .422  .519  1.044  1.563
Was Choo just lucky? No.
Last edited by junkhead; 08-26-2013 at 07:08 AM.
Mostly agree with this. I think there are definitely Hitter/Pitcher match-ups that should be avoided/exploited depending on your perspective, but I'm not sure the Stats are where to look for that information.
Less than these can still give clues, but I generally am a believer that first year stats are not to be trusted when projecting a guy going forward, and second year stats may actually mislead in the opposite direction. I think the third full season tells the tale. Part-time stats can be skewed by managing a guy's match-ups to where he succeeds (platoons, pitcher type, etc.). They can possibly tell something about how to match him up, but as we've repeatedly seen (Nunnally, Stynes, Heisey, Paul, Janish, etc.), they have little relationship to how a guy would perform on a daily basis.
"All I can tell them is pick a good one and sock it." --BABE RUTH
Having better players makes "the right time" or "the big hit" happen a lot more often. PLUS PLUS
RMR does a great yearly post of what makes a meaningful sample. Maybe he will repost it.
"Bring on Rod Stupid!"
No it doesn't mean he was "lucky", it doesn't mean anything at all really. It records the fact that he had those hits, but it says absolutely nothing about how Choo is likely to hit against Arroyo the next time they square off. It would be absurd to think that Choo would continue to hit .571 if he faced Arroyo another 100 times.
If you flipped a coin 10 times and got 8 heads and 2 Tails it doesn't mean that you are a good Heads-flipper. Your chance of getting Heads is still 50% regardless of the fact you "earned" an .800 batting average in your small sample.
If a batter has a "True Batting Average" of .300 it doesn't mean that he is going to hit exactly .300 in every sample. It means that he has a 30% chance of getting a hit each at-bat, but he will still have small samples where he gets a hit 40% or 60% or even 100% of the time, and in other samples he will get hits in only 20%, 10% or 0% of his at-bats. Even if that batter were to face the same pitcher in all 650 of his plate appearances during a season we wouldn't expect him to get exactly 3 hits in every 10 ABs.
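The coin-flip and true-average points can be checked with a quick binomial calculation. This is only a sketch: it assumes every flip or at-bat is independent with a fixed success chance, and the .300 figure is the hypothetical "True Batting Average" from the paragraph above, not a claim about any real hitter. The helper name `prob_at_least` is made up for the example.

```python
from math import comb

def prob_at_least(k, n, p):
    """Chance of k or more successes in n independent tries, each with chance p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Fair coin: 8 or more heads in 10 flips.
print(f"8+ heads in 10 flips: {prob_at_least(8, 10, 0.5):.4f}")

# Hypothetical true .300 hitter: how often does he hit .400+, .600+ or
# 1.000 over a 10 at-bat stretch?
for hits in (4, 6, 10):
    print(f"{hits}+ hits in 10 AB: {prob_at_least(hits, 10, 0.300):.6f}")
```

Any single extreme line is unlikely, but with hundreds of hitter/pitcher pairs around the league, a few of them are expected to show up by chance alone.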
Similarly, if you divide up a hitter's season into some arbitrary splits that have nothing to do with his talent or skills, you are likely to get different results in those splits. For example, if you divided up Choo's season stats for each day of the week, do you think his batting average would be exactly the same on Monday, Tuesday, Wednesday, Thursday, Friday, Saturday and Sunday? Probably not. If the results are different, would you conclude that the day of the week has an effect on his hitting skill, or would you conclude that it is random fluctuation (some people mistakenly call that luck) based on the sample size?
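The day-of-the-week thought experiment is easy to simulate. In this sketch the hitter has exactly the same chance of a hit in every at-bat, so any difference between days is noise by construction. The .285 average, 570 at-bats, and the function name `simulate_day_splits` are all made-up values for illustration, not Choo's actual numbers.

```python
import random

def simulate_day_splits(true_avg=0.285, at_bats=570, seed=1):
    """Simulate one season for a hitter whose hit chance never changes,
    then report his batting average for each day of the week."""
    rng = random.Random(seed)
    days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
    hits = {d: 0 for d in days}
    tries = {d: 0 for d in days}
    for i in range(at_bats):
        day = days[i % 7]          # spread at-bats evenly across the week
        tries[day] += 1
        if rng.random() < true_avg:
            hits[day] += 1
    return {d: hits[d] / tries[d] for d in days}

splits = simulate_day_splits()
for day, avg in splits.items():
    print(f"{day}: {avg:.3f}")
```

Try a few different seeds: the "best" and "worst" days move around from one simulated season to the next, which is what you would expect if the split has no real effect.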
There have been quite a few studies done on batter/pitcher matchup stats and the results have shown these stats to have no predictive value.
It depends completely on what you're trying to do with the data. If you're talking about estimations of talent (which I assume you are), it still depends on what skill you're looking at. Certain attributes "stabilize" very quickly.
One of the notable sabermetricians (and real world stats teacher) "Pizza Cutter" used a method called "split-half reliability" to find the sample size at which the majority of the variation in a player's performance can be explained by factors within the player himself.
http://www.fangraphs.com/library/pri...s/sample-size/
You can ballpark estimate IP as BF (batters faced) divided by 4 and BIP divided by 2.
Code:
Stabilization Points for Offense Statistics:
  60 PA:  Strikeout rate
 120 PA:  Walk rate
 240 PA:  HBP rate
 290 PA:  Single rate
1610 PA:  XBH rate
 170 PA:  HR rate
 910 AB:  AVG
 460 PA:  OBP
 320 AB:  SLG
 160 AB:  ISO
  80 BIP: GB rate
  80 BIP: FB rate
 600 BIP: LD rate
  50 FBs: HR per FB
 820 BIP: BABIP

Stabilization Points for Pitching Statistics:
  70 BF:  Strikeout rate
 170 BF:  Walk rate
 640 BF:  HBP rate
 670 BF:  Single rate
1450 BF:  XBH rate
1320 BF:  HR rate
 630 BF:  AVG
 540 BF:  OBP
 550 AB:  SLG
 630 AB:  ISO
  70 BIP: GB rate
  70 BIP: FB rate
 650 BIP: LD rate
 400 FB:  HR per FB
2000 BIP: BABIP
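For anyone curious what "split-half reliability" looks like in practice, here is a rough sketch on simulated data. It is not Pizza Cutter's actual code or data: the players, the spread of true strikeout rates (mean .200, standard deviation .040), and the function names are all assumptions made up for the example. The idea is to split each player's plate appearances into two halves, compute the stat in each half, and see how well one half correlates with the other across players.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def split_half_r(pa_per_half, n_players=500, seed=7):
    """Correlate the strikeout rate in one half of each simulated player's
    plate appearances with his rate in the other half."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_players):
        # Hypothetical true strikeout talent, clamped to a sane range.
        talent = min(max(rng.gauss(0.20, 0.04), 0.05), 0.40)
        half_a = sum(rng.random() < talent for _ in range(pa_per_half))
        half_b = sum(rng.random() < talent for _ in range(pa_per_half))
        first.append(half_a / pa_per_half)
        second.append(half_b / pa_per_half)
    return pearson(first, second)

for pa in (25, 50, 100, 200, 400):
    print(f"{pa:4d} PA per half: r = {split_half_r(pa):.2f}")
```

The correlation climbs as the halves get bigger, and the "stabilization point" is the sample size where it crosses whatever threshold you choose. With real data you would split each player's actual plate appearances (odd vs. even, for example) instead of simulating them.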
Last edited by RedsManRick; 08-26-2013 at 06:38 PM.
Games are won on run differential -- scoring more than your opponent. Runs are runs, scored or prevented they all count the same. Worry about scoring more and allowing fewer, not which positions contribute to which side of the equation or how "consistent" you are at your current level of performance.
threeve
Good post. However, to clarify, it's very possible that Choo was "lucky". It's possible he had batted balls that usually become outs but which, for whatever reason, due to circumstances beyond his control, happened to fall in that day. Maybe the fielder slipped. Maybe he was just a poor defender. Maybe the wind was playing tricks with the ball.
As you described, the problem with the small sample is simply that we don't have enough observations for either the random variation of performance OR luck to even out.
Generally speaking, a player's underlying ability will result in a stable level of performance over a large enough sample. Some of that is just a function of him having more chances to fail/succeed. But it's also a function of his "luck" evening out. Again, as you point out, we should just be careful not to confuse true luck (external, idiosyncratic, but real, influences on the outcome) with performance variance, which is within the player's control.
Personally....for a hitter, after his second season. This allows for him to get a good start, pitchers adjust....he adjusts back, sophomore slump.....then we'll see what he's made of. I wanna see Cespedes next year.
For pitchers.....a couple times thru the league and several bad-luck and bad-call situations.
Head to head....the best and worst need little. 4-for-5 with 2 doubles and a HR...he's seeing him well. 0-for-7 with 4 K...good enough for me. The 3-for-10's likely play out that way.
A full season tells me about the team