
Thread: Sample Size

  1. #1
    Member Ironman92's Avatar
    Join Date
    Apr 2012
    Posts
    4,560

    Sample Size

    What's enough of a sample size for you?

    Hitter starting their career?

    Innings for a pitcher?

    Head to head pitching/hitting match ups?

    How good a team is?

  3. #2
    Member
    Join Date
    Feb 2007
    Location
    Cincinnati/Colerain
    Posts
    1,288

    Re: Sample Size

    What's enough of a sample size for you?

    Hitter starting their career? 1,000 PAs

    Innings for a pitcher? 400

    Head to head pitching/hitting match ups? 40

    How good a team is? Pythag after 80 games
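    For anyone unfamiliar, "Pythag" is the Pythagorean expectation, which estimates a team's winning percentage from runs scored and runs allowed. A minimal sketch, assuming the classic exponent of 2 (other exponents are in use); the function name and the run totals in the example are made up for illustration:

```python
def pythag_win_pct(runs_scored: float, runs_allowed: float, exponent: float = 2.0) -> float:
    """Pythagorean expectation: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# Hypothetical team: 400 runs scored, 350 allowed through 80 games.
pct = pythag_win_pct(400, 350)
print(f"Expected winning pct: {pct:.3f}")
print(f"Expected wins over 80 games: {80 * pct:.1f}")
```

    Comparing that expected record with the actual record is one way to see whether a team has been over- or under-performing its run differential.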


    But that's just my 2 cents.
    Follow me on twitter: http://twitter.com/ShaneHorning

  4. #3
    The Boss dougdirt's Avatar
    Join Date
    Jan 2006
    Posts
    35,773

    Re: Sample Size

    Quote Originally Posted by Ironman92 View Post
    What's enough of a sample size for you?

    Hitter starting their career?

    Innings for a pitcher?

    Head to head pitching/hitting match ups?

    How good a team is?
    It all depends.

    Starting their career.... what did their minor league career and scouting report say? How are the peripherals?

    Innings for a pitcher.... depends on what question I am trying to answer.

    Head to head.... depends. If a guy is 10-for-10 off of a pitcher, that is enough of a sample. If he is 3-for-10, it isn't.

    A team.... that depends. Do the team stats look like they are right or is the team sporting a .340 BABIP? Is the team relying on pitching with a team BABIP against of .250?

  5. #4
    KungFu Fighter AtomicDumpling's Avatar
    Join Date
    Mar 2007
    Location
    Hamilton, OH
    Posts
    2,774

    Re: Sample Size

    Quote Originally Posted by Ironman92 View Post
    What's enough of a sample size for you?

    Hitter starting their career?

    Innings for a pitcher?

    Head to head pitching/hitting match ups?

    How good a team is?
    Hitter starting their career? 1200 plate appearances

    Innings for a pitcher? 300 innings

    Head to head pitching/hitting match ups? 1200 plate appearances (pitcher/hitter matchup stats are totally worthless)

    How good a team is? 162 games of pythag

    The larger the sample size, the more certainty you have. You can still develop an estimate of a player's value with a smaller sample size, but you would have a very large margin of error, which shrinks very slowly as the sample size grows.
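    To put rough numbers on how slowly the margin shrinks: if each at-bat is treated as an independent trial (a simplifying assumption), the standard error of a batting average is sqrt(p(1-p)/n), so quadrupling the sample only halves the margin of error. A sketch assuming a .270 hitter and a 95% normal-approximation interval; the function name is just for illustration:

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a rate p observed over n trials."""
    return z * math.sqrt(p * (1 - p) / n)

# Assumed true batting average; the sample sizes are arbitrary checkpoints.
p = 0.270
for n in (50, 200, 600, 1200, 4800):
    print(f"{n:>5} AB: .270 +/- {margin_of_error(p, n):.3f}")
```

    Under those assumptions, even 1,200 at-bats leaves a band of roughly 25 points of batting average on either side.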
    Last edited by AtomicDumpling; 08-26-2013 at 05:02 AM.

  6. #5
    Member
    Join Date
    Jun 2013
    Posts
    814

    Re: Sample Size

    Quote Originally Posted by AtomicDumpling View Post
    Hitter starting their career? 1200 plate appearances

    Innings for a pitcher? 300 innings

    Head to head pitching/hitting match ups? 1200 plate appearances (pitcher/hitter matchup stats are totally worthless)

    How good a team is? 162 games of pythag

    The larger the sample size the more certainty you have. You can still develop an estimate of a player's value with a smaller sample size but you would have a very large margin of error which shrinks very slowly as the sample size grows.
    Even when the sample sizes are small, pitcher/hitter matchup stats are not totally worthless.
    Below are Choo's batting lines vs Reds pitchers when he was an Indian.

    Code:
                                                                        
                       PA AB  H 2B 3B HR RBI BB SO   BA  OBP   SLG   OPS
    Bronson Arroyo     15 14  8  3  0  4   7  1  0 .571 .600 1.643 2.243
    Homer Bailey       11  8  3  0  1  0   0  3  2 .375 .545  .625 1.170
    Mat Latos           7  7  3  1  0  1   2  0  2 .429 .429 1.000 1.429
    Mike Leake          6  5  3  1  0  2   2  1  1 .600 .667 2.000 2.667
    Sam LeCure          5  2  0  0  0  0   0  3  0 .000 .600  .000  .600
    Alfredo Simon       3  3  1  0  0  0   1  0  2 .333 .333  .333  .667
    Jonathan Broxton    2  1  0  0  0  0   0  1  0 .000 .500  .000  .500
    Aroldis Chapman     2  2  1  0  0  0   0  0  0 .500 .500  .500 1.000
    J.J. Hoover         2  2  0  0  0  0   0  0  2 .000 .000  .000  .000
    Logan Ondrusek      1  1  0  0  0  0   0  0  0 .000 .000  .000  .000
    Total              54 45 19  5  1  7  12  9  9 .422 .519 1.044 1.563
    Against Arroyo, 8 for 14 (3 doubles, 4 homers).
    Was Choo just lucky? No.
    Last edited by junkhead; 08-26-2013 at 08:08 AM.

  7. #6
    The Big Dog mth123's Avatar
    Join Date
    Jul 2006
    Posts
    14,945

    Re: Sample Size

    Quote Originally Posted by AtomicDumpling View Post
    Hitter starting their career? 1200 plate appearances

    Innings for a pitcher? 300 innings

    Head to head pitching/hitting match ups? 1200 plate appearances (pitcher/hitter matchup stats are totally worthless)

    How good a team is? 162 games of pythag

    The larger the sample size the more certainty you have. You can still develop an estimate of a player's value with a smaller sample size but you would have a very large margin of error which shrinks very slowly as the sample size grows.
    Mostly agree with this. I think there are definitely hitter/pitcher match-ups that should be avoided/exploited depending on your perspective, but I'm not sure the stats are where to look for that information.

    Less than these can still give clues, but I am generally a believer that first-year stats are not to be trusted when projecting a guy going forward, and second-year stats may actually mislead in the opposite direction. I think the third full season tells the tale. Part-time stats can be skewed by managing a guy's match-ups to where he succeeds (platoons, pitcher type, etc.). They can possibly tell something about how to match him up, but as we've repeatedly seen (Nunnally, Stynes, Heisey, Paul, Janish, etc.), they have little relationship to how a guy would perform on a daily basis.
    "All I can tell them is pick a good one and sock it." --BABE RUTH

    Having better players makes "the right time" or "the big hit" happen a lot more often. PLUS PLUS

  8. #7
    High five! nate's Avatar
    Join Date
    Sep 2005
    Location
    Irvine, CA
    Posts
    6,976

    Re: Sample Size

    Quote Originally Posted by junkhead View Post
    Even when the sample sizes are small, pitcher/hitter matchup stats are not totally worthless.
    Below are Choo's batting lines vs Reds pitchers when he was an Indian.

    Code:
                                                                        
                       PA AB  H 2B 3B HR RBI BB SO   BA  OBP   SLG   OPS
    Bronson Arroyo     15 14  8  3  0  4   7  1  0 .571 .600 1.643 2.243
    Homer Bailey       11  8  3  0  1  0   0  3  2 .375 .545  .625 1.170
    Mat Latos           7  7  3  1  0  1   2  0  2 .429 .429 1.000 1.429
    Mike Leake          6  5  3  1  0  2   2  1  1 .600 .667 2.000 2.667
    Sam LeCure          5  2  0  0  0  0   0  3  0 .000 .600  .000  .600
    Alfredo Simon       3  3  1  0  0  0   1  0  2 .333 .333  .333  .667
    Jonathan Broxton    2  1  0  0  0  0   0  1  0 .000 .500  .000  .500
    Aroldis Chapman     2  2  1  0  0  0   0  0  0 .500 .500  .500 1.000
    J.J. Hoover         2  2  0  0  0  0   0  0  2 .000 .000  .000  .000
    Logan Ondrusek      1  1  0  0  0  0   0  0  0 .000 .000  .000  .000
    Total              54 45 19  5  1  7  12  9  9 .422 .519 1.044 1.563
    Against Arroyo, 8(3 doubles 4 homers) for 14.
    Was Choo just lucky? No.
    If you're trying to determine what would happen going forward, these are worthless. I wouldn't even consider the total to be meaningful.

    If you're trying to say "yep, that's what happened," these are priceless.
    "Bring on Rod Stupid!"

  9. #8
    High five! nate's Avatar
    Join Date
    Sep 2005
    Location
    Irvine, CA
    Posts
    6,976

    Re: Sample Size

    RMR does a great yearly post of what makes a meaningful sample. Maybe he will repost it.
    "Bring on Rod Stupid!"

  10. Likes:

    RedEye (08-26-2013)

  11. #9
    KungFu Fighter AtomicDumpling's Avatar
    Join Date
    Mar 2007
    Location
    Hamilton, OH
    Posts
    2,774

    Re: Sample Size

    Quote Originally Posted by junkhead View Post
    Even when the sample sizes are small, pitcher/hitter matchup stats are not totally worthless.
    Below are Choo's batting lines vs Reds pitchers when he was an Indian.

    Code:
                                                                        
                       PA AB  H 2B 3B HR RBI BB SO   BA  OBP   SLG   OPS
    Bronson Arroyo     15 14  8  3  0  4   7  1  0 .571 .600 1.643 2.243
    Homer Bailey       11  8  3  0  1  0   0  3  2 .375 .545  .625 1.170
    Mat Latos           7  7  3  1  0  1   2  0  2 .429 .429 1.000 1.429
    Mike Leake          6  5  3  1  0  2   2  1  1 .600 .667 2.000 2.667
    Sam LeCure          5  2  0  0  0  0   0  3  0 .000 .600  .000  .600
    Alfredo Simon       3  3  1  0  0  0   1  0  2 .333 .333  .333  .667
    Jonathan Broxton    2  1  0  0  0  0   0  1  0 .000 .500  .000  .500
    Aroldis Chapman     2  2  1  0  0  0   0  0  0 .500 .500  .500 1.000
    J.J. Hoover         2  2  0  0  0  0   0  0  2 .000 .000  .000  .000
    Logan Ondrusek      1  1  0  0  0  0   0  0  0 .000 .000  .000  .000
    Total              54 45 19  5  1  7  12  9  9 .422 .519 1.044 1.563
    Against Arroyo, 8(3 doubles 4 homers) for 14.
    Was Choo just lucky? No.
    No, it doesn't mean he was "lucky"; it doesn't mean anything at all, really. It records the fact that he had those hits, but it says absolutely nothing about how Choo is likely to hit against Arroyo the next time they square off. It would be absurd to think that Choo would continue to hit .571 if he faced Arroyo another 100 times.

    If you flipped a coin 10 times and got 8 heads and 2 tails, it doesn't mean that you are a good heads-flipper. Your chance of getting heads is still 50%, regardless of the fact that you "earned" an .800 batting average in your small sample.
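    The coin example can be checked exactly with the binomial distribution. A quick sketch (the function name is mine): a fair coin comes up heads 8 or more times in 10 flips about 5.5% of the time, so roughly one flipper in 18 looks this good by chance alone.

```python
from math import comb

def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Chance a fair coin gives 8 or more heads in 10 flips.
print(f"{prob_at_least(8, 10, 0.5):.4f}")  # 56/1024, about 0.0547
```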

    If a batter has a "True Batting Average" of .300 it doesn't mean that he is going to hit exactly .300 in every sample. It means that he has a 30% chance of getting a hit each at-bat, but he will still have small samples where he gets a hit 40% or 60% or even 100% of the time, and in other samples he will get hits in only 20%, 10% or 0% of his at-bats. Even if that batter were to face the same pitcher in all 650 of his plate appearances during a season we wouldn't expect him to get exactly 3 hits in every 10 ABs.
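    The same idea for the .300 hitter, as a sketch: assuming every at-bat is an independent 30% shot at a hit (a simplification), this lists how often each 10-at-bat result should turn up. The function name is only for illustration.

```python
from math import comb

def hit_distribution(n_ab: int, true_avg: float) -> list[float]:
    """Probability of exactly k hits in n_ab at-bats, for k = 0..n_ab."""
    return [comb(n_ab, k) * true_avg**k * (1 - true_avg)**(n_ab - k)
            for k in range(n_ab + 1)]

dist = hit_distribution(10, 0.300)
for hits, prob in enumerate(dist):
    # Show the batting average each outcome would produce and its probability.
    print(f"{hits:>2}-for-10 ({hits / 10:.3f}): {prob:6.1%}")
```

    Under that model the "correct" 3-for-10 happens only about 27% of the time; most 10-at-bat stretches land somewhere else.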

    Similarly, if you divide up a hitter's season into some arbitrary splits that have nothing to do with his talent or skills, you are likely to get different results in those splits. For example, if you divided up Choo's season stats by day of the week, do you think his batting average would be exactly the same on Monday, Tuesday, Wednesday, Thursday, Friday, Saturday and Sunday? Probably not. If the results are different, would you conclude that the day of the week has an effect on his hitting skill, or would you conclude that it is random fluctuation (some people mistakenly call that luck) based on the sample size?
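    The day-of-the-week thought experiment is easy to simulate. A toy sketch, not real data: it assumes a hitter whose true average is .285 on every day, 570 at-bats in the season, and at-bats scattered across weekdays at random. All of those numbers, the seed, and the function name are my own assumptions.

```python
import random

def simulate_weekday_splits(true_avg: float, at_bats: int, seed: int) -> dict[str, tuple[int, int]]:
    """Assign each at-bat to a random weekday; return {day: (hits, at_bats)}."""
    rng = random.Random(seed)
    days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
    splits = {day: [0, 0] for day in days}
    for _ in range(at_bats):
        day = rng.choice(days)
        splits[day][1] += 1
        if rng.random() < true_avg:   # same hit probability on every day
            splits[day][0] += 1
    return {day: (h, ab) for day, (h, ab) in splits.items()}

# Assumed inputs: a .285 true-talent hitter with 570 at-bats in a season.
splits = simulate_weekday_splits(true_avg=0.285, at_bats=570, seed=2013)
for day, (hits, ab) in splits.items():
    avg = hits / ab if ab else 0.0
    print(f"{day}: {hits:>3}-for-{ab:<3}  {avg:.3f}")
```

    The weekday averages come out different from one another even though the simulated hitter is identical every day, which is the whole point: a split showing a gap is not evidence that the split matters.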

    There have been quite a few studies done on batter/pitcher matchup stats and the results have shown these stats to have no predictive value.

  12. #10
    High five! nate's Avatar
    Join Date
    Sep 2005
    Location
    Irvine, CA
    Posts
    6,976

    Re: Sample Size

    Quote Originally Posted by AtomicDumpling View Post
    If a batter has a "True Batting Average" of .300 it doesn't mean that he is going to hit exactly .300 in every sample. It means that he has a 30% chance of getting a hit each at-bat, but he will still have small samples where he gets a hit 40% or 60% or even 100% of the time, and in other samples he will get hits in only 20%, 10% or 0% of his at-bats.
    This is really the crux.

    Most splits outside of handedness have no predictive value.

    Historic? Yep, that's what happened.
    "Bring on Rod Stupid!"

  13. #11
    Stat Wanker Hodiernus RedsManRick's Avatar
    Join Date
    Dec 2004
    Location
    Guelph, ON
    Posts
    16,145

    Re: Sample Size

    It depends completely on what you're trying to do with the data. If you're talking about estimations of talent (which I assume you are), it still depends on what skill you're looking at. Certain attributes "stabilize" very quickly.

    One of the notable sabermetricians (and real world stats teacher) "Pizza Cutter" used a method called "split-half reliability" to find the sample size at which the majority of the variation in a player's performance can be explained by factors within the player himself.

    http://www.fangraphs.com/library/pri...s/sample-size/
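    For anyone curious what split-half reliability looks like mechanically, here is a toy simulation of the idea. To be clear, this is not Pizza Cutter's code or data: the player pool, the talent spread (true strikeout rates centered on 20%), and all function names are assumptions of mine. Each simulated player's plate appearances are split into odd and even halves, and the two halves are correlated across players. The sample size at which that correlation crosses a chosen threshold is the "stabilization point".

```python
import random

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def split_half_r(pa_per_player: int, n_players: int = 300, seed: int = 1) -> float:
    """Correlate odd-PA vs. even-PA strikeout rates across simulated players."""
    rng = random.Random(seed)
    odd_rates, even_rates = [], []
    for _ in range(n_players):
        # Assumed talent spread: true K rate centered on 20%, clamped to [5%, 40%].
        talent = min(0.40, max(0.05, rng.gauss(0.20, 0.05)))
        outcomes = [rng.random() < talent for _ in range(pa_per_player)]
        odd, even = outcomes[0::2], outcomes[1::2]
        odd_rates.append(sum(odd) / len(odd))
        even_rates.append(sum(even) / len(even))
    return pearson(odd_rates, even_rates)

for pa in (20, 60, 150, 600):
    print(f"{pa:>3} PA per player: split-half r = {split_half_r(pa):.2f}")
```

    With few plate appearances the two halves barely agree, because noise swamps talent; with many, they line up closely. Stats with a wide talent spread relative to their per-trial noise reach a given correlation sooner, which is why the numbers in the table differ so much from stat to stat.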

    Code:
    Stabilization Points for Offense Statistics:
    
    60   PA: Strikeout rate
    120  PA: Walk rate
    240  PA: HBP rate
    290  PA: Single rate
    1610 PA: XBH rate
    170  PA: HR rate
    910  AB: AVG
    460  PA: OBP
    320  AB: SLG
    160  AB: ISO
    80  BIP: GB rate
    80  BIP: FB rate
    600 BIP: LD rate
    50  FBs: HR per FB
    820 BIP: BABIP
    
    
    Stabilization Points for Pitching Statistics:
    
    70   BF: Strikeout rate
    170  BF: Walk rate
    640  BF: HBP rate
    670  BF: Single rate
    1450 BF: XBH rate
    1320 BF: HR rate
    630  BF: AVG
    540  BF: OBP
    550  AB: SLG
    630  AB: ISO
    70  BIP: GB rate
    70  BIP: FB rate
    650 BIP: LD rate
    400  FB: HR per FB
    2000BIP: BABIP
    You can ballpark estimate IP as BF (batters faced) divided by 4 and as BIP (balls in play) divided by 2.
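    Applying that rule of thumb to a few rows of the pitching table, as a quick sketch. The divisors are taken as-is from the ballpark figures above, not from any official conversion, and the function name is mine:

```python
def approx_innings(count: float, unit: str) -> float:
    """Convert a BF or BIP count to rough innings using the post's divisors."""
    divisors = {"BF": 4, "BIP": 2}   # rule-of-thumb values quoted above
    return count / divisors[unit]

# A few pitching stabilization points from the table, expressed in innings.
for stat, count, unit in [("Strikeout rate", 70, "BF"),
                          ("Walk rate", 170, "BF"),
                          ("HR rate", 1320, "BF"),
                          ("GB rate", 70, "BIP"),
                          ("BABIP", 2000, "BIP")]:
    print(f"{stat:<15} {count:>5} {unit:<3} ~ {approx_innings(count, unit):6.1f} IP")
```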
    Last edited by RedsManRick; 08-26-2013 at 07:38 PM.
    Games are won on run differential -- scoring more than your opponent. Runs are runs, scored or prevented they all count the same. Worry about scoring more and allowing fewer, not which positions contribute to which side of the equation or how "consistent" you are at your current level of performance.

  14. #12
    Member
    Join Date
    Dec 2008
    Location
    mason
    Posts
    748

    Re: Sample Size

    threeve

  15. Likes:

    dougdirt (08-27-2013)

  16. #13
    Stat Wanker Hodiernus RedsManRick's Avatar
    Join Date
    Dec 2004
    Location
    Guelph, ON
    Posts
    16,145

    Re: Sample Size

    Quote Originally Posted by AtomicDumpling View Post
    No it doesn't mean he was "lucky", it doesn't mean anything at all really. It records the fact that he had those hits, but it says absolutely nothing about how Choo is likely to hit against Arroyo the next time they square off. It would be absurd to think that Choo would continue to hit .571 if he faced Arroyo another 100 times.
    Good post. However, to clarify, it's very possible that Choo was "lucky". It's possible he had batted balls that usually become outs but which, due to circumstances beyond his control, happened to fall in that day. Maybe the fielder slipped. Maybe he was just a poor defender. Maybe the wind was playing tricks with the ball.

    As you described, the problem with the small sample is simply that we don't have enough observations for either the random variation of performance OR luck to even out.

    Generally speaking, a player's underlying ability will result in a stable level of performance over a large enough sample. Some of that is just a function of him having more chances to fail/succeed. But it's also a function of his "luck" evening out. Again, as you point out, we should just be careful not to confuse true luck (external, idiosyncratic, but real, influences on the outcome) with performance variance, which is within the player's control.
    Games are won on run differential -- scoring more than your opponent. Runs are runs, scored or prevented they all count the same. Worry about scoring more and allowing fewer, not which positions contribute to which side of the equation or how "consistent" you are at your current level of performance.

  17. #14
    Member Ironman92's Avatar
    Join Date
    Apr 2012
    Posts
    4,560

    Re: Sample Size

    Personally....for a hitter, after his second season. This allows for him to get a good start, pitchers adjust....he adjusts back, sophomore slump.....then we'll see what he's made of. I wanna see Cespedes next year.

    For pitchers.....a couple times thru the league and several situations of bad luck, calls situations.

    Head to head....the best and worst need little. 4/5 with 2 doubles and a HR...he's seeing him well. 0-7 with 4 K...good enough for me. The 3 for 10's likely play out that way

    A full season tells me about the team

  18. #15
    Member Old school 1983's Avatar
    Join Date
    Apr 2013
    Posts
    1,840

    Re: Sample Size

    Quote Originally Posted by davereds24 View Post
    threeve
    Texas with a dollar sign

