Results 1 to 14 of 14

Thread: A question about small sample sizes

  1. #1
    Haunted by walks
    Join Date
    Apr 2000
    Location
    Syracuse
    Posts
    6,640

    A question about small sample sizes

    I heard from an engineering acquaintance recently that when we talk about a small sample size in baseball, we usually mean a small universe. As in, looking at the first week of at-bats would be a small universe. Looking at one in five at-bats (or maybe the more common "throw out this game and that game and he looks pretty good") would be a small sample size.

    What say you guys?

  3. #2
    Viva la Rolen kaldaniels's Avatar
    Join Date
    Jul 2005
    Posts
    7,953

    Re: A question about small sample sizes

    It depends what stat you are talking about. Fangraphs had a nice article on it a while back.

  4. #3
    Member 757690's Avatar
    Join Date
    Mar 2007
    Location
    Dayton
    Posts
    10,276

    Re: A question about small sample sizes

    Well, I have always been told that size doesn't matter, but then again, they might just have been trying not to hurt my feelings.
    "Man, the pitch looks fast, even in slow motion." Thom Brennaman on Chapman's fastball.

  5. #4
    Stat Wanker Hodiernus RedsManRick's Avatar
    Join Date
    Dec 2004
    Location
    Chicago, IL
    Posts
    15,988

    Re: A question about small sample sizes

    Quote Originally Posted by BCubb2003 View Post
    I heard from an engineering acquaintance recently that when we talk about a small sample size in baseball, we usually mean a small universe. As in, looking at the first week of at-bats would be a small universe. Looking at one in five at-bats (or maybe the more common "throw out this game and that game and he looks pretty good") would be a small sample size.

    What say you guys?
    Usually, when we're analyzing a player's stats, we're trying to guess what he's likely to do moving forward. The way we do that is by estimating his "true skill".

    Of course, we can't directly measure a guy's physical abilities. The only way we could know for sure is if we had essentially an infinite number of plate appearances covering every possible situation. But that's not exactly feasible.

    So instead, we get a sample of plate appearances -- be it 5, 50, or 500. The size of sample you need to get a reliable estimate of the "true skill" depends on how much variability there is in the thing you're trying to estimate. There's no magic number; it's just that as your sample gets bigger, the results in your sample become a more accurate reflection of that true skill. And how quickly that happens depends on how much noise (variance) there is in that stat.

    (This assumes your sample is representative of the full population -- for example, if you get 500 PA against Roy Halladay, your performance in those PA is not going to be reflective of your overall skill.)
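
    A rough sketch of that idea, not anyone's actual method: simulate many samples of 5, 50, and 500 PA for a hitter whose true on-base rate is assumed to be .330 (a made-up number), treating each PA as an independent coin flip, and see how far the observed rate typically lands from the truth.

```python
import random
import statistics

def simulate_spread(true_rate, n_pa, trials=2000, seed=0):
    """Std. deviation of the observed rate across many simulated samples of n_pa PAs."""
    rng = random.Random(seed)
    observed = []
    for _ in range(trials):
        # Each PA is an independent success/failure with probability true_rate (an assumption).
        successes = sum(1 for _ in range(n_pa) if rng.random() < true_rate)
        observed.append(successes / n_pa)
    return statistics.pstdev(observed)

TRUE_OBP = 0.330  # hypothetical "true skill", chosen only for illustration

spreads = {n: simulate_spread(TRUE_OBP, n) for n in (5, 50, 500)}
for n, s in spreads.items():
    print(f"{n:>3} PA: observed OBP is typically off by about {s:.3f}")
```

    The spread shrinks roughly with the square root of the sample size, so going from 50 to 500 PA cuts the noise by about a factor of three, not ten.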

    So, yes, when we're talking about a small sample, we do mean sample -- a sample from the universe of all possible matchups, situations, etc. that a batter can face.
    Games are won on run differential -- scoring more than your opponent. Runs are runs, scored or prevented they all count the same. Worry about scoring more and allowing fewer, not which positions contribute to which side of the equation or how "consistent" you are at your current level of performance.

  6. #5
    High five! nate's Avatar
    Join Date
    Sep 2005
    Location
    Irvine, CA
    Posts
    6,976

    Re: A question about small sample sizes

    Rick, it might be a good time for your annual posting of the sample size list.
    "Bring on Rod Stupid!"

  7. #6
    Haunted by walks
    Join Date
    Apr 2000
    Location
    Syracuse
    Posts
    6,640

    Re: A question about small sample sizes

    So, the weeks of the season would be the universe and the first two weeks would be the sample size? Similarly, a full season's approximate number of at-bats and the at-bats so far?

  8. #7
    Five Tool Fool jojo's Avatar
    Join Date
    Nov 2006
    Posts
    18,852

    Re: A question about small sample sizes

    Quote Originally Posted by BCubb2003 View Post
    So, the weeks of the season would be the universe and the first two weeks would be the sample size? Similarly, a full season's approximate number of at-bats and the at-bats so far?
    I think he's trying to explain the concept of a population (i.e. the complete data set of all possible observations -- the "universe" in his terms) versus the sample meant to estimate the population parameters.

    I guess the definition of population depends upon the goal of the analysis.

    Sample size is just the size or magnitude of the sample, with a sample generally considered a better estimate of the population the larger it becomes (i.e. the closer its size gets to the population).

    Typically when the goal is to estimate a player's true skill, a season may actually be the sample (so sample size might be 600 PAs) and the population would be a cohort of all similar players.

    So when an announcer notes that batter A is 1 for 7 against pitcher B, it's a population in the sense that it's the complete data set, but it's worthless as an estimate of true talent because it's a hopelessly small sample that is completely beholden to randomness.
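
    To put a rough number on "hopelessly small", here is a minimal sketch that computes a 95% Wilson score interval for a 1-for-7 line. It assumes each at-bat is an independent trial with a fixed hit probability, which is a simplification; the function name is just for illustration.

```python
import math

def wilson_interval(hits, at_bats, z=1.96):
    """Wilson score interval for a hit rate; z=1.96 gives roughly 95% coverage."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    p_hat = hits / at_bats
    denom = 1 + z**2 / at_bats
    center = (p_hat + z**2 / (2 * at_bats)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / at_bats + z**2 / (4 * at_bats**2)
    ) / denom
    return center - half_width, center + half_width

low, high = wilson_interval(1, 7)
print(f"1 for 7: plausible true average between {low:.3f} and {high:.3f}")

# For contrast, a hypothetical full-season line of 150 hits in 600 at-bats.
season_low, season_high = wilson_interval(150, 600)
print(f"150 for 600: between {season_low:.3f} and {season_high:.3f}")
```

    The 1-for-7 interval runs from far below replacement level to far above any hitter who has ever lived, which is another way of saying the matchup line tells you almost nothing.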

    That said, we do know certain peripherals of a hitter or pitcher stabilize after specific numbers of events, so as a rule of thumb we can assume that past that point the observed rate is a good estimate of true skill. For instance, a player's contact rate might stabilize after 100 PAs (I don't honestly remember if that's true but I'm trying to illustrate the point), so after 100 PAs our sample could be thought to estimate the population with little error.

    I don't know if this helps you or just adds to any confusion....
    "This isn't stats vs scouts - this is stats and scouts working together, building an organization that blends the best of both worlds. This is the blueprint for how a baseball organization should be run. And, whether the baseball men of the 20th century like it or not, this is where baseball is going."---Dave Cameron, U.S.S. Mariner

  9. #8
    Haunted by walks
    Join Date
    Apr 2000
    Location
    Syracuse
    Posts
    6,640

    Re: A question about small sample sizes

    I think the engineer's point was that when you sample, you poll one out of 1,000 people, or you inspect one out of 100 parts, but do you look at one of every 10 at-bats? No. You look at all of the at-bats, but "all of the at-bats" happens to be small right now.

  10. #9
    Five Tool Fool jojo's Avatar
    Join Date
    Nov 2006
    Posts
    18,852

    Re: A question about small sample sizes

    Quote Originally Posted by BCubb2003 View Post
    I think the engineer's point was that when you sample, you poll one out of 1,000 people, or you inspect one out of 100 parts, but do you look at one of every 10 at-bats? No. You look at all of the at-bats, but "all of the at-bats" happens to be small right now.
    Well, the second law of thermodynamics is often interpreted as meaning the universe tends toward maximum disorder, and a sample of ten at bats is essentially hopelessly beholden to randomness, so ya, maybe a sample of ten at bats is a universe.

    My sense is that the engineering definition doesn't align well with a sabermetric one... Also your engineering buddy would probably punch me in the nose for conflating entropy and disorder....

  11. #10
    Joe Oliver love-child Blimpie's Avatar
    Join Date
    Jan 2005
    Location
    Lexington
    Posts
    4,896

    Re: A question about small sample sizes

    Let's just simplify this discussion.

    If you don't believe in the concept of sampling, then the next time you visit your Doctor--ask them to take all of your blood.

    "Booing on opening day is like telling grandma her house smells like old lady."--WOY

  12. #11
    Five Tool Fool jojo's Avatar
    Join Date
    Nov 2006
    Posts
    18,852

    Re: A question about small sample sizes

    Quote Originally Posted by Blimpie View Post
    Let's just simplify this discussion.

    If you don't believe in the concept of sampling, then the next time you visit your Doctor--ask them to take all of your blood.

    Sampling is perfectly appropriate so long as the sample taken is sufficient to estimate the target end point.

  13. #12
    Haunted by walks
    Join Date
    Apr 2000
    Location
    Syracuse
    Posts
    6,640

    Re: A question about small sample sizes

    My question was really only about the terminology: Are we looking at every at-bat, say, even if it's small, or are we taking one out of every 10 or so, which would be a true sample. It seems like when we talk about a "small sample size," we aren't really talking about a sampling in the technical sense.

  14. #13
    Five Tool Fool jojo's Avatar
    Join Date
    Nov 2006
    Posts
    18,852

    Re: A question about small sample sizes

    Quote Originally Posted by BCubb2003 View Post
    My question was really only about the terminology: Are we looking at every at-bat, say, even if it's small, or are we taking one out of every 10 or so, which would be a true sample. It seems like when we talk about a "small sample size," we aren't really talking about a sampling in the technical sense.
    Generally an estimate of true skill is being inferred in those types of conversations so say if it's 30 at bats, it's a sample....

    An example would be the "Gomes has changed his approach" argument when he started the season walking at an obscene rate. He hadn't become a Prince Charming.

    But this isn't like an assembly line where you're grabbing every 100th widget to make sure it's within acceptable tolerances.
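
    One way to see why the terminology still works, as a sketch under stated assumptions: if every at-bat is an independent draw from the same true skill (assumed here to be a .270 average over a 600 at-bat season, both made-up numbers), then the first 60 at-bats and every 10th at-bat are both just 60 draws, and the estimates they give are about equally noisy.

```python
import random
import statistics

def spread_of_estimates(pick_indices, true_rate=0.270, season_ab=600,
                        trials=2000, seed=1):
    """Std. deviation of the batting-average estimate from the chosen at-bats."""
    pick_indices = list(pick_indices)
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        # Simulate one season; each at-bat is an independent hit/no-hit (an assumption).
        season = [rng.random() < true_rate for _ in range(season_ab)]
        picked = [season[i] for i in pick_indices]
        estimates.append(sum(picked) / len(picked))
    return statistics.pstdev(estimates)

sd_first_60 = spread_of_estimates(range(60))           # "all the at-bats so far"
sd_every_10th = spread_of_estimates(range(0, 600, 10))  # assembly-line style sampling

print(f"first 60 at-bats:  spread {sd_first_60:.3f}")
print(f"every 10th at-bat: spread {sd_every_10th:.3f}")
```

    Real at-bats are not perfectly independent or identical (opponents, parks, injuries, and adjustments all vary), so the first two weeks can also be unrepresentative in ways this sketch ignores.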

  15. #14
    Member Reds/Flyers Fan's Avatar
    Join Date
    Mar 2003
    Location
    Cincinnati USA
    Posts
    3,380

    Re: A question about small sample sizes

    Small sample size or not, if any of you don't believe this club has serious deficiencies in its batting order, you're whistling while Rome burns.

    This isn't a small sample size at all (11 games) ... this has been an ongoing problem since the 2010 All-Star break. We have a "clean-up hitter" who has all of eight home runs in the last two years. Even Paul Janish scoffs at those numbers.

    But hey ... they'll turn it around, huh? Just one offensive, stat-padding explosion and all is OK, right?



RedsZone.com is a privately owned website and is not affiliated with the Cincinnati Reds or Major League Baseball

