
Thread: Better descriptive stats?

  1. #1
    Stat Wanker Hodiernus RedsManRick's Avatar
    Join Date
    Dec 2004
    Location
    Chicago, IL
    Posts
    16,000

    Better descriptive stats?

    Ever since I was a kid, I've had an intuition that bugged me about baseball stats. What are we trying to measure? Ultimately, I've come to realize that we ended up crafting statistics without fully recognizing that they are of varying use for the two basic kinds of questions we ask:

    What happened?
    What's likely to happen next?

    We realize that these are highly correlated, but also that there are nuances, exceptions, etc. In short, we have to grapple with the difference between "descriptive" and "inferential" statistics. And this constantly frustrates us. We want the elegance and simplicity of descriptive statistics with the real-world value of the inferential.

    Case in point: "Batting average". It's a simple measure of fact, descriptive, right? It's just "hits" per "at bat". Two numbers. Of course, as we all know, it's hardly that simple.

    We realize that sometimes when a player bats the ball and reaches first base, it's not because he hit the ball well but because the fielder screwed up. So we created a special stat called a "hit", in which we subjectively decide whether the action of "batter hits ball and reaches base" is really earned or not. If the fielder "should have" made the play, we don't give the hitter credit for getting a "hit".

    We also realize that sometimes the batter doesn't get a fair chance to hit the ball. So we take all the times he walks up to hit and subtract out the times when he gets walked (be it earned via 4 balls, HBP or catcher's interference). It wouldn't be fair to count those plate appearances against him as if he failed to hit, right? And of course, sometimes the batter still did something good by advancing a runner even though he didn't reach base himself, so we should subtract out those sacrifices too, right?

    You get the idea. We took something that has the appearance of the record of a simple frequency of an event and we layered in a bunch of conditions to it so that it would be (supposedly) more meaningful -- it would tell us more about the player and about his contribution to the event, if maybe a little less about the event itself.
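To make the layering concrete, here is a minimal sketch of how the conditions stack up. The field names and the season counts are made up for illustration, and the at-bat rule is simplified to just the exclusions described above.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season counts for one hitter. The numbers used below are hypothetical."""
    plate_appearances: int
    hits: int
    walks: int                 # four-ball walks
    hit_by_pitch: int
    catcher_interference: int
    sacrifices: int            # sacrifice bunts plus sacrifice flies


def at_bats(line: BattingLine) -> int:
    """Plate appearances minus the trips we decided not to hold against the hitter."""
    return (line.plate_appearances - line.walks - line.hit_by_pitch
            - line.catcher_interference - line.sacrifices)


def batting_average(line: BattingLine) -> float:
    """Hits per at-bat: the conditioned stat."""
    return line.hits / at_bats(line)


def hits_per_pa(line: BattingLine) -> float:
    """Hits per trip to the plate: the plain frequency."""
    return line.hits / line.plate_appearances


example = BattingLine(plate_appearances=600, hits=150, walks=60,
                      hit_by_pitch=5, catcher_interference=0, sacrifices=5)
print(at_bats(example))                    # 530
print(round(batting_average(example), 3))  # 0.283
print(round(hits_per_pa(example), 2))      # 0.25
```

Same hitter, same hits, but the "average" depends entirely on which trips to the plate we agreed to leave out of the denominator.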

    If we just want to show what happened in the past, why are we making it so convoluted? And if we want to measure "how good" the player was, shouldn't we be accounting for a lot more than that? It always seemed like we had something that didn't actually do a good job of telling us much of anything -- other than creating an artificial sense of what it meant to be "good" at hitting.

    So, as I walked my dog this morning, I got to thinking. I know how to take the inferential stuff to a more useful place (e.g. wOBA), but would there be more value in just getting a clearer picture of "what happened"? Instead of the still-complicated slash line, can we make it simpler?

    So I pulled together these little tables. Imagine if this is what showed up on the TV screen instead of AVG/HR/RBI. First, note that I use percentages instead of counts. There's a reason we currently show AVG instead of hits, and the same logic should apply to any outcome shown in such a context. Second, note that I only went to two digits. What's the purpose of the third digit other than the appearance of meaning? Does knowing a guy walks 12.3% of the time give us more information than knowing he walks 11.9%? Or, if you prefer, do you know any more about a .275 hitter than a .281 one? I'd argue it just creates the appearance of knowledge.
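One rough way to put a number on the third-digit point is the binomial standard error of a rate. This is only a sketch: it assumes every plate appearance is an independent trial with the same underlying chance, which real hitters don't quite satisfy, and the 400 at-bat figure is an assumption chosen for illustration.

```python
import math


def rate_standard_error(rate: float, opportunities: int) -> float:
    """Binomial standard error of an observed rate over a number of chances."""
    return math.sqrt(rate * (1.0 - rate) / opportunities)


# 430 PA is the largest total in the tables in this post; 12% is the
# neighborhood of the 12.3% vs 11.9% walk-rate example.
walk_se = rate_standard_error(0.12, 430)

# Assumed 400 at-bats for a hitter somewhere around .278.
avg_se = rate_standard_error(0.278, 400)

print(f"walk rate: +/- {walk_se * 100:.1f} percentage points")
print(f"batting average: +/- {avg_se * 1000:.0f} points of average")
```

Under those assumptions the noise in a half-season walk rate is well over a full percentage point, and the noise in a batting average is on the order of twenty points, so the 0.4-point and 6-point gaps in the examples above sit comfortably inside it.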

    Code:
    Name			PA	Hit%	Walk%	Out%
    Shin-Soo Choo		430	23%	20%	57%
    Joey Votto		424	26%	20%	54%
    Jay Bruce		407	25%	 8%	67%
    Brandon Phillips	383	24%	 9%	67%
    Zack Cozart		375	21%	 4%	74%
    Todd Frazier		348	21%	13%	67%
    Devin Mesoraco		191	20%	12%	68%
    Xavier Paul		184	22%	13%	65%
    Ryan Hanigan		168	17%	16%	67%
    Derrick Robinson	147	23%	11%	66%
    Chris Heisey		113	19%	 5%	75%
    Jack Hannahan		101	20%	11%	69%
    Cesar Izturis		90	19%	10%	71%
    Donald Lutz		59	24%	 2%	75%
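For anyone who wants to build the same three columns for another team, a minimal sketch follows. The function name and the counts are hypothetical, and "walks" here is assumed to mean every trip that reaches base without a hit, with everything left over counted as an out.

```python
def outcome_shares(plate_appearances: int, hits: int, walks: int) -> dict[str, int]:
    """Whole-number Hit%, Walk% and Out% for one hitter."""
    outs = plate_appearances - hits - walks
    return {
        "Hit%": round(100 * hits / plate_appearances),
        "Walk%": round(100 * walks / plate_appearances),
        "Out%": round(100 * outs / plate_appearances),
    }


# Hypothetical counts, not taken from any player in the table.
line = outcome_shares(plate_appearances=400, hits=100, walks=40)
print(line)  # {'Hit%': 25, 'Walk%': 10, 'Out%': 65}
```

Because each column is rounded on its own, a row can occasionally add up to 99% or 101%.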
    But that might not be quite enough info, so what if we broke each of those down into two meaningful pieces (sorted in descending order of value):

    XBH: Extra-Base Hit
    Sng: Single
    eBB: Earned Walk
    uBB: Unearned Walk
    PO: Productive Out (Sacrifice)
    uPO: Unproductive Out
    Code:
    Name			PA	Hit%	Walk%	Out%		XBH%	Sng%	eBB%	uBB%	PO%	uPO%
    Shin-Soo Choo		430	23%	20%	57%		 8%	15%	14%	6%	1%	57%
    Joey Votto		424	26%	20%	54%		 8%	18%	16%	4%	1%	53%
    Jay Bruce		407	25%	 8%	67%		11%	14%	 7%	1%	1%	66%
    Brandon Phillips	383	24%	 9%	67%		 7%	17%	 7%	3%	2%	65%
    Zack Cozart		375	21%	 4%	74%		 8%	14%	 4%	1%	5%	70%
    Todd Frazier		348	21%	13%	67%		 8%	13%	10%	3%	1%	66%
    Devin Mesoraco		191	20%	12%	68%		 6%	15%	10%	2%	2%	65%
    Xavier Paul		184	22%	13%	65%		 8%	14%	11%	2%	0%	65%
    Ryan Hanigan		168	17%	16%	67%		 5%	12%	11%	5%	1%	66%
    Derrick Robinson	147	23%	11%	66%		 5%	18%	10%	1%	1%	65%
    Chris Heisey		113	19%	 5%	75%		11%	 9%	 4%	2%	4%	71%
    Jack Hannahan		101	20%	11%	69%		 5%	15%	 9%	2%	1%	68%
    Cesar Izturis		 90	19%	10%	71%		 4%	14%	 8%	2%	1%	70%
    Donald Lutz		 59	24%	 2%	75%		 3%	20%	 2%	0%	0%	75%
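The six-column version can be sketched the same way. The counts below are hypothetical, and the grouping of the six outcomes under Hit, Walk and Out simply follows the legend above. The example is picked to show why a pair of sub-columns does not always add up to its parent column: each percentage is rounded independently.

```python
PARENTS = {"Hit": ("XBH", "Sng"), "Walk": ("eBB", "uBB"), "Out": ("PO", "uPO")}


def whole_pct(count: int, total: int) -> int:
    """Share of total as a whole-number percentage."""
    return round(100 * count / total)


def six_outcome_line(counts: dict[str, int]) -> dict[str, int]:
    """Parent and child percentages, each rounded on its own."""
    total = sum(counts.values())
    line = {}
    for parent, children in PARENTS.items():
        line[f"{parent}%"] = whole_pct(sum(counts[c] for c in children), total)
        for child in children:
            line[f"{child}%"] = whole_pct(counts[child], total)
    return line


# Hypothetical counts over 500 plate appearances.
counts = {"XBH": 40, "Sng": 75, "eBB": 70, "uBB": 29, "PO": 8, "uPO": 278}
line = six_outcome_line(counts)
print(line["Out%"], line["PO%"], line["uPO%"])  # 57 2 56
```

Here PO% and uPO% sum to 58 while Out% shows 57, the same kind of one-point mismatch that appears in a few rows of the table.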
    So I'm not proposing anything. I'm not doing any analysis. I'm just posing the question: would looking at the data this way add value -- particularly in a broadcast context, where high-quality analysis and interpretation of nuance are unreasonable to expect?
    Last edited by RedsManRick; 07-14-2013 at 02:13 PM.
    Games are won on run differential -- scoring more than your opponent. Runs are runs, scored or prevented they all count the same. Worry about scoring more and allowing fewer, not which positions contribute to which side of the equation or how "consistent" you are at your current level of performance.

  2. Likes:

    Always Red (07-14-2013), BillDoran (07-14-2013), Billy Hamilton's Legs (07-14-2013), BluegrassRedleg (07-15-2013), NebraskaRed (07-14-2013), reds1869 (07-14-2013), RichRed (07-15-2013), thatcoolguy_22 (07-15-2013), Tom Servo (07-14-2013), vaticanplum (07-14-2013)

  4. #2
    Member Tom Servo's Avatar
    Join Date
    May 2006
    Posts
    7,420

    Re: Better descriptive stats?

    Great stuff, RMR!
    "Since I've been with the Reds in 1989, we've never had a farm system this loaded," Bowden said. "If we were the New York Yankees and had unlimited dollars, we could have traded for Colon, (Jeff) Weaver, Rolen, (Cliff) Floyd, (Kenny) Rogers and Finley and gotten them all -- and still held onto our top five prospects. That's an amazing statement."

  5. #3
    The Boss dougdirt's Avatar
    Join Date
    Jan 2006
    Posts
    35,295

    Re: Better descriptive stats?

    I don't like the productive out, because it is rather limited. Why do we give someone credit for a sac fly as a productive out, but not a ground out to second base that also scores a run? If we can accept that a player "tried" to hit a deep fly ball, why can't we accept that they tried to hit a grounder to the second baseman who was playing normal depth? In both cases I refuse to believe that the player tried for that outcome; it was a happy accident.

  6. Likes:

    gilpdawg (07-14-2013), thatcoolguy_22 (07-15-2013)

  7. #4
    It's showtime! RedEye's Avatar
    Join Date
    Feb 2006
    Location
    Atlanta, GA
    Posts
    7,986

    Re: Better descriptive stats?

    Love this. Any way you can make that table more readable though? Right now it is just a jumble of numbers on my screen.
    "I'll kind of have a foot on the back of my own butt. That's just how I do things." -- Bryan Price, 10/22/2013

  8. #5
    Member
    Join Date
    Feb 2006
    Location
    North Dakota
    Posts
    1,033

    Re: Better descriptive stats?

    I like it a lot. Wonder how these percentages compare to Pit and StL?

  9. #6
    The Boss dougdirt's Avatar
    Join Date
    Jan 2006
    Posts
    35,295

    Re: Better descriptive stats?

    Quote Originally Posted by RedEye View Post
    Love this. Any way you can make that table more readable though? Right now it is just a jumble of numbers on my screen.
    Looks fine to me. Anyone else having problems?

  10. #7
    It's showtime! RedEye's Avatar
    Join Date
    Feb 2006
    Location
    Atlanta, GA
    Posts
    7,986

    Re: Better descriptive stats?

    Quote Originally Posted by dougdirt View Post
    Looks fine to me. Anyone else having problems?
    Maybe it's b/c I'm on my iPhone?

  11. #8
    The Boss dougdirt's Avatar
    Join Date
    Jan 2006
    Posts
    35,295

    Re: Better descriptive stats?

    Quote Originally Posted by RedEye View Post
    Maybe it's b/c I'm on my iPhone?
    Yes, having an iPhone is always the problem. You should fix that and get one of those cool Android devices.

  12. Likes:

    Dan (07-15-2013), thatcoolguy_22 (07-15-2013)

  13. #9
    Member Norm Chortleton's Avatar
    Join Date
    Apr 2012
    Posts
    1,403

    Re: Better descriptive stats?

    Quote Originally Posted by dougdirt View Post
    I don't like the productive out, because it is rather limited. Why do we give someone credit for a sac fly as a productive out, but not a ground out to second base that also scores a run? If we can accept that a player "tried" to hit a deep fly ball, why can't we accept that they tried to hit a grounder to the second baseman who was playing normal depth? In both cases I refuse to believe that the player tried for that outcome; it was a happy accident.
    I agree that we should keep track of productive outs. I don't think we should reward a player by not making a ground out count as an AB (only because it would screw up 100 years of how we have kept stats), but I do think there should be a category for it.

    BTW, players can't always hit a ground ball to second on command, but they should be able to hit the ball to the right side on command. If it's a hit, fine. If it's a ground ball that advances the runner, fine. If it's a line drive right at someone, oh well, it was still a good AB, just not a productive one.

  14. #10
    The Boss dougdirt's Avatar
    Join Date
    Jan 2006
    Posts
    35,295

    Re: Better descriptive stats?

    Quote Originally Posted by Norm Chortleton View Post
    I agree that we should keep track of productive outs. I don't think we should reward a player by not making a ground out count as an AB (only because it would screw up 100 years of how we have kept stats), but I do think there should be a category for it.

    BTW, players can't always hit a ground ball to second on command, but they should be able to hit the ball to the right side on command. If it's a hit, fine. If it's a ground ball that advances the runner, fine. If it's a line drive right at someone, oh well, it was still a good AB, just not a productive one.
    It is just silly to me that at the point in the game where guys probably could actually have some control over where they hit the ball, they decided a fly ball that scored a run on an out didn't count, but the groundball that did the same thing does.

  15. #11
    The Big Dog mth123's Avatar
    Join Date
    Jul 2006
    Posts
    14,771

    Re: Better descriptive stats?

    Quote Originally Posted by dougdirt View Post
    It is just silly to me that at the point in the game where guys probably could actually have some control over where they hit the ball, they decided a fly ball that scored a run on an out didn't count, but the groundball that did the same thing does.
    I get your point, but the distinction is probably that the fly ball usually gets the job done and the defense can't do anything to counter whether it is hit deep enough. A grounder to second is really at the defense's discretion. The IF could play in and prevent it from being successful.
    "All I can tell them is pick a good one and sock it." --BABE RUTH

    Having better players makes "the right time" or "the big hit" happen a lot more often. PLUS PLUS

  16. #12
    High five! nate's Avatar
    Join Date
    Sep 2005
    Location
    Irvine, CA
    Posts
    6,976

    Re: Better descriptive stats?

    Nice idea.

    I've always liked "hit type percentages" as an easier to grasp version of wOBA.
    "Bring on Rod Stupid!"

  17. #13
    The Boss dougdirt's Avatar
    Join Date
    Jan 2006
    Posts
    35,295

    Re: Better descriptive stats?

    Quote Originally Posted by mth123 View Post
    I get your point, but the distinction is probably that the fly ball usually gets the job done and the defense can't do anything to counter whether it is hit deep enough. A grounder to second is really at the defense's discretion. The IF could play in and prevent it from being successful.
    The pitcher could have just struck the batter out. The defense could have put in a guy with a better arm in the outfield. They could have signed a sumo wrestler to literally sit on home plate so you couldn't score.

    I know, extremes.... I just don't like it.

  18. #14
    Stat Wanker Hodiernus RedsManRick's Avatar
    Join Date
    Dec 2004
    Location
    Chicago, IL
    Posts
    16,000

    Re: Better descriptive stats?

    Here's a graphical version of the 6 stat version. It sure reinforces just how much of hitting is failure as well as how much Votto & Choo have outpaced everybody else and how poorly Cozart has hit.

    Suppose I could have sorted by wOBA, but you get the idea.

  19. #15
    You're Welcome
    Join Date
    May 2003
    Location
    The Mythological Land of Dayton, OH
    Posts
    405

    Re: Better descriptive stats?

    I like this. Anything that not only pushes more-useful information out there, but does it in an intuitive/easy-to-understand-for-everyone way is good.

    For purposes of TV broadcasts, I think it'd be super simple to present, too. I'm not concerned about the distinction between earned BB and unearned, but you could make a chyron/graphic that includes the rest of the 6-outcome stat line that looks nice and compact:

    Code:
    19/1b  JOEY VOTTO
    =============================
    2013 season:
    HIT%(xbh)    BB%    OUT%(sac)
    26% (8%)     20%    54% (1%)
    Would it be at all helpful to do the same thing for pitchers? Hits allowed pct. (xbh), walks allowed pct., and outs pct. (Ks) across the graphic, maybe? If nothing else, it's equally informative for both starters and relievers. [Unlike the metric crap-ton of other ideas I have for reforming pitching stats.]
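As a rough sketch of how a graphics package might fill in that layout, here is a small formatter. The function name and parameters are hypothetical, the column widths are just read off the Votto mock-up above, and a pitcher version would only need different column labels.

```python
def chyron(number: int, position: str, name: str, season: int,
           hit: int, xbh: int, walk: int, out: int, sac: int) -> str:
    """Lay out the compact outcome line in the format of the mock-up."""
    hit_cell = f"{hit}% ({xbh}%)"
    walk_cell = f"{walk}%"
    out_cell = f"{out}% ({sac}%)"
    lines = [
        f"{number}/{position}  {name.upper()}",
        "=" * 29,
        f"{season} season:",
        f"{'HIT%(xbh)':<13}{'BB%':<7}OUT%(sac)",
        f"{hit_cell:<13}{walk_cell:<7}{out_cell}",
    ]
    return "\n".join(lines)


# Votto's percentages as given in the mock-up.
votto = chyron(19, "1b", "Joey Votto", 2013, hit=26, xbh=8, walk=20, out=54, sac=1)
print(votto)
```

Padding each cell to a fixed width keeps the numbers lined up under their headers no matter which hitter is plugged in.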


    Rick

  20. Likes:

    *BaseClogger* (07-15-2013)




RedsZone.com is a privately owned website and is not affiliated with the Cincinnati Reds or Major League Baseball

