
Thread: Better descriptive stats?

  1. #16
    Bullpen or whatever RedEye's Avatar
    Join Date
    Feb 2006
    Location
    Atlanta, GA
    Posts
    9,297

    Re: Better descriptive stats?

    Love that graph up there, but it bothers me a bit that sample size is not somehow referenced. Not sure how to solve that problem -- but it seems odd that Robinson and Paul look like "better" hitters than Bruce when, quite obviously, that's not the case given the number of PAs that Bruce has compared to those two guys.


  3. #17
    Bullpen or whatever RedEye's Avatar
    Join Date
    Feb 2006
    Location
    Atlanta, GA
    Posts
    9,297

    Re: Better descriptive stats?

    In other news, Mesoraco vs. Hanigan looks like a race to the bottom.

  4. #18
    Sprinkles are for winners dougdirt's Avatar
    Join Date
    Jan 2006
    Posts
    49,393

    Re: Better descriptive stats?

    Quote Originally Posted by RedEye View Post
    Love that graph up there, but it bothers me a bit that sample size is not somehow referenced. Not sure how to solve that problem -- but it seems odd that Robinson and Paul look like "better" hitters than Bruce when, quite obviously, that's not the case given the number of PAs that Bruce has compared to those two guys.
    I believe that the thickness of each bar is representative of their PAs.

  5. #19
    High five!
    Join Date
    Sep 2005
    Location
    Irvine, CA
    Posts
    6,976

    Re: Better descriptive stats?

    Quote Originally Posted by RedsManRick View Post
    Here's a graphical version of the 6 stat version. It sure reinforces just how much of hitting is failure as well as how much Votto & Choo have outpaced everybody else and how poorly Cozart has hit.

    Suppose I could have sorted by wOBA, but you get the idea.
    Great chart. Very interesting to see the ratio of XBH to singles for Bruce and Heisey.

  6. #20
    Bullpen or whatever RedEye's Avatar
    Join Date
    Feb 2006
    Location
    Atlanta, GA
    Posts
    9,297

    Re: Better descriptive stats?

    Quote Originally Posted by dougdirt View Post
    I believe that the thickness of each bar is representative of their PAs.
    Ah, okay. Duh. Don't mind me -- just trying to keep up. LOL.
    “Every level he goes to, he is going to compete. They will know who he is at every level he goes to.” -- ED on EDLC

  7. #21
    Member LeDoux's Avatar
    Join Date
    Apr 2010
    Location
    Knoxville
    Posts
    716

    Re: Better descriptive stats?

    I love the chart. It provokes a reappraisal of the players' offensive production. I am wondering if BABIP could also be worked in to show the presumed role of luck in those numbers. Or maybe shades of red to show fly outs, LD%, etc.? Just a thought.

  8. #22
    High five!
    Join Date
    Sep 2005
    Location
    Irvine, CA
    Posts
    6,976

    Re: Better descriptive stats?

    Instead of thickness representing PA, it might be interesting to use the overall bar length for that. Then you'd see what each player does in proportion.

  9. Likes:

    RedEye (07-15-2013)

  10. #23
    Member mdccclxix's Avatar
    Join Date
    Sep 2009
    Location
    Crown
    Posts
    4,139

    Re: Better descriptive stats?

    Quote Originally Posted by RedsManRick View Post
    Ever since I was kid, I had an intuition that bugged me about baseball stats. What are we trying to measure? Ultimately, I've come to realize that we ended up crafting statistics without full recognition that they were of varying use for the two basic kinds of questions we ask:

    What happened?
    What's likely to happen next?

    We realize that these are highly correlated, but also that there are nuances, exceptions, etc. In short, we have to grapple with the difference between "descriptive" and "inferential" statistics. And this constantly frustrates us. We want the elegance and simplicity of descriptive statistics with the real world value of the inferential.

    Case in point: "Batting average". It's a simple measure of fact, descriptive, right? It's just "hits" per "at bat". Two numbers. Of course, as we all know, it's hardly that simple.

    We realize that sometimes when a player bats the ball and reaches first base, it's not because he hit the ball well but because the fielder screwed up. So we created a special stat called a "hit", in which we subjectively decide whether the action of "batter hits ball and reaches base" is really earned or not. If the fielder "should have" made the play, we don't give the hitter credit for getting a "hit".

    We also realize that sometimes the batter doesn't get a fair chance to hit the ball. So we take all the times he walks up to hit and subtract out the times when he gets walked (be it earned via 4 balls, HBP or catcher's interference). It wouldn't be fair to count those plate appearances against him as if he failed to hit, right? And of course, sometimes the batter still did something good by advancing a runner even though he didn't reach base himself, so we should subtract out those sacrifices too, right?

    You get the idea. We took something that has the appearance of the record of a simple frequency of an event and we layered in a bunch of conditions to it so that it would be (supposedly) more meaningful -- it would tell us more about the player and about his contribution to the event, if maybe a little less about the event itself.

    If we just want to show what happened in the past, why are we making it so convoluted? And if we want to measure "how good" the player was, shouldn't we be accounting for a lot more than that? It always seemed like we had something that didn't actually do a good job at telling us much of anything -- other than create an artificial sense of what it meant to be "good" at hitting.

    So, as I walked my dog this morning, I got to thinking. I know how to take the inferential stuff to a more useful place (e.g. wOBA), but would there be more value in just getting a clearer picture of "what happened". Instead of the still complicated slash line, can we make it simpler?

    So I pulled together these little tables. Imagine if this is what showed up on the TV screen instead of AVG/HR/RBI. Firstly, note I use percentages instead of counts -- there's a reason we currently show AVG instead of hits -- the same logic should apply to any outcome shown in such a context. The second thing to note is that I only went to two digits. What's the purpose of the 3rd digit other than the appearance of meaning? Does knowing a guy gets a walk 12.3% of the time give us more information than knowing he walked 11.9%? Or, if you prefer, do you know any more about a .275 hitter than a .281 one? I'd argue it just creates the appearance of knowledge.

    Code:
    Name			PA	Hit%	Walk%	Out%
    Shin-Soo Choo		430	23%	20%	57%
    Joey Votto		424	26%	20%	54%
    Jay Bruce		407	25%	 8%	67%
    Brandon Phillips	383	24%	 9%	67%
    Zack Cozart		375	21%	 4%	74%
    Todd Frazier		348	21%	13%	67%
    Devin Mesoraco		191	20%	12%	68%
    Xavier Paul		184	22%	13%	65%
    Ryan Hanigan		168	17%	16%	67%
    Derrick Robinson	147	23%	11%	66%
    Chris Heisey		113	19%	 5%	75%
    Jack Hannahan		101	20%	11%	69%
    Cesar Izturis		90	19%	10%	71%
    Donald Lutz		59	24%	 2%	75%
    But that might not be quite enough info, so what if we broke it down in two meaningful pieces of each of those (sorted in descending order of value):

    XBH: Extra-Base Hit
    Sng: Single
    eBB: Earned Walk
    uBB: Unearned Walk
    PO: Productive Out (Sacrifice)
    uPO: Unproductive Out
    Code:
    Name			PA	Hit%	Walk%	Out%		XBH%	Sng%	eBB%	uBB%	PO%	uPO%
    Shin-Soo Choo		430	23%	20%	57%		 8%	15%	14%	6%	1%	57%
    Joey Votto		424	26%	20%	54%		 8%	18%	16%	4%	1%	53%
    Jay Bruce		407	25%	 8%	67%		11%	14%	 7%	1%	1%	66%
    Brandon Phillips	383	24%	 9%	67%		 7%	17%	 7%	3%	2%	65%
    Zack Cozart		375	21%	 4%	74%		 8%	14%	 4%	1%	5%	70%
    Todd Frazier		348	21%	13%	67%		 8%	13%	10%	3%	1%	66%
    Devin Mesoraco		191	20%	12%	68%		 6%	15%	10%	2%	2%	65%
    Xavier Paul		184	22%	13%	65%		 8%	14%	11%	2%	0%	65%
    Ryan Hanigan		168	17%	16%	67%		 5%	12%	11%	5%	1%	66%
    Derrick Robinson	147	23%	11%	66%		 5%	18%	10%	1%	1%	65%
    Chris Heisey		113	19%	 5%	75%		11%	 9%	 4%	2%	4%	71%
    Jack Hannahan		101	20%	11%	69%		 5%	15%	 9%	2%	1%	68%
    Cesar Izturis		 90	19%	10%	71%		 4%	14%	 8%	2%	1%	70%
    Donald Lutz		 59	24%	 2%	75%		 3%	20%	 2%	0%	0%	75%
    So I'm not proposing anything. I'm not doing any analysis. I'm just posing the question: would looking at the data this way add value -- particularly in a broadcast context in which high quality analysis and interpretation of nuance is unreasonable to expect.
    My initial reactions were that the simple version was refreshingly telling of a player's outcomes, but that the second version was a little too much to look at on TV between pitches.

    Now that I look at it one more time, I'm thinking that to the average viewer it may take a while to adjust to what the numbers truly mean. The difference between Joey Votto's 26% hit% and Izturis' 19% hit% just doesn't feel like it registers the full gap between those two players. Teaching people that accounting for their walk% is very important is really a whole different approach.

    Basically, what I think would be a challenge is supplanting a stat that is utterly and completely rooted in decades of fans' minds. I think you'd have to chop it down and poison it somehow, and it still could take a long time to monitor that it doesn't come back.

    But I do like it, a lot.
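
    For anyone who wants to try the simple version on their own numbers, here is a minimal sketch of how the Hit%/Walk%/Out% line could be computed, next to traditional batting average for comparison. The function names and the counts are made up for illustration (they are not from the tables above), and it assumes every plate appearance that is not a hit or a free pass counts as an out.

```python
def outcome_line(pa, hits, walks):
    """Return (hit%, walk%, out%) as whole-number percentages of plate appearances.

    `walks` means every free pass (BB, HBP, catcher's interference).
    Assumption: anything that is not a hit or a free pass is an out.
    """
    if pa <= 0:
        raise ValueError("pa must be positive")
    outs = pa - hits - walks
    return tuple(round(100 * n / pa) for n in (hits, walks, outs))


def batting_average(pa, hits, walks, sacrifices):
    """Traditional AVG: hits per at-bat, with AB = PA - free passes - sacrifices."""
    at_bats = pa - walks - sacrifices
    return hits / at_bats


# Hypothetical counts, not taken from the thread.
pa, hits, walks, sacrifices = 400, 100, 40, 4
print("{}/{}/{}".format(*outcome_line(pa, hits, walks)))
print(f"{batting_average(pa, hits, walks, sacrifices):.3f}")
```

    Each percentage is rounded on its own, so a line will not always add up to exactly 100 (the Cozart row above, 21/4/74, is an example).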

  11. #24
    My clutch is broken RichRed's Avatar
    Join Date
    Mar 2006
    Location
    Western NC, by way of VB, VA
    Posts
    4,410

    Re: Better descriptive stats?

    Quote Originally Posted by FlightRick View Post
    I like this. Anything that not only pushes more-useful information out there, but does it in an intuitive/easy-to-understand-for-everyone way is good.

    For purposes of TV broadcasts, I think it'd be super simple to present, too. I'm not concerned about the distinction between earned BB and unearned, but you could make a chyron/graphic that includes the rest of the 6-outcome stat line that looks nice and compact:

    Code:
    19/1b  JOEY VOTTO
    =============================
    2013 season:
    HIT%(xbh)    BB%    OUT%(sac)
    26% (8%)     20%    54% (1%)
    I like this idea (and RMR's graph is excellent). One thing I'd like to see in a broadcast would be a comparison to the rest of the league to add context. Maybe list Votto's outcomes, then under that would be league average and perhaps position average.

    Nice work.
    "I can make all the stadiums rock."
    -Air Supply

  12. Likes:

    thatcoolguy_22 (07-15-2013)

  13. #25
    Battle Toad Historian thatcoolguy_22's Avatar
    Join Date
    May 2006
    Location
    Myrtle Beach SC
    Posts
    2,004

    Re: Better descriptive stats?

    Quote Originally Posted by RichRed View Post
    I like this idea (and RMR's graph is excellent). One thing I'd like to see in a broadcast would be a comparison to the rest of the league to add context. Maybe list Votto's outcomes, then under that would be league average and perhaps position average.

    Nice work.
    Adding context is huge. It's one thing to say Votto makes an out 54% of the time, something else when you show the average 1B does it 65%. That 11% is huge over 650+ PAs.

    I want to see Votto vs Ike Davis before he was sent down
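
    To put a rough number on that kind of gap, here is a small sketch that turns a difference in Out% into outs over a season. The 54% figure is Votto's from the table; the 65% baseline is only the illustrative figure from the post above, not a measured league average, and the function name is made up.

```python
def extra_outs(player_out_pct, baseline_out_pct, pa):
    """Outs a baseline hitter would make beyond the player over `pa` plate appearances."""
    return (baseline_out_pct - player_out_pct) / 100 * pa


# 54% vs. an assumed 65% baseline, over 650 plate appearances.
print(extra_outs(54, 65, 650))
```

    Under those assumptions the gap works out to roughly 71.5 outs over 650 PA, which is the kind of context a bare 54% does not convey on its own.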

  14. #26
    Member RedsManRick's Avatar
    Join Date
    Dec 2004
    Location
    Guelph, ON
    Posts
    19,448

    Re: Better descriptive stats?

    Quote Originally Posted by mdccclxix View Post
    My initial reactions were that the simple version was refreshingly telling of a player's outcomes, but that the second version was a little too much to look at on TV between pitches.

    Now that I look at it one more time, I'm thinking that to the average viewer it may take a while to adjust to what the numbers truly mean. The difference between Joey Votto's 26% hit% and Izturis' 19% hit% just doesn't feel like it registers the full gap between those two players. Teaching people that accounting for their walk% is very important is really a whole different approach.

    Basically, what I think would be a challenge is supplanting a stat that is utterly and completely rooted in decades of fans' minds. I think you'd have to chop it down and poison it somehow, and it still could take a long time to monitor that it doesn't come back.

    But I do like it, a lot.
    Yeah, that was my take too. The differences in simple hit % are much less than what people likely intuit because they're so used to batting average -- and people aren't used to thinking of batting average as a ratio of hits:outs (I posit that the term "at bat" makes people think in terms of likelihood of getting a hit this plate appearance, even though that's obviously not what it means). People probably "feel" the difference between players in terms of hits, but the reality is more a function of walks and power.

    I agree it would take the casual fan a lot of getting used to -- but I think it's intuitive enough that if it was just something like what FlightRick put together: PA, Hit% (XBH%), BB%, Out% people could quite easily adjust.
    Last edited by RedsManRick; 07-15-2013 at 12:18 PM.
    Games are won on run differential -- scoring more than your opponent. Runs are runs, scored or prevented they all count the same. Worry about scoring more and allowing fewer, not which positions contribute to which side of the equation or how "consistent" you are at your current level of performance.

  15. #27
    Member mdccclxix's Avatar
    Join Date
    Sep 2009
    Location
    Crown
    Posts
    4,139

    Re: Better descriptive stats?

    Quote Originally Posted by RedsManRick View Post
    Yeah, that was my take too. The differences in simple hit % are much less than what people likely intuit. People probably "feel" the difference between players in terms of hits, but the reality is more a function of walks and power.

    I agree it would take the casual fan a lot of getting used to -- but I think it's intuitive enough that if it was just something like what FlightRick put together: PA, Hit% (XBH%), BB%, Out% people could quite easily adjust.
    I did like FlightRick's layout better as well. So, are you okay with leaving OB% behind then? I realize it can be pretty quickly calculated here.

    Not knowing what all the stats are that players look at (to each their own), I wonder what the prevalence of these stats would do to the game. Instead of saying, "Votto is batting .400 against me" a pitcher might say "Votto only has a 25% chance of hitting this".

  16. #28
    Member RedsManRick's Avatar
    Join Date
    Dec 2004
    Location
    Guelph, ON
    Posts
    19,448

    Re: Better descriptive stats?

    Quote Originally Posted by mdccclxix View Post
    I did like FlightRick's layout better as well. So, are you okay with leaving OB% behind then? I realize it can be pretty quickly calculated here.

    Not knowing what all the stats are that players look at (to each their own), I wonder what the prevalence of these stats would do to the game. Instead of saying, "Votto is batting .400 against me" a pitcher might say "Votto only has a 25% chance of hitting this".
    Sort of. I don't think you'll move people wholly to OBP until you show it as having different components that are meaningful -- otherwise it feels like a lateral move in which you're forcing them to think that walks are as important as hits. You simply can't get people to accept that the most important aspect of a plate appearance in terms of runs is whether or not it's an out when their emotional attention is so attuned to the ball being hit into the field of play and falling for a hit. But by showing the components of that "non-out", I think you can help them get there.

    What I would want is to first put the emphasis on outs vs. not-outs (from "how likely is it that this guy gets a hit?" to "how likely is it that this guy gets on base?"), followed by how productive he is in not making outs.

    I could actually see broadcasts using something like wOBA that was more intuitive, such as TB (inclusive of walks) per PA.

    Player: Hit% | Walk% | Out%: TB/PA (or wOBA)
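
    As a sketch of what "TB (inclusive of walks) per PA" might look like, here is one possible reading in which each walk simply counts as one base. That reading, the function name, and the counts are all assumptions for illustration; this is not wOBA, which weights each event by run value instead.

```python
def bases_per_pa(singles, doubles, triples, home_runs, walks, pa):
    """Total bases plus one base per walk, divided by plate appearances."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return (total_bases + walks) / pa


# Hypothetical counts, not taken from the thread.
print(f"{bases_per_pa(70, 20, 2, 15, 50, 400):.3f}")
```

    The appeal for a broadcast would be that it reads as "bases gained per trip to the plate", which needs no weights table to explain.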

    It's important to remember that the biggest hurdle to any of this is not the inherent logic of the suggestion. It's the ability to improve what's being done without people feeling like you're taking away what they already care about (because they grew up with it). You have to get them to a point of internalizing, emotionally, the value of this breakdown and for most people, the emotional connection with what they grew up with is a lot stronger than their emotional connection to objective truth (a reality that sadly carries much beyond baseball).
    Last edited by RedsManRick; 07-15-2013 at 01:59 PM.
    Games are won on run differential -- scoring more than your opponent. Runs are runs, scored or prevented they all count the same. Worry about scoring more and allowing fewer, not which positions contribute to which side of the equation or how "consistent" you are at your current level of performance.

  17. #29
    Charlie Brown All-Star IslandRed's Avatar
    Join Date
    May 2001
    Location
    Melbourne, FL
    Posts
    5,042

    Re: Better descriptive stats?

    Quote Originally Posted by RedsManRick View Post
    If we just want to show what happened in the past, why are we making it so convoluted? And if we want to measure "how good" the player was, shouldn't we be accounting for a lot more than that?
    Well, I think you basically explained it yourself even though it wasn't stated outright. The stats that were developed and took root back in the early days of baseball were the simplest possible numbers that told people basically what they wanted to know. Counting things and averages, slightly tweaked. And even a casual fan could tie those numbers back to what they observed in the games.

    Simplicity is key. That great-looking chart you posted? Casual Fan takes one look at that and eyes start glazing over. He'll infer that more green and less red is better, but beyond that... Anyway, remember that on the TV screen and the scoreboard, you're almost always showing only one person's stats at a time so the relative context will be lost.

    As for changing the expression of numbers to percentages -- honestly, you'd hit less resistance if you don't. The most Casual Fan of casual fans knows how to read ".283". "28%" is not really an improvement -- it's not more exact, it's not meaningfully shorter to read or to say, and it's not even more easily understood except by those who have never seen a baseball stat. It's change for the sake of change IMO.

    Besides, "he's a three-hundred hitter" rolls off the tongue. "He's a thirty-percent hitter" doesn't.

    Great thread, by the way.
    Reading comprehension is not just an ability, it's a choice

  18. #30
    Member mdccclxix's Avatar
    Join Date
    Sep 2009
    Location
    Crown
    Posts
    4,139

    Re: Better descriptive stats?

    You know, I wouldn't go out of the way to pander to the crowd, I'd just start putting the new slash up, mention it a few times here and there and let the people find out its value. Be the authority and force the crowd to catch up.

    Joey Votto 2-3, BB, HR Today
    26/20/54

    Almost like a horse racing program with all the codes you have to get up to snuff on. Indiscreet, but prominent, and definitely indispensable.

    The racing comparison is also strangely apt since I do feel these new numbers do sort of lend themselves to a pure odds perspective. I worry a little bit that the game would be reduced to slot machine analysis.
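
    On the earlier question of leaving OB% behind: the 26/20/54 slash still carries it, since hit% plus walk% is the share of plate appearances that did not end in an out. A minimal sketch, with made-up function names; note this is only an approximation of official OBP, which uses a slightly different denominator (AB + BB + HBP + SF).

```python
def parse_slash(slash):
    """Split a 'hit/walk/out' string such as '26/20/54' into three ints."""
    parts = [int(p) for p in slash.split("/")]
    if len(parts) != 3:
        raise ValueError("expected three fields: hit/walk/out")
    return tuple(parts)


def on_base_pct(slash):
    """Share of plate appearances that were not outs, per the slash line."""
    hit, walk, _out = parse_slash(slash)
    return hit + walk


print(on_base_pct("26/20/54"))  # 46
```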



RedsZone.com is a privately owned website and is not affiliated with the Cincinnati Reds or Major League Baseball

