I've tried (and largely failed) to articulate this position before, so I'm very pleased to see FanGraphs themselves do it much better than I could.
And here's the ESPN article showing how PBP defensive metrics work in the context of Curtis Granderson: http://espn.go.com/blog/new-york/yan...sily-evaluated
That’s right — in fact, it’s so right I’ll write again, embolden’d even: One year of UZR data is enough.
The proper question, however, is, “Enough for what?”
But definitely read the whole article.
A single season of UZR is a point along the spectrum of true talent, but the error range for that single point is so large as to mean almost nothing. So yes, one season of UZR is not enough to determine a player’s true talent level.
Bronzing a player’s glove after a single season would be as foolish as enshrining a hitter after just 400 plate appearances.
HOWEVER! That is only when we are talking about true talent levels. Many people who use UZR suppose incorrectly that since a single season does not accurately report true talent levels we can effectively ignore that season until there’s a wealth of data.
But just because a single season does not tell us a player’s true talent, it does not mean that year’s UZR tells us nothing. To the contrary, the data shows us the story of the season, it describes the season precisely.
Really, it's pretty intuitive. If a guy hits .400 for 200 PA, we don't want to call him a ".400 hitter". But he really did get hits in 40% of his at bats. That's not to say UZR is perfect, obviously. But we should be considering our question before dismissing it due to small sample size. Small samples are a problem if you want to use them to estimate the whole population -- but as measures of that specific thing, they can be perfectly fine.
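To put rough numbers on the .400 example, here's a minimal sketch. It assumes 80 hits in 200 at-bats and treats every at-bat as an independent trial with the same chance of a hit, which is a simplification of mine, not something from the article. The function name is made up for illustration. The observed average is exact as a description of what happened; the range is only a normal-approximation guess at the underlying rate.

```python
import math


def describe_and_estimate(hits, at_bats, z=1.96):
    """Return the observed average plus a rough range for the underlying rate.

    Assumes independent at-bats with a constant hit probability
    (a simple binomial model, used here only for illustration).
    """
    # What actually happened: a plain count, no estimation involved.
    observed = hits / at_bats
    # Standard error of a binomial proportion (normal approximation).
    std_err = math.sqrt(observed * (1 - observed) / at_bats)
    # Approximate 95% range for the underlying hit rate when z = 1.96.
    return observed, observed - z * std_err, observed + z * std_err


observed, low, high = describe_and_estimate(hits=80, at_bats=200)
print(f"observed average: {observed:.3f}")
print(f"rough range for the underlying rate: {low:.3f} to {high:.3f}")
```

The first number answers "what did he do?" and the range answers "what is he?" The sample size only hurts the second question.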
To be fair, the author probably overstates the case slightly. UZR is still an estimation, a translation of what happened, not a direct measurement -- unlike, say, batting average, which just counts events and does not place differential estimated value on them.
Lastly, I would just say, be thoughtful about your own perceptions. Human memory is a funny thing. What we choose (subconsciously) to flag as memorable is based in large degree on our existing beliefs and understandings. And when we do reflect, we choose a narrative first and then create the images. In short, our memories aren't good at systematically recalling and summarizing series of events over time. So when we try to judge a guy's defense without data that we immerse ourselves in and that shapes the narrative (like we do with OPS), we're likely defaulting back to some big-picture summary of ability rather than actual performance over the last year. That's useful for getting a reasonable estimate of true talent, not so useful for a given year. If the data conflicts with our perception, we should consider that it's our perceptions that are unreliable.