Over the past few weeks, the latest battle between sabermetrics and old-school stats has centered on the American League Cy Young award. The old-school advocates have coalesced around C.C. Sabathia and his league-leading 19 wins, while the sabermetricians have argued in favor of Felix Hernandez and his overall statistical superiority. As it turns out, however, the debate has been all for naught because at least one advanced metric has come around to the old-school way of thinking.
After taking into account park factors based on 2010 data, baseball-reference.com’s new calculation of Wins Above Replacement (WAR) now confirms that C.C. Sabathia has been the league’s best pitcher. Before the adjustment, Sabathia ranked fifth in WAR, more than half a win behind Felix Hernandez, whose 11-11 record has made him a non-candidate on many mainstream ballots. Now that Sabathia has jumped to the head of the class, however, there is no need for the conflict.
B-R.com’s WAR Adjustment, Based on Revised Park Factors
AL Pitching – Old | | AL Pitching – New | |
Player | WAR | Player | WAR |
Hernandez (SEA) | 5.2 | Sabathia (NYY) | 5.4 |
Price (TBR) | 5.1 | Liriano (MIN) | 5.0 |
Weaver (LAA) | 5.0 | Hernandez (SEA) | 4.9 |
Liriano (MIN) | 4.7 | Pavano (MIN) | 4.9 |
Pavano (MIN) | 4.6 | Weaver (LAA) | 4.8 |
Sabathia (NYY) | 4.6 | Lester (BOS) | 4.6 |
Lester (BOS) | 4.5 | Price (TBR) | 4.6 |
Wilson (TEX) | 4.5 | Wilson (TEX) | 4.4 |
Guthrie (BAL) | 4.4 | Buchholz (BOS) | 4.4 |
Buchholz (BOS) | 4.4 | Guthrie (BAL) | 4.4 |
Danks (CHW) | 4.4 | | |
Source: Baseball-reference.com
But wait. According to FanGraphs, Sabathia still ranks eighth in WAR, two wins behind leader Francisco Liriano, not Felix Hernandez. So, what gives?
Fangraphs’ WAR
Player | WAR |
Francisco Liriano | 6.3 |
Cliff Lee | 6.3 |
Felix Hernandez | 5.9 |
Jon Lester | 5.5 |
Jered Weaver | 5.1 |
Justin Verlander | 5 |
Zack Greinke | 4.8 |
CC Sabathia | 4.3 |
Gavin Floyd | 4.3 |
John Danks | 4.1 |
Source: fangraphs.com
The “dirty little secret” about many advanced metrics is that they are based on subjective variables. That’s not really a secret to those with a strong understanding of how they are calculated, but it probably comes as a surprise to more casual sabermetricians who cite the statistics as gospel. In reality, however, components such as replacement level value, positional adjustments, park factors and defensive metrics (and the principles behind each) are just some of the variables that differ among the new sabermetric approaches to analysis. In the case of WAR, which seeks to define total value, all of these components are involved. Hence the variance between different sources, not to mention the changes that occur when better data is accumulated.
There are lots of reasons to like stats like WAR, but just as many reasons to be leery of them. Without an advanced degree in statistics, it may not be easy to come to grips with concepts like regression analysis and linear weights, but that doesn’t mean the byproducts should be dismissed out of hand. By the same token, however, one should not suffer from deference to complexity. That which is not understood isn’t always right. Ultimately, the value of a statistic comes not only from its accuracy, but also its ease of application. If sabermetric proponents were more aware of that, the old school holdouts might be a lot easier to convert.
It’s useful to point out that there must be differences in the ways that fangraphs and bbref compute WAR; it would have been considerably more useful to point out even one of the differences. The reader is left with no idea why (for example) fangraphs puts Cliff Lee at 6.3 while bbref has him at least 2 games worse.
And what is subjective about positional adjustments and park factors?
Absolutely…there are significant differences in how they each compute WAR. I should have stated that explicitly instead of only implying it. I thought that was clear, but I guess not. I didn’t get into the breakdown of what is different about the two because it is technical and beside the main point (i.e., sabermetrics are still subjective).
The subjective part about positional adjustments and park factors is how those variables are calculated. For example, you could use one-year, three-year or five-year park factors. As we see with the B-R data, that decision makes a big difference.
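To make the window choice concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the yearly park factors and the pitcher's ERA are made-up numbers, the helper names are hypothetical, and dividing ERA by the park factor is a simplification rather than the method B-R or FanGraphs actually uses.

```python
# Hypothetical single-season run park factors for one ballpark, oldest first
# (100 = neutral park). These values are invented for illustration only.
yearly_factors = [104, 101, 97, 103, 92]


def park_factor(factors, window):
    """Average the most recent `window` seasons of park factors."""
    recent = factors[-window:]
    return sum(recent) / len(recent)


def park_adjusted_era(era, factor):
    """Scale an ERA to a neutral park (simplified: divide by factor / 100)."""
    return era / (factor / 100)


raw_era = 3.20  # hypothetical pitcher ERA, not a real player's figure

for window in (1, 3, 5):
    pf = park_factor(yearly_factors, window)
    adjusted = park_adjusted_era(raw_era, pf)
    print(f"{window}-year park factor {pf:.1f} -> adjusted ERA {adjusted:.2f}")
```

The same raw ERA comes out looking different depending only on how many seasons feed the park factor, which is the kind of judgment call I mean.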
I think using 1-year-based park adjustments is the most iffy factor. Trades, players in contract seasons, etc. just add too many variables to a single season beyond a ballpark’s actual factors.
OTOH, there are nonsubjective factors that do change year to year, like weather.