Here is a Problem I have with WAR


waroriole

So, are you saying that a pitcher has no control over whether a batter hits a ball off the top of the wall or whether he hits a slow roller back to the mound? I understand there are other factors than the 3 mentioned, but those are the biggest components right? It's hard to put much stock in a system that gives me those answers.

Yeah I don't get this either. Why is the pitcher being penalized by Fangraphs because he gets the batter to put the ball in play and his defense makes an out?

Yeah I don't get this either. Why is the pitcher being penalized by Fangraphs because he gets the batter to put the ball in play and his defense makes an out?

I'm sure they have some element that they think corrects that, but it's clear (IMO) that it's not enough.


Or, it uses the things a pitcher can actually control to judge how well they are pitching. I mean, you can argue the value of the number but you sound like Joe Morgan's less-stat-friendly brother.

I think from my other posts here it's pretty clear that I am a stat-oriented guy. But that doesn't mean that every single statistic is good or valuable, and it doesn't mean that statistics can't be misused.

Looking only at Ks, BBs, and HRs is valuable because those stats tend to be excellent predictors of future pitching performance. But ultimately, a pitcher's value each year depends on how many runs he allows the other team to score, and we have a stat that measures that: ERA. BBref begins their calculation with the "pitcher's actual runs allowed". To use predictive stats to describe actual results is a bit like giving Kevin Gregg a blown save for today's game because he put three runners on base.

Now, Fangraphs makes the argument that pitchers have no control whatsoever over balls hit in play; they have some statistical support, and I believe that they are partially right. But it's not an argument that I accept fully, and Jeremy Guthrie is an excellent example of why. He has outperformed his FIP year in and year out; at this point, he is an 872 inning sample arguing that FIP is missing something.

Perhaps he's just gotten extremely lucky, but even if he has, I don't think he should be penalized for that when we try to measure his "value" relative to a replacement player.


Yeah I don't get this either. Why is the pitcher being penalized by Fangraphs because he gets the batter to put the ball in play and his defense makes an out?

The main reason that fangraphs is rewarding Tillman and penalizing Guthrie is homeruns. Yes BABIP is part of it but Guthrie has allowed 9 homers, Britton 5 and Tillman only 2. Right there is the main part of your fWAR differential.
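To put a rough number on the home run point, here is a minimal sketch of the commonly published FIP formula. The home run totals (9, 5, 2) come from the post above; everything else (the walk, hit-by-pitch, strikeout and innings figures, and the 3.10 league constant) is a made-up placeholder so that only the homers differ. These are not the pitchers' actual lines.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching on an ERA-like scale.

    The constant varies by league and season; 3.10 is only an
    illustrative placeholder, not the value for any specific year.
    """
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical, identical lines for all three pitchers (NOT real stats),
# so that the home run totals are the only thing that changes.
HYPOTHETICAL_LINE = dict(bb=20, hbp=2, k=45, ip=60.0)

for name, hr in [("Guthrie", 9), ("Britton", 5), ("Tillman", 2)]:
    print(f"{name}: FIP {fip(hr=hr, **HYPOTHETICAL_LINE):.2f}")
```

Because each homer carries a weight of 13 in the numerator, a gap of seven home runs over the same innings moves FIP by more than a run in this sketch, which is the kind of gap that would show up in fWAR.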


Or, it uses the things a pitcher can actually control to judge how well they are pitching. I mean, you can argue the value of the number but you sound like Joe Morgan's less-stat-friendly brother.

It also uses a lot more information, so it's probably going to come out closer to what we think we see.

The REAL problem with WAR is that there isn't a single way of figuring it. Both Baseball Reference and Fangraphs have different numbers going in, even if at the end they tend to come to a similar agreement, so sometimes you get wild swings like this.

BB-Ref uses RA, then makes adjustments for defense. Fangraphs uses FIP, with minimal adjustments for defense needed as a result.

Since most defensive metrics used by BB-ref involve some sort of weighted plus-minus system (e.g. easy plays that lots of fielders make don't give you much credit, difficult plays give you a lot more) this tells you something that most of us already know: that Guthrie's lower BABIP is because he induces weaker contact that gives his fielders the chance to make plays. Any system that uses FIP is going to think Guthrie is worthless because it discounts his apparent ability to reduce BABIP.


The main reason that fangraphs is rewarding Tillman and penalizing Guthrie is homeruns. Yes BABIP is part of it but Guthrie has allowed 9 homers, Britton 5 and Tillman only 2. Right there is the main part of your fWAR differential.

I thought Fangraphs uses xFIP, which normalizes homerun rate based on the pitcher's flyball rate.


Fangraphs' pitcher WAR is unreliable, in my opinion at least. Use B-R's, or better yet, use a combination of stats that are more based in reality.

Other problems with WAR, copied from my massive "On the usage of statistics on the OH" thread:

WAR measures value, not quality. The two are not interchangeable. A player is not better because he posts a higher WAR value. Apart from the general problems with fielding metrics, WAR is a very good estimator of how valuable a player was to a team. But that's all it is.

First, it's a counting stat. Counting stats are inherently inferior to rate stats in evaluating quality, because they can value playing more at a lower quality over playing less at a higher quality, when the player who played at the higher quality was by definition a better player. But unlike strictly batting or fielding stats, there is no way to properly convert WAR into a rate stat. Adjusting by PA conflicts with the fielding aspect, and in an extreme case could turn a player who played only as a defensive replacement and accumulated maybe .2 WAR while only coming to the plate once into a more valuable player than someone who accumulated 8 WAR in 500 PA (.2/1 > 8/500). And adjusting by innings played would have a similar effect on players who were solely pinch hitters. So it's a counting stat, and must remain a counting stat. This is okay for value, but not okay at all for quality.
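The arithmetic in that extreme case can be checked with a short sketch. The function name `war_per_pa` is just an illustrative helper, and the two players are the hypothetical ones from the paragraph above, not real seasons.

```python
def war_per_pa(war, plate_appearances):
    """Naive 'rate' version of WAR: wins above replacement per plate appearance."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return war / plate_appearances


# Hypothetical defensive replacement: 0.2 WAR, one plate appearance.
defensive_sub = war_per_pa(0.2, 1)

# Hypothetical everyday star: 8 WAR over 500 plate appearances.
everyday_star = war_per_pa(8.0, 500)

print(f"defensive sub: {defensive_sub:.3f} WAR/PA")
print(f"everyday star: {everyday_star:.3f} WAR/PA")
print("sub ranks higher:", defensive_sub > everyday_star)
```

Dividing by PA puts the defensive replacement far ahead, because almost none of his value came from those plate appearances in the first place.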

Second, WAR includes a positional adjustment. In Fangraphs' explanation of the positional adjustment they use, they admit to making three assumptions that are okay to make if the goal is to measure value, but for measuring quality they cannot be made. The assumptions are:

1. Major league teams are being perfectly efficient with who they put, and where.

2. Left-handed players and right-handed players can each play every position.

3. Offensive ability is not independent of the position being played.

It has been demonstrated time and time again that the same offensive production from a left fielder is less valuable than from a shortstop. But does that necessarily make the shortstop a better player? No, it doesn't, because you're assuming that the left fielder is not a good enough player to play shortstop. Positional adjustments mean that WAR is inherently dependent on the way a team uses its players, which is when it entirely leaves the realm of measuring quality.

So all WAR is good for is measuring a player's value to his team. Which means it's great for stuff like deciding who should win the MVP award, but is that really what we want out of a stat?

Let me use an example here. The second-best (after Cliff Lee) starting pitcher in the AL in 2008 in my opinion was Justin Duchscherer. His ERA+ was 163 and his WHIP was 0.995. But his WAR (using the B-R formula) was 3.9. John Danks also had a great year, with an ERA+ of 138 and a WHIP of 1.226. Great numbers, but I don't think anyone would argue he was a better pitcher than Duchscherer based solely on quality stats. Yet Danks' WAR was 6.4, simply because he pitched more.

Yes, Danks was more valuable, but value is often inhibited by circumstances beyond the player's control, be it injury, a stubborn manager, or teams gaming the service clock. Evaluating quality is better because it removes context, thereby reaching a more pure outcome that tells us something definitive and unqualified.

The more I think about it, the less I like that positional adjustment. I mean, I get why it's there, but in my opinion it is the sort of thing that needs to be handled on a case-by-case basis rather than be bundled into a mega-stat.


BB-Ref uses RA, then makes adjustments for defense. Fangraphs uses FIP, with minimal adjustments for defense needed as a result.

Since most defensive metrics used by BB-ref involve some sort of weighted plus-minus system (e.g. easy plays that lots of fielders make don't give you much credit, difficult plays give you a lot more) this tells you something that most of us already know: that Guthrie's lower BABIP is because he induces weaker contact that gives his fielders the chance to make plays. Any system that uses FIP is going to think Guthrie is worthless because it discounts his apparent ability to reduce BABIP.

Nicely put! I hadn't thought of the quality-of-contact implications of the defensive metrics when a particular pitcher is on the field. I'd love to see a real baseball stat writer tackle that one, with a question like: Do the Orioles play better defense behind Jeremy Guthrie than behind their other pitchers? And is it just luck?


I think from my other posts here it's pretty clear that I am a stat-oriented guy. But that doesn't mean that every single statistic is good or valuable, and it doesn't mean that statistics can't be misused.

Looking only at Ks, BBs, and HRs is valuable because those stats tend to be excellent predictors of future pitching performance. But ultimately, a pitcher's value each year depends on how many runs he allows the other team to score, and we have a stat that measures that: ERA. BBref begins their calculation with the "pitcher's actual runs allowed". To use predictive stats to describe actual results is a bit like giving Kevin Gregg a blown save for today's game because he put three runners on base.

Now, Fangraphs makes the argument that pitchers have no control whatsoever over balls hit in play; they have some statistical support, and I believe that they are partially right. But it's not an argument that I accept fully, and Jeremy Guthrie is an excellent example of why. He has outperformed his FIP year in and year out; at this point, he is an 872 inning sample arguing that FIP is missing something.

Perhaps he's just gotten extremely lucky, but even if he has, I don't think he should be penalized for that when we try to measure his "value" relative to a replacement player.

This is not true. The argument is that pitchers have plenty of control over whether their balls in play tend to get hit in the air, on the ground or on a line, probabilistically speaking.

And, each one of those types of batted balls tends to fall for a hit at a mean percentage. Some pitchers can deviate a bit from the mean for an extended time, but anything beyond two standard deviations can be considered an outlier.

K rates, BB rates, and (to a lesser extent) HR rates tend to remain more static from season to season than do BABIP and hit rates. Because of this, they have more predictive value, and we can infer that pitchers tend to control them more.

I'm not arguing that Fangraphs' assessment of pitcher value is appropriate, just clarifying a point.
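One way to make the "two standard deviations" idea concrete is sketched below. It rests on assumptions that are not in the post: each ball in play is treated as an independent coin flip at a league BABIP of .300 (a rough placeholder), and the hit and ball-in-play totals are invented for illustration. It measures distance from the league mean under pure sampling luck, which is only one reading of the outlier test described above.

```python
import math


def babip_z_score(hits_in_play, balls_in_play, league_babip=0.300):
    """Standard deviations between a pitcher's BABIP and the league mean.

    Assumes a simple binomial model: every ball in play falls for a hit
    with probability league_babip. The .300 default is a placeholder.
    """
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    observed = hits_in_play / balls_in_play
    std_error = math.sqrt(league_babip * (1 - league_babip) / balls_in_play)
    return (observed - league_babip) / std_error


# Invented career totals (NOT any real pitcher's numbers).
z = babip_z_score(hits_in_play=700, balls_in_play=2600)
print(f"z-score: {z:.2f}")
print("more than two SDs below the mean:", z < -2)
```

A pitcher who stays well past two standard deviations over a large sample is either a genuine outlier or evidence that the simple model is missing something, which is the question being argued in this thread.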


Ok, it's absolutely idiotic that someone can try to quantify a player's worth by putting stats into a formula. Baseball is a TEAM game, not just a collection of individuals. There are skills players bring that are not calculated in WAR. To objectively assess a player, you must watch him play, consider his skills and opponents, then look at a good sample of his statistics and evaluate how he fits into your team's needs. This cannot be done by glancing at the Fangraphs website or whatever.


Never once have I bought any WAR stat. I know what I see, and I know what the important stats are. Pitching-wise, the only stat that matters is: did you give me a chance to win today? Guts allowed 5 runs but was able to keep the team in it and give them a chance to win. That's all that matters. Sure, it doesn't help the old ERA, but my takeaway from the game is that he pitched 7 innings, left with a tied game, and went deep enough to not expose the pen. That's basically all you can ask out of any starter.





