
Here is a Problem I have with WAR


waroriole


The funny thing is that baseball traditionalists love stats. They just only love the ones they grew up with. So when you say "WAR doesn't belong in baseball", be aware that the reason I can't take you seriously is that stats are integral to baseball's history.

And if you want to critique a stat, it has to be a stat-based attack, which means you have to take the time to figure things out first. The eyeball test and the gut feeling are just not good enough. "Wah I think Britton is better than pitcher X but xFIP says he's worse"? Here are some facts for you:

1. Britton has not been pitching at an elite level.

2. Britton will not have a 2.35 ERA at the end of the season if his peripherals don't improve.

3. Britton will not strand 83.3% of his runners for the whole season.

4. Britton will not have a .241 BABIP for the whole season.

If a pitcher throws meatballs down the plate for 9 innings and Markakis saves a home run at the fence on every at-bat, the pitcher has a perfect 0.00 ERA and a high xFIP. Why? Because the pitcher performed poorly and got bailed out (by the defense, in this case). This is what xFIP (and FIP, and SIERA, etc. etc.) are for. Britton has gotten lucky so far this season and will regress if his game doesn't take a step forward. A lot of smart people have worked hard to come up with the best ways to date of calculating context-neutral performance so that teams can evaluate players better. And it works. So trying to belittle those advances like I've seen in this thread really just makes you look ignorant.
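For anyone who wants to check the BABIP and strand-rate numbers themselves, here is a rough sketch of how those two stats are commonly computed. The formulas are the usual public definitions as I understand them, and the stat line is hypothetical, made up purely for illustration. It is not Britton's actual line.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)


def lob_pct(hits, walks, hbp, runs, home_runs):
    """Strand rate as commonly defined:
    (H + BB + HBP - R) / (H + BB + HBP - 1.4 * HR)."""
    baserunners = hits + walks + hbp
    return (baserunners - runs) / (baserunners - 1.4 * home_runs)


# Hypothetical pitching line (an assumption for illustration only).
hypothetical_babip = babip(hits=40, home_runs=3, at_bats=180, strikeouts=27, sac_flies=0)
hypothetical_lob = lob_pct(hits=40, walks=18, hbp=1, runs=14, home_runs=3)

print(f"BABIP: {hypothetical_babip:.3f}")
print(f"LOB%:  {hypothetical_lob:.1%}")
```

Neither number involves strikeouts, walks, or home runs as something the pitcher "earned", which is why an unusually low BABIP or high strand rate is treated as a sign that the ERA may not hold up.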

I don't disagree with anything you say, because I have no idea how WAR works, nor do I really care. My personal opinion is that it would take the fun out of being a fan.

Either you're being hyperbolic, or you're the strangest baseball fan I've ever heard of. You're telling me that if you'd known what WAR was all about last week Adam Jones' walkoff homer would have been bland and meaningless?

I prefer to watch a player and draw my own opinion.

I literally have no idea why you couldn't do that with WAR as but one input to your decision making process.

Having said that, let me ask you this. Of course these metrics would be more useful if you were not seeing a player on a regular basis. I'm assuming you watch most if not all Oriole games. How often does WAR differ from the eye test on players you are seeing on a regular basis? How often does it occur that you see a player regularly, have an opinion, and then look at his WAR and it shows something completely different?

I don't think I can separate my observations from a player's record. How would you judge Jones or Markakis or Wieters if you didn't know their batting average or number of homers or RBI? How do you judge Felix Pie's fielding based solely on your own observations, not including fielding percentage or what the guy in the Sun says or what your buddy tells you or the comments on the Hangout?


1. Britton has not been pitching at an elite level.

2. Britton will not have a 2.35 ERA at the end of the season if his peripherals don't improve.

3. Britton will not strand 83.3% of his runners for the whole season.

4. Britton will not have a .241 BABIP for the whole season.

I agree with you in general, but I have an issue with point #1. Britton HAS been pitching at an elite level. He's needed some help from his defense to do it and is therefore unlikely to continue to pitch as well. But the fact that his performance involved luck doesn't make it a worse performance. It's still elite.

Again, I'll make an analogy to a hitter. If a hitter hits 10 straight ground balls, and they all make it through holes, he's hitting 1.000 and has a 2.000 OPS. He has certainly been lucky, is nearly certain to not hit as well in his next 10 at-bats, and is certain not to hit as well over the course of a month. But that doesn't mean his hitting in his last couple of games hasn't been "elite" or "spectacular" or whatever other positive attribute you care to use.

Though it has no predictive power and will disappear in the future, luck is part of the game and should be included in any measurement of past performance.
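To make the arithmetic in the hitter analogy concrete, here is a small sketch that computes the slash line for ten singles in ten at-bats. The function name and parameters are my own, chosen for illustration.

```python
def slash_line(singles, doubles, triples, home_runs, at_bats,
               walks=0, hbp=0, sac_flies=0):
    """Return (AVG, OBP, SLG, OPS) from basic counting stats."""
    hits = singles + doubles + triples + home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    avg = hits / at_bats
    obp = (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)
    slg = total_bases / at_bats
    return avg, obp, slg, obp + slg


# Ten ground balls that all found holes: ten singles in ten at-bats.
avg, obp, slg, ops = slash_line(singles=10, doubles=0, triples=0,
                                home_runs=0, at_bats=10)
print(f"AVG {avg:.3f}  OBP {obp:.3f}  SLG {slg:.3f}  OPS {ops:.3f}")
```

The 2.000 OPS is just a 1.000 OBP plus a 1.000 SLG, since every hit was a single.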


Either you're being hyperbolic, or you're the strangest baseball fan I've ever heard of. You're telling me that if you'd known what WAR was all about last week Adam Jones' walkoff homer would have been bland and meaningless?

I literally have no idea why you couldn't do that with WAR as but one input to your decision making process.

I don't think I can separate my observations from a player's record. How would you judge Jones or Markakis or Wieters if you didn't know their batting average or number of homers or RBI? How do you judge Felix Pie's fielding based solely on your own observations, not including fielding percentage or what the guy in the Sun says or what your buddy tells you or the comments on the Hangout?

I'm 48 years old, and before I ever heard of fielding percentage and WAR, I knew that Paul Blair was a great fielder. I think you fail to realize, or just don't want to accept, the fact that this game was played LONG before any of these metrics were discovered. And yet somehow the baseball people were able to figure out who the better players were. Are they a tool now? Sure. Are they NEEDED? Of course they aren't. Are they perfect? Of course they aren't. They are as flawed as an experienced scout or baseball executive. I can take you to my son's Little League game tomorrow morning and you could watch one game and probably name the 5-6 best kids on the field. I may be a strange baseball fan in your eyes, but trust me, I'm closer to the norm than you are. If you cannot judge Wieters, Jones or Markakis without looking at stats, well, I find that very odd and slightly sad.


Let's put this in context, ok? The argument against fWAR is similar to the argument against Democracy - which is the worst form of government except for all the others.

Really, all joking aside, fWAR makes some assumptions that have a lot of grounding in reality and rates a player. And does that, arguably (and I believe definitely) better than the traditional numbers used to rate pitchers. If you're going to eviscerate a stat for not being perfect, or taking a particular point of view and quantifying it, please be fair and rip into the other metrics, too.

I mean, we don't have long, involved threads where the entire community tears apart ERA. ERA is fantastically absurd on so many levels. It tells me that the very same pitcher isn't as valuable if his starting shortstop took the day off and a lesser fielder took his place. It tells me that a pitcher is totally absolved of all runs that score after a questionable scoring decision awards an error to one of his fielders with two outs. It tells me that a pitcher who pitches in PETCO with a 4.00 ERA is better than a pitcher who pitches to a 4.05 in Coors.

C'mon. WAR is a solid metric whose results are well matched with qualitative observations a big majority of the time, with an underlying set of assumptions that may or may not be what you're looking for. Most of the stuff commonly used today is a set of halfway decent metrics with underlying assumptions that range from ridiculous to decent.

This thread reminds me a bit of the Markakis or Roberts criticisms. They have flaws and sometimes produce results that aren't what we expect, so let's talk mostly about how disappointing and overrated they are.

I agree that it is a useful stat, and I don't mean to completely beat it up. But a lot of people cite it because of its perceived accuracy. I don't think it's a bad thing to discuss things that a stat might miss, especially when it's a stat that is cited frequently.


If a pitcher throws meatballs down the plate for 9 innings and Markakis saves a home run at the fence on every at-bat, the pitcher has a perfect 0.00 ERA and a high xFIP. Why? Because the pitcher performed poorly and got bailed out (by the defense, in this case). This is what xFIP (and FIP, and SIERA, etc. etc.) are for. Britton has gotten lucky so far this season and will regress if his game doesn't take a step forward. A lot of smart people have worked hard to come up with the best ways to date of calculating context-neutral performance so that teams can evaluate players better. And it works. So trying to belittle those advances like I've seen in this thread really just makes you look ignorant.

What if Pitcher A gets 20 outs by inducing weak dribblers and pop-ups, and Pitcher B gets 20 outs on line drives and long fly outs? If K, BB and HR allowed are equal, will xFIP assume that A and B are equal?

I believe this is the difference between FIP and xFIP. xFIP includes a normalized HR/FB ratio, so the pitcher who was giving up more fly balls would have a higher xFIP. (xFIP plugs in the league average of 10% as the pitcher's HR/FB.) xFIP doesn't account for line drives, however, or a more general definition of "solid contact".
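Here is a rough sketch of the difference, using the commonly published FIP and xFIP formulas as I understand them. The 10% league HR/FB rate is the figure quoted above, the 3.20 constant is an assumption (the real constant changes by season), and both pitchers' lines are hypothetical.

```python
FIP_CONSTANT = 3.20      # assumption: the real constant varies by season
LEAGUE_HR_PER_FB = 0.10  # the 10% league rate quoted above


def fip(hr, bb, hbp, k, ip, constant=FIP_CONSTANT):
    """FIP: only home runs, walks, hit batters and strikeouts count."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, bb, hbp, k, ip,
         lg_hr_per_fb=LEAGUE_HR_PER_FB, constant=FIP_CONSTANT):
    """xFIP: actual home runs are replaced by fly balls * league HR/FB."""
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical lines: identical K, BB and HR, different fly-ball totals.
shared = dict(bb=8, hbp=0, k=20, ip=30.0)
fip_a = fip(hr=2, **shared)
fip_b = fip(hr=2, **shared)
xfip_a = xfip(fly_balls=15, **shared)  # Pitcher A: mostly grounders
xfip_b = xfip(fly_balls=40, **shared)  # Pitcher B: lots of fly balls

print(f"FIP   A {fip_a:.2f}  B {fip_b:.2f}")
print(f"xFIP  A {xfip_a:.2f}  B {xfip_b:.2f}")
```

FIP comes out identical for both, while xFIP separates them only through the fly-ball count. Line drives and how hard the ball was hit never enter either formula.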

I'm 48 years old, and before I ever heard of fielding percentage and WAR, I knew that Paul Blair was a great fielder. I think you fail to realize, or just don't want to accept, the fact that this game was played LONG before any of these metrics were discovered. And yet somehow the baseball people were able to figure out who the better players were. Are they a tool now? Sure. Are they NEEDED? Of course they aren't. Are they perfect? Of course they aren't. They are as flawed as an experienced scout or baseball executive. I can take you to my son's Little League game tomorrow morning and you could watch one game and probably name the 5-6 best kids on the field. I may be a strange baseball fan in your eyes, but trust me, I'm closer to the norm than you are. If you cannot judge Wieters, Jones or Markakis without looking at stats, well, I find that very odd and slightly sad.

Wow. I would suggest simply staying away from these stat discussions, because I just lost an awful lot of respect for you. Not for your beliefs, but for your tone.


I agree with you in general, but I have an issue with point #1. Britton HAS been pitching at an elite level. He's needed some help from his defense to do it and is therefore unlikely to continue to pitch as well. But the fact that his performance involved luck doesn't make it a worse performance. It's still elite.

Again, I'll make an analogy to a hitter. If a hitter hits 10 straight ground balls, and they all make it through holes, he's hitting 1.000 and has a 2.000 OPS. He has certainly been lucky, is nearly certain to not hit as well in his next 10 at-bats, and is certain not to hit as well over the course of a month. But that doesn't mean his hitting in his last couple of games hasn't been "elite" or "spectacular" or whatever other positive attribute you care to use.

Though it has no predictive power and will disappear in the future, luck is part of the game and should be included in any measurement of past performance.

Actually, it DOES mean that. And that is one of the times where "eyes" come in: observation lets you know that the batter, in that case, was extremely lucky and unlikely to continue at that pace. And if you can't watch the games, you'll be able to see the same thing by looking at certain stats.

That's why the ideal thing is to use both. Eyes are not perfect, and stats are like lenses. Some are corrective, to help you see better; others may be polarized to strain out light rays bad for you; others may be colored, rose or orange or very dark tints, and prevent you from seeing certain things. They all have their uses.

If you get good results because you were lucky, they were still good results. I can't say it any plainer.

As I've said, I have some (a lot, even) faith in the predictive power of the stats that go into WAR. But at the end of the day, WAR claims to measure performance, and IMO it doesn't (at least Fangraphs' version of WAR for pitchers), because it tries to remove luck. If you don't think lucky results should be part of measured performance, we'll just have to agree to disagree.

Whether or not to normalize for HR rate is a legitimate debate. Luckily, it's an irrelevant one, as Fangraphs uses FIP instead of xFIP specifically because "WAR is designed to describe what actually happened, while xFIP involves regression and is more of a predictive stat".

However, WAR does and should adjust for the quality of the defense (using FIP instead of ERA). You have to, or you double-count defense. You want to figure out what the pitcher himself contributed. Not every out is created equal. A pitcher who gets an out on a routine grounder to second base deserves more credit for that out than if it's a laser home run that's caught at the wall, and a strikeout deserves the most credit of all. This is what FIP tries to do - and it's one of the best approximations of that concept that we have.

So basically, I think you just misunderstand how Fangraphs designed WAR. It does what you think it ought.

Humorously enough, FIP assigns an identical value to your two examples. The pitcher gets exactly zero credit for either out.

I'm not very impressed with the various defensive metrics, either; I don't think we understand the relationship between pitchers and defense. FIP treats all balls hit into play as out of the pitcher's control and therefore worthless when it comes to evaluating a pitcher's performance. I'm not sure they are right about that, but even if they are and a pitcher has simply gotten lucky (say, because all of the hard-hit balls off him went straight to fielders) - he did his job, got outs and prevented runs.

To give the extreme example, a pitcher who gets nothing but groundouts (no BB, no HR, no K) has a FIP of 3.20 and an ERA of 0.00. Now, the fact that all of those grounders turned into outs may have involved luck, but I think that the pitcher still generated "value" by getting them.
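A quick sketch of that extreme case, assuming the commonly published FIP formula and the 3.20 constant used above (the real constant varies by season). The nine-inning groundout-only line is hypothetical.

```python
FIP_CONSTANT = 3.20  # assumption: matches the figure used in this post


def fip(hr, bb, hbp, k, ip, constant=FIP_CONSTANT):
    """FIP uses only HR, BB, HBP and K; balls in play are not inputs."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def era(earned_runs, ip):
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / ip


# Hypothetical outing: 27 groundouts, no walks, no homers, no strikeouts.
groundout_fip = fip(hr=0, bb=0, hbp=0, k=0, ip=9.0)
groundout_era = era(earned_runs=0, ip=9.0)

print(f"FIP {groundout_fip:.2f}  ERA {groundout_era:.2f}")
```

Because outs on balls in play never appear in the formula, swapping any of those groundouts for a drive caught at the wall leaves the FIP unchanged.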

BB-ref also attempts to adjust for quality of defense in calculating WAR; they use a composite team defensive rating and adjust pitchers with weak defenses up (and vice versa). Jeremy Guthrie's defensive adjustment is currently bringing his rWAR up by 0.18 wins since the O's defense has (according to BB-ref) been 2 runs below average. That makes a lot more sense to me.

I mean, the question of whether Fangraphs or BB-ref made the better choice of defense-independent pitching statistics is legitimate, but it's a lesser of two evils. In my opinion, BB-ref's has way more serious problems. But what there is no question about is this: you must use a pitching metric that factors out the contributions of the defense. If you want to argue that BB-ref's is better, you're more than welcome to, but you have an uphill battle ahead of you. And that argument is a sideshow to your original point, which was that WAR is flawed because it is predictive rather than descriptive. That is incorrect, and the decision to use FIP for the defense-independent pitching metric was a conscious design choice to keep WAR as descriptive as possible.

I'm 48 years old, and before I ever heard of fielding percentage and WAR, I knew that Paul Blair was a great fielder. I think you fail to realize, or just don't want to accept, the fact that this game was played LONG before any of these metrics were discovered. And yet somehow the baseball people were able to figure out who the better players were. Are they a tool now? Sure. Are they NEEDED? Of course they aren't. Are they perfect? Of course they aren't. They are as flawed as an experienced scout or baseball executive. I can take you to my son's Little League game tomorrow morning and you could watch one game and probably name the 5-6 best kids on the field. I may be a strange baseball fan in your eyes, but trust me, I'm closer to the norm than you are. If you cannot judge Wieters, Jones or Markakis without looking at stats, well, I find that very odd and slightly sad.

I'm not going to spend a ton of time here trying to convince you of something you clearly made up your mind about many years ago, while stubbornly refusing to accept any counter-arguments.

But a few bullets:

- How did you know how good Paul Blair was relative to all of the other players in MLB without even knowing the very basics of how they keep track of a player's performance? Probably because he made some nice plays (as do most MLB players) and everyone told you so. How did they know? I'm sure they had inputs from a ton of sources, many of whom might have actually done some homework and used some metrics.

- Yes, it's possible to sort out good players from bad solely by subjective observation, but you're not going to be very good at it. Look at the historical MVP voting. Some of the winners weren't among the top 100 players in the league. And that was actually using basic stats; throw them out and I guarantee you would mistake any number of mediocre players for great ones. HOF voting before the Baseball Encyclopedia was often a joke, with players inducted based on legends, myths, and 12th-hand stories.

- Of course metrics are needed. To say otherwise is being obstinate to prove a point. People were inventing metrics in the 1850s and 1860s to keep track of ballplayers. They weren't doing it on a lark.

- Of course you can tell me the 5-6 best players on a little league team. The spread of talent on most teams like that is 1000x greater than that on a MLB team. You have kids in little league who field .085, get three hits in their career, and aren't sure which base is which. And they play alongside guys who'll end up hitting .500 in high school. In MLB almost everyone in the league was the very best player on their team from age 8 to age 19. Tell those guys apart on a meaningful level over 162 games without metrics? Ha! Absurd!

- I find it incredibly naive and arrogant to assume you can tell me with any accuracy whether Markakis or Wieters contributes more wins to the Baltimore Orioles without any metrics.



