Super Stat Sticky: Get Your Learn On!


Mashed Potatoes

Because on a wide variety of scales and contexts runs created is very close to runs actually scored. On a game level, team level, season level... whatever, you figure out runs created from other events and components and it's going to be close to actual runs scored.

However, when you start tweaking a formula (RP, RPA) based upon team dependent values (runs, RBIs) attempting to develop a team independent assessment of the values of individual players, how do you know if your tweaks really reflect individual player values or are just completely arbitrary?


But every other run is counted twice -- once as a run scored and once as a run driven in. Why should the run scored by the batter who hits the HR be devalued by 50 percent compared to all other runs scored? If anything, it should be valued more than other runs because it gets scored independently of what any other hitter in the lineup does.

I'm not following.

For an individual batter, when you add RS to RBI, the only thing counted twice is a home run, which goes under both.

When you subtract the number of home runs, you aren't taking all the runs from those home runs out, just one run from each, which was the batter driving himself in.
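A minimal sketch of the arithmetic being described, assuming RP is simply runs scored plus RBIs minus home runs. The function name and the stat lines are made up for illustration.

```python
def runs_produced(runs_scored: int, rbi: int, home_runs: int) -> int:
    """RP = R + RBI - HR (hypothetical helper, per the description above)."""
    return runs_scored + rbi - home_runs

# A solo home run shows up as 1 R, 1 RBI, and 1 HR for the batter.
solo_hr = runs_produced(runs_scored=1, rbi=1, home_runs=1)

# A two-run home run: the batter gets 1 R, 2 RBI, 1 HR.
two_run_hr = runs_produced(runs_scored=1, rbi=2, home_runs=1)

# Hypothetical full-season line.
season = runs_produced(runs_scored=100, rbi=110, home_runs=30)

print(solo_hr, two_run_hr, season)
```

Only the batter's own run is removed for each home run; the runners he drives in still count toward his RBI total.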


I'm not following.

For an individual batter, when you add RS to RBI, the only thing counted twice is a home run, which goes under both.

But the run scored and the RBI also get counted for another player, so they are getting counted twice, just against different players.

There is no other player involved in the run scored by the guy who hit the home run, so that run gets undervalued by the RP formula. To be a mathematically consistent tally of actual runs, the sum of runs scored and RBIs should be divided by two.

The "counted twice" argument for subtracting home runs is a mathematical fallacy. Tango's argument for subtracting home runs isn't to avoid counting the run twice; it's to make the output of the RP formula correlate more closely with the RC formula.

When you subtract the number of home runs, you aren't taking all the runs from those home runs out, just one run from each, which was the batter driving himself in.

Again, the argument is based upon a mathematical fallacy.

If you divide the sum of runs scored and RBIs by two for every player on a team and add up those values, you should get the actual runs scored by the team, less half of the runs for which no RBI is awarded (for example, runs scored on a DP or WP), since those runs are credited only once.
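Here is a rough sketch of that tally, using a made-up list of scoring events rather than real box-score data. The player labels and the event format are assumptions for illustration only. A run with no RBI contributes just the scorer's half under this accounting.

```python
from collections import defaultdict

# Hypothetical scoring events for one team:
# (player who scored, player credited with the RBI, or None if no RBI)
events = [
    ("A", "A"),   # solo home run: A scores and drives himself in
    ("B", "C"),   # C singles B home
    ("C", "D"),   # D doubles C home
    ("D", None),  # D scores on a wild pitch, no RBI awarded
]

runs = defaultdict(int)
rbis = defaultdict(int)
for scorer, driver in events:
    runs[scorer] += 1
    if driver is not None:
        rbis[driver] += 1

# Half credit for each run scored and each RBI.
players = set(runs) | set(rbis)
half_credit = {p: (runs[p] + rbis[p]) / 2 for p in players}

team_runs = len(events)
no_rbi_runs = sum(1 for _, driver in events if driver is None)
total_credit = sum(half_credit.values())

print(team_runs, no_rbi_runs, total_credit)
```

In this toy example the solo home run earns its hitter a full run of credit, the same as any other run split between a scorer and a driver.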

Once you accept that there's no mathematical justification for subtracting home runs, then the test becomes whether the results provide a valid measure of the run scoring value of a hitter. Tango's test is that the results are close to those obtained by using RC. Tango also proposes additional adjustments beyond subtracting out home runs to make the results correlate even more closely with RC.


  • 3 months later...

Beyond the Box Score has a post up introducing what they call JustVORP (presumably named after the guy who designed it, whose name is Justin), which aims to correct some of the problems with VORP.

It's pretty interesting - I didn't realize the strange way that replacement level is calculated in regular VORP. The level is set by taking a percentage of the average offensive production at each position, which is an unfortunate choice because DHs on average hit worse than 1B-men, making it seem as if DH is a more difficult defensive position.
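To make the DH quirk concrete, here is a small sketch with invented numbers. The 80 percent figure and the positional averages are assumptions, not the actual VORP constants; the point is only that a lower positional average produces a lower replacement bar.

```python
# All numbers are hypothetical, chosen only to illustrate the effect.
REPLACEMENT_PCT = 0.80  # assumed fraction of the positional average

# Assumed average offensive runs per season at each position.
positional_avg_runs = {"1B": 90.0, "DH": 85.0}

def value_over_replacement(player_runs: float, position: str) -> float:
    """Runs above a replacement level set as a share of the positional average."""
    replacement = REPLACEMENT_PCT * positional_avg_runs[position]
    return player_runs - replacement

same_bat = 100.0  # identical offensive production at either position
as_1b = value_over_replacement(same_bat, "1B")
as_dh = value_over_replacement(same_bat, "DH")

print(round(as_1b, 1), round(as_dh, 1))
```

The same bat rates higher as a DH than as a first baseman, which is the backwards result described above.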

The JustVORP stat uses the defensive spectrum of DH-1B-....-SS-CA, and the associated adjustments. Also, it claims to correct for both park and league differences.

I'm not sure what correcting for league differences means. Judging by the results, it sounds like they established a separate replacement level for each league, which is lower in the NL. I say this because the top of the rankings is full of NL players. There are only three AL players in the overall top 10, and that's if you count Manny Ramirez (not sure how they dealt with him).

Also, the list shines pretty favorably on a couple of Orioles - here is the top of the AL rankings (not including Ramirez/Teixeira, who scored 68/53 respectively):

1. Sizemore - 66

2. A-Rod - 62

3. Mauer - 54

4. Pedroia - 53

5. Roberts - 52

5. Markakis - 52


... I'm not sure what correcting for league differences means. Judging by the results, it sounds like they established a separate replacement level for each league, which is lower in the NL. I say this because the top of the rankings are full of NL players. There are only three AL players in the overall top 10, and that's if you count Manny Ramirez (not sure how they dealt with him).

I think it means that they're adjusting the production of NL players down by some factor which relates to the perceived average level of ability in the NL versus the AL. In other words, those NL players who dominate the top 10 would rate even higher if not for the league factor adjustment. However, if you look at the players in the bottom half of the overall group, you will probably find a significantly higher number of NL players in that bottom half than AL players.

Replacement level players are cheap, so the skill level of replacement players ought to be similar in either league. The only way it would differ significantly is if one league's farm systems were substantially superior to the other's.

However, in reading more carefully, I do see that they've defined replacement level differently for each league, 73 percent vs 78 percent. I think that means that they've normalized replacement level to correspond to the average performance level of each league, but I'm not completely sure. In other words, "replacement level" is approximately equal for either league, but that value of performance corresponds to 73 percent of average offensive production in the stronger American League and 78 percent of average offensive production in the weaker National League.
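A quick sketch of that reading. The 73 and 78 percent figures are from the post, but the league averages below are invented so that the arithmetic comes out close; they are not real 2008 numbers.

```python
# Percentages as quoted from the JustVORP post.
AL_PCT = 0.73
NL_PCT = 0.78

# Hypothetical league-average offensive production (runs per game).
al_avg = 5.00
nl_avg = 4.68

al_replacement = AL_PCT * al_avg
nl_replacement = NL_PCT * nl_avg

print(round(al_replacement, 2), round(nl_replacement, 2))
```

Under these assumed averages, the two percentages land on nearly the same absolute replacement level, which is what you would expect if replacement players are about equally good in both leagues.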

NL pitchers don't have to face the DH, except for interleague games, so the NL pitchers ought to post higher strikeout rates, lower walk rates, lower WHIP, and lower ERAs -- on average. This would tend to depress the average number of runs scored and driven in by NL hitters.

The other factor which affects the level of ability in the two leagues is the disparity in team payrolls. We know that payrolls correlate somewhat weakly with actual production, but there is a correlation, and that correlation increases when taken over the entire group of players in a league, instead of just a single team. The average payroll in the AL was over $13 million higher than the average NL payroll in 2008, and that payroll disparity naturally equates to slightly stronger teams in the junior circuit. This means that the average 5th starter on an AL team will likely be a little better than the average 5th starter on an NL team, and likewise for the hitters at the bottom end of the batting order and on the bench.


  • 2 months later...
  • 10 months later...
  • 2 weeks later...
Is there a reason why WAR is more popular than win shares? Is one more reliable than the other?

No. The two numbers mean roughly the same thing, but the calculations used to arrive at them are quite different. Either way, take them with a grain of salt. There are a lot of assumptions made in the calculations. Is a guy with a 4.5 WAR more valuable than a guy with a 4.2 WAR? The answer is maybe.


Is there a reason why WAR is more popular than win shares? Is one more reliable than the other?

Yes, absolutely. Win Shares underrates modern starting pitchers, overrates 1800s starting pitchers, doesn't set a replacement level, and, as even Bill James admits, is only a halfway solution without the addition of Loss Shares.


  • 2 weeks later...
  • 3 years later...
Is there a way to find league average wOBA by position per year?

Thanks in advance.

On Fangraphs you can easily get there. Just go to team stats, then league stats at the top of the page, then batting, then pick the year and position, and it shows totals for that position/league/year. For 2012 catchers it's .312. 1983 catchers it's .309. For 1933 it's .323. For 1880 it's .238.


On Fangraphs you can easily get there. Just go to team stats, then league stats at the top of the page, then batting, then pick the year and position, and it shows totals for that position/league/year. For 2012 catchers it's .312. 1983 catchers it's .309. For 1933 it's .323. For 1880 it's .238.

Thanks as always Drungo.
