
Fangraphs: The Orioles and Accepting Random Variation


Can_of_corn

This comment at the bottom of the article sums it up pretty well for me. Basically, just because outliers exist, doesn't mean we shouldn't be asking the question of why certain outliers exist. Especially when a team has been doing it for three years in a row.

Also, a point that I think is kind of lost in this is that the O's are only outperforming their Pythag by three games this year, meaning they'd still be in first place even if they were performing exactly as expected.
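For anyone who wants to check a "Pythag" gap like that themselves, here is a minimal sketch of the standard Pythagorean expectation. The exponent of 2 is the classic form (some sites use a value near 1.83), and the function names and the run totals in the example are made up for illustration, not actual Orioles numbers.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythag_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins over a given number of games."""
    return games * pythag_win_pct(runs_scored, runs_allowed, exponent)


def wins_above_pythag(actual_wins, runs_scored, runs_allowed, games=162, exponent=2.0):
    """Positive means the team has won more games than its run differential implies."""
    return actual_wins - pythag_wins(runs_scored, runs_allowed, games, exponent)


if __name__ == "__main__":
    # Hypothetical team: 700 runs scored, 650 allowed, 90 actual wins.
    print(round(pythag_wins(700, 650), 1))
    print(round(wins_above_pythag(90, 700, 650), 1))
```

A team that scores exactly as many runs as it allows comes out at .500, and any difference between actual wins and the number returned by `pythag_wins` is the "outperforming their Pythag" gap people are arguing about.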

This is exactly what I was saying earlier. In 2012, people like Cameron were pointing to Pythag records to show why the Orioles overachieved. (Our Pythag in 2012 was 82-80.) Obviously he can't do that now, since their Pythag only has them 3 games better than what they "should be". So that means he has to come up with some other reason as to why he was wrong about the Orioles. He chose "random variation".

You can't make this stuff up.

EDIT: Here's a link to his chat session with commenters. Lots of Orioles related questions.

http://www.fangraphs.com/blogs/dave-cameron-fangraphs-chat-81314/


These models that seek to apply historical run differentials (essentially average margins of victory) are better used as a forecasting tool prior to a season than during one.

I really don't believe all runs in a season are created equal. These attempts to explain why a team's actual average margin of victory is different from the historical average often draw improper conclusions IMO.


I don't think it's reasonable to assume that everything you don't understand gets thrown in a bucket called 'random variation.' Many things that follow a random distribution given a large enough sample have a causal factor for individual outliers. Cameron didn't attempt to show that managers don't have year-over-year correlation in their ability to do better than BaseRuns predicts. He just dismissed the argument because Mike Scioscia was on a 3-year negative streak with it.

I don't disagree. But people are prone to assigning a cause to everything, even that which just happens. We like to think we (or someone) is in control.


That's the thing. If that's the case, you can just say every team will win about 78 games and be right almost every single time (within the margin of error).

This, in a nutshell, is what is wrong with Cameron's argument for me.

I have seen after the season rankings of preseason predictions where just picking everyone to go 81-81 was somewhere around the middle of the pack.


One thing that does annoy me about all the advanced stats saying that we shouldn't be good is that it ends up being like that in the baseball stat game, Out of the Park. I still haven't been able to get the O's over a winning record for the first season any of the last 3 years since their advanced stat guys rate all our players low.


The guy proved that his model works really well. But he did a poor job of explaining much else. It seems from a quick look at the data that teams who were outliers before are more prone to being outliers again. This should not be the case in a truly normal distribution.


The guy proved that his model works really well. But he did a poor job of explaining much else. It seems from a quick look at the data that teams who were outliers before are more prone to being outliers again. This should not be the case in a truly normal distribution.

That's because when you have rolling 3-year spans you have overlap in time periods. A single season can thus influence 3 different years.


That's because when you have rolling 3-year spans you have overlap in time periods. A single season can thus influence 3 different years.

The whole point Cameron was making, though, is that it's all random and it doesn't matter how the team is built, which is obviously not really the case since two different teams have gone against the projections in multiple years. Something about those two teams is confusing the projections.


I don't disagree. But people are prone to assigning a cause to everything, even that which just happens. We like to think we (or someone) is in control.

I just think it conveys the wrong message to call it (or anything else) random variation when we haven't actually developed proof that it is in fact random at an individual level. It confuses the high-level perspective (in which random variation is normal and expected) with the individual perspective (in which individual data points are likely there due to causal factors, whether they are known or not). You're right that wildly assigning cause is incorrect, but in this case, by attributing it to randomness, he's wildly assigning cause as well.


The whole point Cameron was making, though, is that it's all random and it doesn't matter how the team is built, which is obviously not really the case since two different teams have gone against the projections in multiple years. Something about those two teams is confusing the projections.

Right, and this is a flaw in the usage of rolling 3-year spans. An outlier season in 2010 will show up in the 2010, 2011, and 2012 data, which increases the probability that the spans adjacent to an outlier will show up as outliers as well.
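To put a rough number on that overlap effect, here is a small simulation sketch. It assumes each season's "luck" (wins above expectation) is an independent normal draw with a made-up standard deviation of 4 games; that is an illustrative assumption, not Cameron's data or method, and the helper names are mine.

```python
import random


def rolling_sums(values, window=3):
    """Sum of each run of `window` consecutive seasons."""
    return [sum(values[i:i + window]) for i in range(len(values) - window + 1)]


def lag1_autocorr(xs):
    """Correlation between each value and the one that follows it."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


rng = random.Random(0)  # fixed seed so the run is repeatable
# Assumed: seasons are independent, so there is no real year-to-year skill here.
yearly_luck = [rng.gauss(0, 4) for _ in range(20000)]
three_year_spans = rolling_sums(yearly_luck, window=3)

print("single seasons:", round(lag1_autocorr(yearly_luck), 3))
print("rolling 3-year spans:", round(lag1_autocorr(three_year_spans), 3))
```

Even though the single seasons are uncorrelated by construction, neighboring 3-year spans share two of their three seasons, so their correlation should come out near 2/3. That means repeat "outlier" spans are expected from the overlap alone and do not by themselves show a persistent cause.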


One thing that does annoy me about all the advanced stats saying that we shouldn't be good is that it ends up being like that in the baseball stat game, Out of the Park. I still haven't been able to get the O's over a winning record for the first season any of the last 3 years since their advanced stat guys rate all our players low.

What annoys me is how seldom the people who use the stats make comments showing they know about stats, probability, etc. Only regression. Instead their comments come out like sneers at the Orioles. And yet I don't recall anybody writing/talking about how the Angels were unreasonably lucky and didn't deserve to win like they seemed to say about us in 2012.

But why wait for Fangraphs to do it? Some of our Stat experts could compare the 9 examples and see if the teams have anything in common. If so, send it to Fangraphs. Maybe the outliers on the other end have similar flaws.


I think a common flaw people make is applying a statistical model to predict what happens to just one team, or just one season, or just one coin flip.

This guy's model works pretty well to predict how wins will correlate to runs across all of baseball over a full season (or multiple seasons).

It says nothing at all about which team will be the outlier in a given time frame. It just says there should be an outlier or 2 every season.

Luckily for us, the Orioles have been outliers 2 out of 3 years.

But isn't that same sin being committed by both those saying the model is wrong because the O's don't fit it AND those who keep saying the O's are "lucky" or "bound to regress"? Maybe the model works on the whole, but I feel like the folly of using a one-team sample to evaluate it really cuts both ways.

Also, the O's may be an outlier, but that doesn't necessarily mean anything regarding their true talent. Geniuses are outliers on a distribution of IQ scores, but that doesn't make them flukey, lucky or bound to regress.


But isn't that same sin being committed by both those saying the model is wrong because the O's don't fit it AND those who keep saying the O's are "lucky" or "bound to regress"? Maybe the model works on the whole, but I feel like the folly of using a one-team sample to evaluate it really cuts both ways.

Also, the O's may be an outlier, but that doesn't necessarily mean anything regarding their true talent. Geniuses are outliers on a distribution of IQ scores, but that doesn't make them flukey, lucky or bound to regress.

There's the rub. The models account for outliers, but they don't identify/predict specific outliers and/or explain why those particular outliers exist. The O's might be lucky this year. Then again, the collection of guys who comprise the 2014 team (which is not identical to the rosters of teams spread across any three year sample) might have "figured something out" that buggers the numbers.

Whatever the truth of the matter, this team plays hard, and it's shown a heckuva lot more talent/resilience than I gave it credit for early in the season. Fun to watch.


Archived

This topic is now archived and is closed to further replies.

