Fangraphs: The Orioles and Accepting Random Variation

isestrex · August 14, 2014

He got into this stuff in his weekly podcast on Monday

http://www.fangraphs.com/blogs/fangraphs-audio-dave-cameron-analyzes-all-dog-days/

Skip to 17:38. He's talking about the Royals' lucky streak but that segues quickly into the Orioles' success.

incubus · August 14, 2014

When your statistical model has an expected variation of 10-20 wins, I'm not sure what the value of it is.

LOL winner.

LocoChris · August 14, 2014

This comment at the bottom of the article sums it up pretty well for me. Basically, just because outliers exist, doesn't mean we shouldn't be asking the question of why certain outliers exist. Especially when a team has been doing it for three years in a row.
Also, a point that I think is kind of lost in this is that the O's are only outperforming their Pythag by three games this year, meaning they'd still be in first place even if they were performing exactly as expected.

This is exactly what I was saying earlier. In 2012, people like Cameron were pointing to Pythag records to show why the Orioles overachieved. (Our Pythag in 2012 was 82-80.) Obviously he can't do that now, since their Pythag only has them 3 games better than what they "should be". So that means he has to come up with some other reason as to why he was wrong about the Orioles. He chose "random variation".
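For anyone unfamiliar with how a "Pythag" record is computed, here is a minimal sketch of the classic Pythagorean expectation. It assumes the simple exponent of 2 (sites often use slightly different exponents), and the run totals below are illustrative inputs chosen for the example, not figures quoted in this thread. The function name `pythag_wins` is made up for this sketch.

```python
def pythag_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from run totals using the classic Pythagorean formula.

    The exponent of 2 is an assumption; other variants use different values.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


# Illustrative run totals (assumed for this example, not from the thread).
expected = pythag_wins(712, 705)
print(f"Expected wins: {expected:.1f}")

# A team that scores exactly as many runs as it allows projects to .500.
print(f"Even run differential: {pythag_wins(700, 700):.1f}")
```

The gap between a team's actual wins and this number is what people mean when they say a team is "outperforming its Pythag."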

You can't make this stuff up.

EDIT: Here's a link to his chat session with commenters. Lots of Orioles related questions.

http://www.fangraphs.com/blogs/dave-cameron-fangraphs-chat-81314/

hoosiers · August 14, 2014

These models that seek to apply historical run differentials (essentially average margins of victory) are better used as a forecasting tool prior to a season than during one.

I really don't believe all runs in a season are created equal. These attempts to explain why a team's actual average margin of victory is different from the historical average often draw improper conclusions IMO.

DrungoHazewood · August 14, 2014

I don't think it's reasonable to assume that everything you don't understand gets thrown in a bucket called 'random variation.' Many things that follow a random distribution given a large enough sample have a causal factor for individual outliers. Cameron didn't attempt to show that managers don't have year-over-year correlation in ability to do better on baseruns - he just dismissed the argument because Mike Scioscia was on a 3-year negative streak with it.

I don't disagree. But people are prone to assigning a cause to everything, even that which just happens. We like to think we (or someone) is in control.

DrungoHazewood · August 14, 2014

That's the thing. If that's the case, you can just say every team will win about 78 games and be right almost every single time (within the margin of error).
This, in a nutshell, is what is wrong with Cameron's argument for me.

I have seen after-the-season rankings of preseason predictions where just picking everyone to go 81-81 was somewhere around the middle of the pack.

AlbionHero · August 14, 2014

One thing that does annoy me about all the advanced stats saying that we shouldn't be good is that it ends up being like that in the baseball stat game, Out of the Park. I still haven't been able to get the O's over a winning record for the first season any of the last 3 years since their advanced stat guys rate all our players low.

Orioles2012 · August 14, 2014

The guy proved that his model works really well. But he did a poor job of explaining much else. It seems from a quick look at the data that teams who were outliers before are more prone to being outliers again. This should not be the case in a truly normal distribution.

Hallas · August 14, 2014

The guy proved that his model works really well. But he did a poor job of explaining much else. It seems from a quick look at the data that teams who were outliers before are more prone to being outliers again. This should not be the case in a truly normal distribution.

That's because when you have rolling 3-year spans you have overlap in time periods. A single season can thus influence 3 different years.

AlbionHero · August 14, 2014

That's because when you have rolling 3-year spans you have overlap in time periods. A single season can thus influence 3 different years.

The whole point Cameron was making though is that it's all random and it doesn't matter how the team is built, which is obviously not really the case since two different teams have gone against the projections in multiple years. Something about those two teams is confusing the projections.

Hallas · August 14, 2014

I don't disagree. But people are prone to assigning a cause to everything, even that which just happens. We like to think we (or someone) is in control.

I just think it conveys the wrong message to call it (or anything else) random variation when we haven't actually developed proof that it is in fact random at an individual level. It confuses the high-level perspective (in which random variation is normal and expected) with the individual perspective (in which individual data points are likely there due to causal factors, whether they are known or not). You're right that wildly assigning cause is incorrect, but in this case, by attributing it to randomness, he's wildly assigning cause as well.

Hallas · August 14, 2014

The whole point Cameron was making though is that it's all random and it doesn't matter how the team is built, which is obviously not really the case since two different teams have gone against the projections in multiple years. Something about those two teams is confusing the projections.

Right, and this is a flaw in the usage of rolling 3 year seasons. An outlier season in 2010 will show up in 2010, 2011, and 2012 data, which will increase the probability that adjacent seasons to an outlier will show up as outliers as well.
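The overlap effect described here can be checked with a quick simulation. This is a rough sketch under stated assumptions: each season's "luck" is drawn independently from a normal distribution with a standard deviation of 4 wins (an arbitrary choice for illustration), and the helper names `rolling_sums`, `pearson`, and `lag1_correlation` are made up for the example. Even though the individual seasons are independent, adjacent 3-year windows share two of their three seasons, so they end up strongly correlated.

```python
import random


def rolling_sums(values, window=3):
    """Sum of each run of `window` consecutive values."""
    return [sum(values[i:i + window]) for i in range(len(values) - window + 1)]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def lag1_correlation(series):
    """Correlation between each value and the one that follows it."""
    return pearson(series[:-1], series[1:])


rng = random.Random(0)  # fixed seed so the run is repeatable
# Assumption: season-level luck is independent, normal, SD of 4 wins.
seasons = [rng.gauss(0, 4) for _ in range(20000)]

raw = lag1_correlation(seasons)
rolled = lag1_correlation(rolling_sums(seasons, window=3))
print(f"single seasons, year-to-year correlation: {raw:.3f}")
print(f"rolling 3-year spans, window-to-window correlation: {rolled:.3f}")
```

For independent seasons, adjacent 3-year windows share two of three terms, so the theoretical window-to-window correlation is 2/3, while the single-season correlation stays near zero. Repeat "outliers" in rolling data are therefore not, by themselves, evidence of anything persistent.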

Pheasants · August 14, 2014

One thing that does annoy me about all the advanced stats saying that we shouldn't be good is that it ends up being like that in the baseball stat game, Out of the Park. I still haven't been able to get the O's over a winning record for the first season any of the last 3 years since their advanced stat guys rate all our players low.

What annoys me is how seldom the people who use the stats make comments showing they know about stats, probability, etc. Only regression. Instead their comments come out like sneers at the Orioles. And yet I don't recall anybody writing/talking about how the Angels were unreasonably lucky and didn't deserve to win like they seemed to say about us in 2012.

But why wait for Fangraphs to do it? Some of our Stat experts could compare the 9 examples and see if the teams have anything in common. If so, send it to Fangraphs. Maybe the outliers on the other end have similar flaws.


BohKnowsBmore · August 14, 2014

I think a common flaw people make is applying a statistical model to predict what happens to just one team, or just one season, or just one coin flip.
This guy's model works pretty well to predict how wins will correlate to runs across all of baseball over a full season (or multiple seasons).

It says nothing at all about which team will be the outlier in a given time frame. It just says there should be an outlier or 2 every season.

Luckily for us, the Orioles have been outliers 2 out of 3 years.
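The "an outlier or 2 every season" claim can be sanity-checked with a short calculation. This sketch assumes each team's gap between actual and expected wins is independent and roughly normal, and it defines an "outlier" as anything more than 2 standard deviations from expectation; both are assumptions made for illustration, not something stated in the article. The function names are made up for the example.

```python
import math


def two_sided_tail(k):
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2))


def expected_outliers(teams, k):
    """Expected number of teams more than k standard deviations from expectation."""
    return teams * two_sided_tail(k)


def prob_at_least_one(teams, k):
    """Chance that at least one team lands beyond k SDs, assuming independence."""
    return 1 - (1 - two_sided_tail(k)) ** teams


teams = 30       # MLB teams
threshold = 2.0  # assumed outlier cutoff, in standard deviations

print(f"Expected outliers per season: {expected_outliers(teams, threshold):.2f}")
print(f"Chance of at least one: {prob_at_least_one(teams, threshold):.2f}")
```

Under these assumptions the model expects somewhere between one and two outlier teams in a typical season, but it says nothing about which teams they will be.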

But isn't that same sin being committed by both those saying the model is wrong because the O's don't fit it AND those who keep saying the O's are "lucky" or "bound to regress"? Maybe the model works on the whole, but I feel like the folly of using a one-team sample to evaluate it really cuts both ways.

Also, the O's may be an outlier but that doesn't necessarily mean anything regarding their true talent. Geniuses are outliers on a distribution of IQ scores, but that doesn't make them flukey, lucky or bound to regress.

MrOrange82 · August 14, 2014

But isn't that same sin being committed by both those saying the model is wrong because the O's don't fit it AND those who keep saying the O's are "lucky" or "bound to regress"? Maybe the model works on the whole, but I feel like the folly of using a one-team sample to evaluate it really cuts both ways.
Also, the O's may be an outlier but that doesn't necessarily mean anything regarding their true talent. Geniuses are outliers on a distribution of IQ scores, but that doesn't make them flukey, lucky or bound to regress.

There's the rub. The models account for outliers, but they don't identify/predict specific outliers and/or explain why those particular outliers exist. The O's might be lucky this year. Then again, the collection of guys who comprise the 2014 team (which is not identical to the rosters of teams spread across any three year sample) might have "figured something out" that buggers the numbers.

Whatever the truth of the matter, this team plays hard, and it's shown a heckuva lot more talent/resilience than I gave it credit for early in the season. Fun to watch.
