Fangraphs: The Orioles and Accepting Random Variation


Can_of_corn

I don't see how this is disagreeing with him. You could easily say "The 2013 team was worse in one-run games because their bullpen regressed to the mean and was worse, and their closer regressed to the mean and blew 9 saves."

And he wasn't giving the 2013 O's as "proof" so much as an example of how extreme outliers usually regress.

That's what I'm arguing against. I don't see the two bullpens as being at the same level, with the 2013 group just getting worse results because it lived up to the projections. I see the 2013 bullpen as being worse. For example, do you think that Jim Johnson was the exact same pitcher in 2012 and 2013 and just "regressed to the norm" in 2013? I don't. I see him as pitching worse.

And he certainly seemed to be using the 2013 Orioles as proof. That's exactly what he says. "The 2012 team did this and that and yada yada yada but there's no evidence that this is a repeatable skill...yada yada yada. And sure enough the 2013 Orioles did worse. They defied the odds one year but did exactly what we expected the following year...yada yada yada".

Sure it's valid. You just need to accept that in baseball random stuff makes up a fairly large percentage of the differences between teams. If every single team were an exact copy of every single other team, you'd still observe some teams winning 70 and other teams winning 90.

I think I know which of the clone wars teams Buck would be managing.


Sure it's valid. You just need to accept that in baseball random stuff makes up a fairly large percentage of the differences between teams. If every single team were an exact copy of every single other team, you'd still observe some teams winning 70 and other teams winning 90.

Everyone understands this, but what Wildcard is saying is that the models seem to be saying, "Team A is projected to win 73 games," and then when Team A wins 93 games, the model guy goes, "Well yeah, but we're still right because that stuff can happen...".


Everyone understands this, but what Wildcard is saying is that the models seem to be saying, "Team A is projected to win 73 games," and then when Team A wins 93 games, the model guy goes, "Well yeah, but we're still right because that stuff can happen...".

I don't see anything wrong with that statement. The projection is just a most likely scenario. It's not an ironclad promise. But people act like every time a statistical projection is off, the entire system behind it is somehow bunk. Just because something is likely to happen doesn't mean it's going to.

If Team A is projected to win 73 games, what does that mean? They have a 60% chance of hitting that number? And maybe a 10% chance of winning 93? Well, sooner or later, that 10% chance is going to happen.


I don't see anything wrong with that statement. The projection is just a most likely scenario. It's not an ironclad promise. But people act like every time a statistical projection is off, the entire system behind it is somehow bunk. Just because something is likely to happen doesn't mean it's going to.

If Team A is projected to win 73 games, what does that mean? They have a 60% chance of hitting that number? And maybe a 10% chance of winning 93? Well, sooner or later, that 10% chance is going to happen.

Or, the system may be flawed. Either one.


When your statistical model has an expected variation of 10-20 wins, I'm not sure what the value of it is.

Honestly, local sportswriters have been as good (or better) at predicting the standings than Fangraphs for several years now. And the prediction models like ZIPS or whatever are no better than just jotting down the previous year's stats as the prediction for the coming year.

In major league baseball, even a perfect model would be off by about +/-6 wins. That's as good as you can do, so 10 is pretty good. If your model has an average error of less than 10, you're probably just lucky.
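
For what it's worth, the +/-6 figure is roughly what you get if you treat a 162-game season as 162 coin flips. Here is a minimal Python sketch of that idea. It assumes every team is a true .500 club and every game is an independent 50/50, which are simplifications of mine and not anything Fangraphs publishes. The names `binomial_sd` and `simulate_clone_league` are made up for illustration.

```python
import math
import random

GAMES = 162

def binomial_sd(games: int, win_prob: float) -> float:
    """Standard deviation of win totals when each game is an independent trial."""
    return math.sqrt(games * win_prob * (1.0 - win_prob))

def simulate_clone_league(teams: int = 30, games: int = GAMES, seed: int = 0) -> list[int]:
    """Win totals for identical .500 teams, each game an independent coin flip.

    Simplification: each team is simulated on its own rather than against
    the other clones, so league-wide wins need not sum to a .500 record.
    """
    rng = random.Random(seed)
    return [sum(rng.random() < 0.5 for _ in range(games)) for _ in range(teams)]

sd = binomial_sd(GAMES, 0.5)
wins = simulate_clone_league()
print(f"luck-only standard deviation: {sd:.1f} wins")
print(f"clone league spread: {min(wins)} to {max(wins)} wins")
```

The standard deviation works out to about 6.4 wins, so even with perfect knowledge of every team's talent, a miss of 6 or so wins in either direction is ordinary under these assumptions.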


When your statistical model has an expected variation of 10-20 wins, I'm not sure what the value of it is.

Honestly, local sportswriters have been as good (or better) at predicting the standings than Fangraphs for several years now. And the prediction models like ZIPS or whatever are no better than just jotting down the previous year's stats as the prediction for the coming year.

That's what I feel is the problem with so many of the statistical models that are currently used in baseball.

In any statistical model that is used to make predictions, there is error. This error can actually be calculated, and the data is meaningless if the error is not included.

When you report data, you also must specify a confidence interval. For example, if you develop a statistical model that predicts the mass of a piece of equipment made by machinery, and that mass is normally distributed, you might report it as "there's a 95% chance that the average mass will be between 45.3 and 46.7 grams." This data would have a mean of 46.0 and an error of +/- 0.7.

Error increases when there is a large standard deviation (as there often is with baseball data) and when there is a small sample size (as there often is with baseball data), and when you assume data that is not normally distributed (as baseball data normally is not) is normally distributed. Reporting most statistical model data related to baseball with the associated error would almost be absurd, because the error would be astronomical compared to the scale of the actual data.
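
To make the confidence interval example concrete, here is a rough Python sketch. The sample masses are invented purely for illustration, and the function uses the normal approximation (z = 1.96) rather than a t-distribution, which would be the more careful choice for a sample this small. `mean_with_margin` is a hypothetical helper name.

```python
import math
import statistics

def mean_with_margin(sample, z=1.96):
    """Return (mean, margin) so that the interval is mean +/- margin.

    Assumption: normal approximation, with z = 1.96 for roughly 95% coverage.
    """
    mean = statistics.mean(sample)
    # Standard error shrinks with the square root of the sample size
    # and grows with the sample standard deviation.
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, z * standard_error

# Hypothetical masses in grams, invented for this example only.
masses = [45.1, 46.8, 44.9, 47.2, 46.0, 45.7, 46.4, 45.9]
mean, margin = mean_with_margin(masses)
print(f"{mean:.1f} +/- {margin:.1f} grams")
```

Feeding it fewer measurements, or measurements that are more spread out, widens the margin, which is the same effect described above for baseball data.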


That part in particular bothered me. He says that Scioscia was able to beat the projection for 5 years, but then dismisses the idea that he had anything to do with it because he "forgot" how to do it after that. Which ignores the many, many factors outside his control. Maybe Scioscia built a team with great defense, lots of home runs, and a great bullpen, and then his relievers got hurt, his power hitters left as free agents, and his best defenders retired. Dismissing the manager as a factor is pretty stupid without looking at how the composition of the team changed over time, and how that may have affected the team's performance.

Yeah, that was the major issue I had with his conclusion. On a related note, the O's would have been much closer to 93 wins last year if Johnson hadn't fallen apart on them. We could easily be looking at a three-year run in which the O's outplay these projections.

If this were a real business, Cameron would get fired for this type of explanation.


Everyone understands this, but what Wildcard is saying is that the models seem to be saying, "Team A is projected to win 73 games," and then when Team A wins 93 games, the model guy goes, "Well yeah, but we're still right because that stuff can happen...".

The best models will still be wrong some of the time. The best model isn't the one that predicts every single record correctly - that's impossible. That would just be extremely lucky. You have to accept that all models, even perfect ones, have the possibility of being off by 20 wins for any one team.


The best models will still be wrong some of the time. The best model isn't the one that predicts every single record correctly - that's impossible. That would just be extremely lucky. You have to accept that all models, even perfect ones, have the possibility of being off by 20 wins for any one team.

Where did they have the Red Sox? That would be two projections that were wrong by 20 wins. Right?


Or, the system may be flawed. Either one.

The point is, you can't look at one result and say, "Well, the Orioles did better than the projections, clearly the system is flawed." That was what Cameron was getting at. Some outliers are to be expected. I just think he was very rigorous in this case, because the results he got matched with his preconceived notions.


That's what I feel is the problem with so many of the statistical models that are currently used in baseball.

In any statistical model that is used to make predictions, there is error. This error can actually be calculated, and the data is meaningless if the error is not included.

When you report data, you also must specify a confidence interval. For example, if you develop a statistical model that predicts the mass of a piece of equipment made by machinery, and that mass is normally distributed, you might report it as "there's a 95% chance that the average mass will be between 45.3 and 46.7 grams." This data would have a mean of 46.0 and an error of +/- 0.7.

Error increases when there is a large standard deviation (as there often is with baseball data) and when there is a small sample size (as there often is with baseball data), and when you assume data that is not normally distributed (as baseball data normally is not) is normally distributed. Reporting most statistical model data related to baseball with the associated error would almost be absurd, because the error would be astronomical compared to the scale of the actual data.

I think those who are not well-versed in statistics don't assume an implied margin of error, so they take a 93-win prediction as exactly a 93-win prediction. In reality, that prediction is saying something like "this team has a 68% chance of being between 86 and 97 wins".
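
Here is a small Python sketch of how a single projected number can be turned into a range like that. It assumes projection misses are roughly normal and that you already have a standard deviation for them. The `error_sd` values below are my own guesses for illustration and do not come from any published projection system, and both function names are hypothetical.

```python
from statistics import NormalDist

def win_interval(projected_wins, error_sd, coverage=0.68):
    """Central interval expected to contain the final win total.

    Assumption: misses are normally distributed around the projection.
    """
    dist = NormalDist(mu=projected_wins, sigma=error_sd)
    tail = (1.0 - coverage) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)

def chance_of_at_least(projected_wins, error_sd, wins):
    """Probability of finishing with at least `wins` under the same assumption."""
    return 1.0 - NormalDist(mu=projected_wins, sigma=error_sd).cdf(wins)

# error_sd values here are guesses: 6 for luck alone, 10 for luck plus model error.
low, high = win_interval(93, error_sd=6)
print(f"68% range for a 93-win projection: {low:.0f} to {high:.0f} wins")
print(f"chance a 73-win projection reaches 93: {chance_of_at_least(73, 10, 93):.1%}")
```

The width of the range depends entirely on the assumed standard deviation, so the sketch only shows the mechanics, not what any real system would report.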


The point is, you can't look at one result and say, "Well, the Orioles did better than the projections, clearly the system is flawed." That was what Cameron was getting at. Some outliers are to be expected. I just think he was very rigorous in this case, because the results he got matched with his preconceived notions.

I never said that. I have seen them be wrong on us since Buck has been around.


The point is, you can't look at one result and say, "Well, the Orioles did better than the projections, clearly the system is flawed." That was what Cameron was getting at. Some outliers are to be expected. I just think he was very rigorous in this case, because the results he got matched with his preconceived notions.

I gave it a flip of the coin chance of going your way on this.


Archived

This topic is now archived and is closed to further replies.



