
Fangraphs: The Orioles and Accepting Random Variation


Can_of_corn


I don't see how this is disagreeing with him. You could easily say, "The 2013 team was worse in one-run games because their bullpen regressed to the mean and was worse, and their closer regressed to the mean and blew 9 saves."

And he wasn't giving the 2013 O's as "proof" so much as an example of how extreme outliers usually regress.

That's what I'm arguing against. I don't see the two bullpens as being at the same level, with 2013 simply producing worse results because they lived up to the projections. I see the 2013 bullpen as being worse. For example, do you think that Jim Johnson was the exact same pitcher in 2012 and 2013 and just "regressed to the norm" in 2013? I don't. I see him as pitching worse.

And he certainly seemed to be using the 2013 Orioles as proof. That's exactly what he says. "The 2012 team did this and that and yada yada yada but there's no evidence that this is a repeatable skill...yada yada yada. And sure enough the 2013 Orioles did worse. They defied the odds one year but did exactly what we expected the following year...yada yada yada".

Sure it's valid. You just need to accept that in baseball random stuff makes up a fairly large percentage of the differences between teams. If every single team were an exact copy of every other team, you'd still observe some teams winning 70 and other teams winning 90.
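
The clone-league claim is easy to check with a quick simulation. What follows is only a sketch, assuming a 30-team league, a 162-game schedule with random pairings each game day, and every game decided by a fair coin flip. The function names are made up for illustration, and a real schedule is not random like this.

```python
import random
import statistics


def simulate_season(n_teams=30, n_games=162, rng=None):
    """Return win totals for one season of identical teams (n_teams must be even)."""
    rng = rng or random.Random()
    wins = [0] * n_teams
    teams = list(range(n_teams))
    for _ in range(n_games):
        rng.shuffle(teams)  # assumption: random pairings on every game day
        for i in range(0, n_teams, 2):
            a, b = teams[i], teams[i + 1]
            winner = a if rng.random() < 0.5 else b  # pure coin flip
            wins[winner] += 1
    return wins


def summarize(n_seasons=200, seed=0):
    """Average best record, average worst record, and spread across many seasons."""
    rng = random.Random(seed)
    seasons = [simulate_season(rng=rng) for _ in range(n_seasons)]
    all_wins = [w for season in seasons for w in season]
    return {
        "avg_best": statistics.mean(max(s) for s in seasons),
        "avg_worst": statistics.mean(min(s) for s in seasons),
        "stdev": statistics.pstdev(all_wins),
    }


if __name__ == "__main__":
    print(summarize())
```

Even with no talent differences at all, the best and worst records in a typical simulated season end up far from 81-81, which is the point being made here.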

I think I know which of the clone wars teams Buck would be managing.


Sure it's valid. You just need to accept that in baseball random stuff makes up a fairly large percentage of the differences between teams. If every single team were an exact copy of every other team, you'd still observe some teams winning 70 and other teams winning 90.

Everyone understands this, but what Wildcard is saying is that the models seem to be saying, "Team A is projected to win 73 games," and then when Team A wins 93 games, the model guy goes, "Well yeah but we're still right because that stuff can happen...".


Everyone understands this, but what Wildcard is saying is that the models seem to be saying, "Team A is projected to win 73 games," and then when Team A wins 93 games, the model guy goes, "Well yeah but we're still right because that stuff can happen...".

I don't see anything wrong with that statement. The projection is just a most likely scenario. It's not an ironclad promise. But people act like every time a statistical projection is off, the entire system behind it is bunk. Just because something is likely to happen doesn't mean it's going to.

If Team A is projected to win 73 games, what does that mean? They have a 60% chance of hitting that number? And maybe a 10% chance of winning 93? Well, sooner or later, that 10% chance is going to happen.
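
The percentages above are rough guesses, but the question can be made concrete. Here is a minimal sketch, assuming the projection equals the team's true talent and every game is an independent coin flip weighted at 73/162. Both are simplifying assumptions and neither comes from the article. The helper name is hypothetical.

```python
from math import comb


def prob_at_least(wins, n_games, p_win):
    """P(X >= wins) for X ~ Binomial(n_games, p_win)."""
    return sum(
        comb(n_games, k) * p_win**k * (1 - p_win) ** (n_games - k)
        for k in range(wins, n_games + 1)
    )


if __name__ == "__main__":
    n_games = 162
    projected = 73
    p_win = projected / n_games  # assumption: projection equals true talent

    # Chance that luck alone carries a 73-win team to 93 or more wins.
    print(f"P(at least 93 wins) = {prob_at_least(93, n_games, p_win):.4f}")

    # Chance of landing within 5 wins of the projection (68 through 78).
    near = prob_at_least(68, n_games, p_win) - prob_at_least(79, n_games, p_win)
    print(f"P(68 to 78 wins)    = {near:.4f}")
```

This only covers game-to-game luck. A real projection can also be wrong about how good the team is, which makes large misses more likely than the pure coin-flip number suggests.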


I don't see anything wrong with that statement. The projection is just a most likely scenario. It's not an ironclad promise. But people act like every time a statistical projection is off, the entire system behind it is bunk. Just because something is likely to happen doesn't mean it's going to.

If Team A is projected to win 73 games, what does that mean? They have a 60% chance of hitting that number? And maybe a 10% chance of winning 93? Well, sooner or later, that 10% chance is going to happen.

Or, the system may be flawed. Either one.


When your statistical model has an expected variation of 10-20 wins, I'm not sure what the value of it is.

Honestly, local sportswriters have been as good as (or better than) Fangraphs at predicting the standings for several years now. And prediction models like ZiPS or whatever are no better than just jotting down the previous year's stats as the prediction for the coming year.

In major league baseball, even a perfect model would still be off by about +/-6 wins. That's as good as you can do. So 10 is pretty good. If your model has an average error of less than 10 you're probably just lucky.
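
The +/-6 figure lines up with what you get by treating a season as 162 independent coin flips, where the standard deviation of the win total is sqrt(n * p * (1 - p)). A short sketch under that assumption (the function name is just for illustration):

```python
from math import sqrt


def win_total_sd(n_games=162, p_win=0.5):
    """Standard deviation of wins if each game is an independent p_win coin flip."""
    return sqrt(n_games * p_win * (1 - p_win))


if __name__ == "__main__":
    for p in (0.40, 0.50, 0.60):
        print(f"true win rate {p:.2f}: sd = {win_total_sd(p_win=p):.2f} wins")
```

The spread barely changes across realistic win rates, so even a model that knew every team's exact talent level would be left with roughly this much noise.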


When your statistical model has an expected variation of 10-20 wins, I'm not sure what the value of it is.

Honestly, local sportswriters have been as good as (or better than) Fangraphs at predicting the standings for several years now. And prediction models like ZiPS or whatever are no better than just jotting down the previous year's stats as the prediction for the coming year.

That's what I feel is the problem with so many of the statistical models that are currently used in baseball.

In any statistical model that is used to make predictions, there is error. This error can actually be calculated, and the data is meaningless if the error is not included.

When you report data, you also must specify a confidence interval. For example, if you develop a statistical model that predicts the mass of a piece of equipment made by machinery that is normally distributed, you might report it as "there's a 95% chance that the average mass will be between 45.3 and 46.7 grams." This data would have a mean of 46.0 and an error of +/- 0.7.

Error increases when there is a large standard deviation (as there often is with baseball data) and when there is a small sample size (as there often is with baseball data), and when you assume data that is not normally distributed (as baseball data normally is not) is normally distributed. Reporting most statistical model data related to baseball with the associated error would almost be absurd, because the error would be astronomical compared to the scale of the actual data.
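
To make the interval idea concrete, here is a minimal sketch of a normal-approximation confidence interval for a mean, using only the Python standard library. The mass values are invented for illustration, and with a sample this small a t-based interval would be the more careful choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(samples, level=0.95):
    """Normal-approximation confidence interval for the mean of samples."""
    n = len(samples)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * stdev(samples) / sqrt(n)  # grows with spread, shrinks with n
    center = mean(samples)
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical masses in grams, made up for illustration.
    masses = [45.1, 46.8, 44.9, 47.2, 46.0, 45.5, 46.4, 46.1]
    low, high = confidence_interval(masses)
    print(f"95% CI for the mean: {low:.2f} to {high:.2f} g")

    # Similar spread, four times the data: the interval is about half as wide.
    low4, high4 = confidence_interval(masses * 4)
    print(f"with 4x the samples: {low4:.2f} to {high4:.2f} g")
```

The half-width is z * s / sqrt(n), which is why a large standard deviation or a small sample inflates the error.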


That part particularly bothered me. He says that Scioscia was able to beat the projection for 5 years, but then dismisses the idea that he had anything to do with it because he "forgot" how to do it after that. Which ignores the many, many factors outside his control. Maybe Scioscia built a team with great defense, lots of home runs, and a great bullpen, and then his relievers got hurt, his power hitters left as free agents, and his best defenders retired. Dismissing the manager as a factor is pretty stupid without looking at how the composition of the team changed over time, and how that may have affected the team's performance.

Yeah, that was the major issue I had with his conclusion. On a related note, the O's would have been much closer to 93 wins last year if Johnson hadn't fallen apart on them. We could easily be looking at a three-year run in which the O's outplay these projections.

If this were a real business, Cameron would get fired for this type of explanation.


Everyone understands this, but what Wildcard is saying is that the models seem to be saying, "Team A is projected to win 73 games," and then when Team A wins 93 games, the model guy goes, "Well yeah but we're still right because that stuff can happen...".

The best models will still be wrong some of the time. The best model isn't the one that predicts every single record correctly - that's impossible. That would just be extremely lucky. You have to accept that all models, even perfect ones, have the possibility of being off by 20 wins for any one team.


The best models will still be wrong some of the time. The best model isn't the one that predicts every single record correctly - that's impossible. That would just be extremely lucky. You have to accept that all models, even perfect ones, have the possibility of being off by 20 wins for any one team.

Where did they have the Red Sox? That would be two projections off by 20 wins. Right?

Or, the system may be flawed. Either one.

The point is, you can't look at one result and say, "Well, the Orioles did better than the projections, clearly the system is flawed." That was what Cameron was getting at. Some outliers are to be expected. I just think he was very rigorous in this case, because the results he got matched with his preconceived notions.


That's what I feel is the problem with so many of the statistical models that are currently used in baseball.

In any statistical model that is used to make predictions, there is error. This error can actually be calculated, and the data is meaningless if the error is not included.

When you report data, you also must specify a confidence interval. For example, if you develop a statistical model that predicts the mass of a piece of equipment made by machinery that is normally distributed, you might report it as "there's a 95% chance that the average mass will be between 45.3 and 46.7 grams." This data would have a mean of 46.0 and an error of +/- 0.7.

Error increases when there is a large standard deviation (as there often is with baseball data) and when there is a small sample size (as there often is with baseball data), and when you assume data that is not normally distributed (as baseball data normally is not) is normally distributed. Reporting most statistical model data related to baseball with the associated error would almost be absurd, because the error would be astronomical compared to the scale of the actual data.

I think those who are not well-versed in statistics don't assume an implied margin of error, so they take a 93-win prediction as a 93-win prediction. When in reality that prediction is saying something like "this team has a 68% chance of being between 86 and 97 wins".
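
One way to turn a point projection into that kind of range is sketched below. It assumes the only uncertainty is game-to-game luck and uses a normal approximation, so the numbers it prints will not exactly match the illustrative 86-to-97 range quoted above. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def win_interval(projected_wins, coverage=0.68, n_games=162):
    """Central interval for final wins, using a normal approximation to luck alone."""
    p_win = projected_wins / n_games
    sd = sqrt(n_games * p_win * (1 - p_win))  # game-to-game luck only
    dist = NormalDist(mu=projected_wins, sigma=sd)
    tail = (1 - coverage) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


if __name__ == "__main__":
    for coverage in (0.68, 0.95):
        low, high = win_interval(93, coverage)
        print(f"{coverage:.0%} interval for a 93-win projection: "
              f"{low:.1f} to {high:.1f} wins")
```

A real projection system is also uncertain about how good the team actually is, so its honest interval would be wider than this luck-only version.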


The point is, you can't look at one result and say, "Well, the Orioles did better than the projections, clearly the system is flawed." That was what Cameron was getting at. Some outliers are to be expected. I just think he was very rigorous in this case, because the results he got matched with his preconceived notions.

I never said that. I have seen them be wrong on us since Buck has been around.


The point is, you can't look at one result and say, "Well, the Orioles did better than the projections, clearly the system is flawed." That was what Cameron was getting at. Some outliers are to be expected. I just think he was very rigorous in this case, because the results he got matched with his preconceived notions.

I gave it a flip of the coin chance of going your way on this.


Archived

This topic is now archived and is closed to further replies.

