Fangraphs article on why their projections hate us


Hallas

34 minutes ago, dystopia said:

Fair post. If nothing else all FG and the Blue Jays debacle has proved is that just because you’re a nerd who can put together fancy equations and type numbers into a computer doesn’t mean you should be anywhere near the sport of baseball. 

Did you know that calling someone a nerd doesn't really make it all that much more likely that Ogre is going to come beat up Booger and the gang and make them take back their geeky math-talk?


33 minutes ago, dystopia said:

Fair post. If nothing else all FG and the Blue Jays debacle has proved is that just because you’re a nerd who can put together fancy equations and type numbers into a computer doesn’t mean you should be anywhere near the sport of baseball. 

I actually *kinda* agree with this. Sports analytics are quite flawed, though baseball's are best as far as I can tell. But there's always context that "data nerds" cannot have because they're not on the inside. Now I don't agree that you can conclude they shouldn't be anywhere near the sport, it's just that you need to know what their information tells you and appreciate what it cannot. 

Earlier in this thread  @DrungoHazewood made a crack about Ryan O'Hearn's transformation relative to the previous 5 years and implied that you'd be stupid to conclude that this is the real O'Hearn but the last 5 years aren't. I think that's a very debatable point because some people really do transform. Look at Matt Olson from the Braves for example. He's not just having a good year, he's totally changed his swing this year. O'Hearn has too. Rodriguez isn't the same guy (by stuff, pitch mix, or experience) as he was before he was sent down this year. This stuff is hard to account for across an entire league, and history. So yeah, generalized models have a hard time accounting for all of this stuff. That doesn't mean it's not legitimate stuff, and it doesn't mean the models are biased, it's just that all should be taken for what it's worth and not oversold as some be all end all data conclusion.

 


Just now, LookinUp said:

I actually *kinda* agree with this. Sports analytics are quite flawed, though baseball's are best as far as I can tell. But there's always context that "data nerds" cannot have because they're not on the inside. Now I don't agree that you can conclude they shouldn't be anywhere near the sport, it's just that you need to know what their information tells you and appreciate what it cannot. 

Earlier in this thread  @DrungoHazewood made a crack about Ryan O'Hearn's transformation relative to the previous 5 years and implied that you'd be stupid to conclude that this is the real O'Hearn but the last 5 years aren't. I think that's a very debatable point because some people really do transform. Look at Matt Olson from the Braves for example. He's not just having a good year, he's totally changed his swing this year. O'Hearn has too. Rodriguez isn't the same guy (by stuff, pitch mix, or experience) as he was before he was sent down this year. This stuff is hard to account for across an entire league, and history. So yeah, generalized models have a hard time accounting for all of this stuff. That doesn't mean it's not legitimate stuff, and it doesn't mean the models are biased, it's just that all should be taken for what it's worth and not oversold as some be all end all data conclusion.

 

Some do transform, but isn't it sensible to wait until you have a larger data set?

How good did Mateo look early in the season?  I'm pretty sure I read a piece about how he worked with the coaches and this time the change was real.

Guys can have a whole season worth of outlier data.

  • Upvote 2

6 minutes ago, LookinUp said:

Earlier in this thread  @DrungoHazewood made a crack about Ryan O'Hearn's transformation relative to the previous 5 years and implied that you'd be stupid to conclude that this is the real O'Hearn but the last 5 years aren't. I think that's a very debatable point because some people really do transform. 

I don't believe that he said what you said he said.  He said you'd be wrong not to consider what O'Hearn was for the last 5 years, just as you'd be wrong to ignore what he's done this year.  You might even be wrong not to consider what he's done for the last few weeks when he didn't hit all that well.  What matters is how much weight you give to any of these periods of time.

Edited by NCRaven
  • Upvote 2

3 minutes ago, LookinUp said:

I actually *kinda* agree with this. Sports analytics are quite flawed, though baseball's are best as far as I can tell. But there's always context that "data nerds" cannot have because they're not on the inside. Now I don't agree that you can conclude they shouldn't be anywhere near the sport, it's just that you need to know what their information tells you and appreciate what it cannot. 

Earlier in this thread  @DrungoHazewood made a crack about Ryan O'Hearn's transformation relative to the previous 5 years and implied that you'd be stupid to conclude that this is the real O'Hearn but the last 5 years aren't. I think that's a very debatable point because some people really do transform. Look at Matt Olson from the Braves for example. He's not just having a good year, he's totally changed his swing this year. O'Hearn has too. Rodriguez isn't the same guy (by stuff, pitch mix, or experience) as he was before he was sent down this year. This stuff is hard to account for across an entire league, and history. So yeah, generalized models have a hard time accounting for all of this stuff. That doesn't mean it's not legitimate stuff, and it doesn't mean the models are biased, it's just that all should be taken for what it's worth and not oversold as some be all end all data conclusion.

 

Using Drungo’s logic, Yennier Cano will put up a 10.00+ ERA next year. 
 

Also, I didn’t say all number crunchers/nerds need to go. I was just saying that there’s a lot of folks in baseball who are “experts” at statistics and math but have no clue when it comes to the actual game of baseball. 
 

Not all are like that, but there are definitely some, and whoever made that decision to yank Berrios on Wednesday is a prime example. 


1 minute ago, dystopia said:

Using Drungo’s logic, Yennier Cano will put up a 10.00+ ERA next year. 
 

Also, I didn’t say all number crunchers/nerds need to go. I was just saying that there’s a lot of folks in baseball who are “experts” at statistics and math but have no clue when it comes to the actual game of baseball. 
 

Not all are like that, but there are definitely some, and whoever made that decision to yank Berrios on Wednesday is a prime example. 

I don't get the Berrios move hate.  I mean I get it, the move didn't work so folks are pointing at it, but let's not act like he was dominating out there.  Didn't he allow at least one base runner each inning?

The Jays didn't score in the game.  Does anyone think Berrios was going to throw a CG shutout?

To me the big error in the game was Vlad's baserunning.


1 minute ago, Can_of_corn said:

I don't get the Berrios move hate.  I mean I get it, the move didn't work so folks are pointing at it, but let's not act like he was dominating out there.  Didn't he allow at least one base runner each inning?

The Jays didn't score in the game.  Does anyone think Berrios was going to throw a CG shutout?

To me the big error in the game was Vlad's baserunning.

No surprise there. 


3 minutes ago, dystopia said:

Why wouldn’t it? I certainly wouldn’t want to throw out past data. 

I mean, are you going to value older and smaller sample size data more strongly than you're going to value recent and larger sample-size data?


16 minutes ago, LookinUp said:

I actually *kinda* agree with this. Sports analytics are quite flawed, though baseball's are best as far as I can tell. But there's always context that "data nerds" cannot have because they're not on the inside. Now I don't agree that you can conclude they shouldn't be anywhere near the sport, it's just that you need to know what their information tells you and appreciate what it cannot. 

Earlier in this thread  @DrungoHazewood made a crack about Ryan O'Hearn's transformation relative to the previous 5 years and implied that you'd be stupid to conclude that this is the real O'Hearn but the last 5 years aren't. I think that's a very debatable point because some people really do transform. Look at Matt Olson from the Braves for example. He's not just having a good year, he's totally changed his swing this year. O'Hearn has too. Rodriguez isn't the same guy (by stuff, pitch mix, or experience) as he was before he was sent down this year. This stuff is hard to account for across an entire league, and history. So yeah, generalized models have a hard time accounting for all of this stuff. That doesn't mean it's not legitimate stuff, and it doesn't mean the models are biased, it's just that all should be taken for what it's worth and not oversold as some be all end all data conclusion.

 

When do you declare the change to be real? As Corn said, a lot of folks here were declaring Mateo's April to be really real. Like, what are we going to do with like three legit MLB shortstops? Clearly gotta trade somebody and soon!

In O'Hearn's case, and really everybody, I'd do something like a 4-3-2-1 weighting of his last four seasons. If I miss a little bit because of that on the rare step change in real performance I'm perfectly fine with that because 95% of the time I'm much more right than wrong.
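
To make the 4-3-2-1 idea concrete, here's a minimal sketch in Python. The function name and the sample numbers are made up for illustration, and it ignores playing time, aging, and regression to the mean, so treat it as the bare weighting step and not a real projection system.

```python
def weighted_projection(seasons, weights=(4, 3, 2, 1)):
    """Weighted average of a rate stat, most recent season first.

    `seasons` is a list such as [this_year, last_year, ...]. If a player
    has fewer seasons than there are weights, only the leading weights
    are used (zip stops at the shorter sequence).
    """
    pairs = list(zip(seasons, weights))
    if not pairs:
        raise ValueError("need at least one season")
    total_weight = sum(w for _, w in pairs)
    return sum(value * w for value, w in pairs) / total_weight


# Hypothetical OPS figures, newest first: one breakout year after three
# poor ones. The breakout gets 4/10 of the weight, not all of it.
example_ops = [0.800, 0.650, 0.620, 0.600]
projected = weighted_projection(example_ops)
```

The projection lands between the breakout and the earlier seasons, which is the whole point: the recent year pulls hardest, but the older years still count.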

The problem with more specialized models over general ones is that you often over-think things and start believing what's not real. Prospectus has (had?) a model that was based on similarity scores instead of just straight past performance/age/regression, and in the end it was worse than a simple model like Marcels because it tended to think, say, Adam Dunn and Dave Kingman were similar, and if Kingman had a bad year at 33 it would throw Dunn's age 33 projection into chaos. Or take the stock market - you get all kinds of people who claim their individually-tailored investment experts will beat the market and give you a leg up, but most of the time a simple diversified mutual fund will do as well, or better.

In baseball tailoring every projection based on more detailed knowledge may occasionally get you a big win, but more often will trick you into thinking that random variation coupled with a plausible narrative is the secret sauce.

  • Upvote 1

15 minutes ago, dystopia said:

Using Drungo’s logic, Yennier Cano will put up a 10.00+ ERA next year. 
 

Also, I didn’t say all number crunchers/nerds need to go. I was just saying that there’s a lot of folks in baseball who are “experts” at statistics and math but have no clue when it comes to the actual game of baseball. 
 

Not all are like that, but there are definitely some, and whoever made that decision to yank Berrios on Wednesday is a prime example. 

I don't think you have the slightest idea what you're talking about. Either that or you're wildly exaggerating to try to make the point that the nerds are stupid.

If I had a system, which I don't, it would weight Cano's 2023 performance pretty heavily since that's the best data we have. It would account for his 18 bad innings last year, but weighted even less than 18 innings because the data is a year stale. I would leaven the projection with a bit of his 3.16 minor league ERA. I would throw in a little bit of the fact that he's out-performed his FIP. And in the end I'm sure I'd come up with a healthy 2024 line of something like 60 innings of a 3.00 ERA.
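
A rough sketch of that kind of blend, in Python. Everything here is an assumption for illustration: the `Sample` class, the `blended_era` function, and the discount values are hypothetical, and only the 18 innings and the 3.16 minor league ERA come from the post above. The other ERAs and innings totals are placeholders, not his actual lines.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    era: float       # ERA over this chunk of data
    innings: float   # innings pitched in this chunk
    discount: float  # 1.0 = full weight; lower = staler or less relevant


def blended_era(samples):
    """Innings-weighted ERA, with each chunk's innings scaled by its discount."""
    weights = [s.innings * s.discount for s in samples]
    total = sum(weights)
    if total == 0:
        raise ValueError("samples carry no weight")
    return sum(s.era * w for s, w in zip(samples, weights)) / total


samples = [
    Sample(era=2.50, innings=70, discount=1.0),   # current season (placeholder numbers)
    Sample(era=11.00, innings=18, discount=0.5),  # 18 bad innings, a year stale (ERA is a placeholder)
    Sample(era=3.16, innings=100, discount=0.3),  # minor league ERA (innings and discount made up)
]
projection = blended_era(samples)
```

With the stale innings discounted, 18 ugly innings move the number a little but can't drag it anywhere near 10.00, because the larger and fresher sample dominates the weights.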

  • Haha 1
