
Fangraphs article on why their projections hate us


Hallas


9 minutes ago, dystopia said:

I didn’t misrepresent anything. You’re trolling. 

If you didn't purposefully misrepresent what he was saying you have a very poor understanding of what is being discussed.

I was trying to give you the benefit of the doubt.

But hey, if you'd rather own up to it being ignorance...


11 minutes ago, dystopia said:

Why wouldn’t it? I certainly wouldn’t want to throw out past data. 

I answered this, but you're just spouting nonsense. Why would you weight last year's 18 innings more highly than all of the other data we have put together? That's the only way you get a 10.00 ERA projection from a guy with a 2.11 this year and a 3.16 in the minors.


7 minutes ago, Hallas said:

I mean, are you going to value older and smaller sample size data more strongly than you're going to value recent and larger sample-size data?

I’m not. Others here will. If they want to be consistent in their position in the Ryan O’Hearn thread, that is. 


5 minutes ago, DrungoHazewood said:

I don't think you have the slightest idea what you're talking about. Either that or you're wildly exaggerating to try to make the point that the nerds are stupid.

If I had a system, which I don't, it would weight Cano's 2023 performance pretty heavily since that's the best data we have. It would account for his 18 bad innings last year, but weighted even less than 18 innings because the data is a year stale. I would leaven the projection with a bit of his 3.16 minor league ERA. I would throw in a little bit of the fact that he's out-performed his FIP. And in the end I'm sure I'd come up with a healthy 2024 line of something like 60 innings of a 3.00.

Cano would not have lowered his ERA to 3.00 in past years if given a full season. More data probably wouldn’t have helped Cano’s numbers much. I think we can safely say that.

The Orioles worked with him before OD this year and turned him into a good pitcher. 


5 minutes ago, dystopia said:

Why wouldn’t it? I certainly wouldn’t want to throw out past data. 

ALL data is important: old, new, recent, ancient, etc. The issue is how to weigh recent versus historical. I mean, think about it. What was the probability that Mateo, after having a blistering April where he looked like a new hitter and an MVP candidate, would revert to his 'historical' norm as a bad hitter? Obviously the chance was greater than zero, as that's what happened. Same with O'Hearn. I love how he's hit this year. But this year could be an exception, not the rule, and there is a fairly reasonable chance that next year he's back to his 'old' self. Sure, there is a reasonable chance that this change is permanent and that he's become a good hitter. But neither outcome is close to 100% guaranteed. Point being, all data is important, but each person has to weigh the current/recent against the older/past to help determine future probabilities. How a person weighs that brings up interesting discussions about players like Mateo, where some look at his full-year OPS while others only see the poor last 3-4 months, discounting the hot start as too old now to be relevant.

As to Fangraphs, I see two problems as to the Orioles. First, many of the guys we have relied upon this year don't have good historical numbers, either due to being mediocre in the past (Hicks, Frazier, etc.) or having a small track record in MLB (Gunnar, Adley, etc.). That calls into question their performance this year, and the models have a tendency to expect players to revert to their historical norms. Second, sometimes a team is greater than the sum of its parts. No, I'm not really talking about 'clutch', as I don't buy into the fairy dust, but other variables. For example, a team of replacement-level players should win about 50 games. The O's total WAR this year is about 40, which means by WAR we should have won about 90 games. Yet here we are with 101 wins. It's similar with our Pythagorean record, which gives us an expected 95-67. Point is, sometimes a team simply 'clicks', gets tons of breaks, and ends up outperforming all expectations, analytics, and stats. It's the little things over a 162-game season that fall in our favor. That sort of stuff is hard to quantify and to put into a projection system.
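For anyone who wants to check the arithmetic, here is a rough Python sketch of both estimates. The 50-win replacement baseline, the 40 WAR, and the 101 wins are the figures from this post; the run totals and the exponent of 2 in the Pythagorean formula are assumptions chosen for illustration, not numbers quoted in the thread.

```python
# Back-of-the-envelope check of the two win estimates discussed in the post.
REPLACEMENT_WINS = 50   # from the post: a replacement-level team wins about 50
GAMES = 162


def wins_from_war(team_war, replacement_wins=REPLACEMENT_WINS):
    """Expected wins = replacement-level baseline + team WAR."""
    return replacement_wins + team_war


def pythagorean_wins(runs_scored, runs_allowed, games=GAMES, exponent=2.0):
    """Classic Pythagorean expectation, RS^x / (RS^x + RA^x), scaled to a season."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


actual_wins = 101                       # from the post
war_wins = wins_from_war(40)            # 50 + 40
# Run totals below are illustrative inputs (an assumption), not from the thread.
pyth_wins = pythagorean_wins(807, 678)

print(f"WAR estimate: {war_wins} wins, actual minus estimate: {actual_wins - war_wins}")
print(f"Pythagorean estimate: {pyth_wins:.1f} wins, "
      f"actual minus estimate: {actual_wins - pyth_wins:.1f}")
```

Either way you slice it, the gap between the estimate and 101 wins is the part a projection system has a hard time explaining.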


11 minutes ago, DrungoHazewood said:

 

If I had a system, which I don't, it would weight Cano's 2023 performance pretty heavily since that's the best data we have. It would account for his 18 bad innings last year, but weighted even less than 18 innings because the data is a year stale. I would leaven the projection with a bit of his 3.16 minor league ERA. I would throw in a little bit of the fact that he's out-performed his FIP. And in the end I'm sure I'd come up with a healthy 2024 line of something like 60 innings of a 3.00.

It will be interesting to see what ZiPS, PECOTA, etc. say about Cano next year.  His ERA was 0.00 for his first 17 outings, then 3.00 for the rest of the year.  He became pretty vulnerable to lefties after a while and Hyde responded by not exposing him to lefties as much down the stretch.  He was used extremely heavily in the first two months and I think fatigue played some role in his declining numbers.  I think 3.00 is a good starting point for a guess for next year, but neither 2.00 nor 4.00 would surprise me much.  
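To make the quoted weighting idea concrete, here is a minimal Python sketch, not anything ZiPS or PECOTA actually does. The 2.11 ERA, the 18 innings, and the 3.16 minor league ERA come from the thread. Everything else is an assumption for illustration: the 70 innings for 2023, the 10.00 ERA for last year's 18 innings, the 0.5 staleness discount, and the 30 innings of weight given to the minor league line. The FIP adjustment is left out.

```python
def weighted_era(samples):
    """Innings-weighted ERA.

    samples: list of (era, innings, multiplier) tuples, where multiplier
    discounts stale data (1.0 = full weight).
    """
    total_weight = sum(innings * mult for _, innings, mult in samples)
    weighted_sum = sum(era * innings * mult for era, innings, mult in samples)
    return weighted_sum / total_weight


samples = [
    (2.11, 70, 1.0),    # 2023 MLB: ERA from the thread, innings assumed
    (10.00, 18, 0.5),   # 2022 MLB: 18 innings from the thread, ERA and discount assumed
    (3.16, 30, 1.0),    # minors: ERA from the thread, 30 innings of weight assumed
]

print(f"Projected ERA: {weighted_era(samples):.2f}")
```

Under these made-up weights the projection comes out a little above 3.00. The only way to push it anywhere near 10.00 is to give the 18 stale innings far more weight than everything else combined.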


I think another thing that might have us so low in the playoff percentage on Fangraphs is that through August we were fourth in the league in strikeouts. During the month of September, we were fourth from the bottom. The strikeout had disappeared, and that is one of the more troubling shifts that happened imo.

Edit: I'm talking about our bullpen.

Edited by Malike

21 minutes ago, dystopia said:

I’m not. Others here will. If they want to be consistent in their position in the Ryan O’Hearn thread, that is. 

With Ryan O'Hearn you'd probably be right to question his performance early in the season, considering he has almost 1100 plate appearances worth of past data, but obviously as the season goes on and he keeps hitting, you can change your mind.

 

In Texas Hold'em, if you have a runner draw on the flop, you don't count on hitting that, but if you hit the first half of your draw with the next card, it alters your decision making.
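A quick Python sketch of that poker math, assuming the runner draw is three cards to a flush on the flop (my choice of example, other runner draws have different counts). After the flop there are 47 unseen cards and 10 of them are in your suit.

```python
from fractions import Fraction

# Assumed scenario: three to a flush on the flop, so both the turn and the
# river have to come in your suit.
p_turn = Fraction(10, 47)             # first half of the draw hits on the turn
p_river_given_turn = Fraction(9, 46)  # second half hits, given the turn already did

p_from_flop = p_turn * p_river_given_turn

print(f"Chance to complete, seen from the flop: {float(p_from_flop):.3f}")
print(f"Chance to complete once the turn hits:  {float(p_river_given_turn):.3f}")
```

Same hand, same player, but the estimate moves a lot once half the evidence is in, which is roughly what a full season of O'Hearn hitting does to his projection.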

Edited by Hallas

33 minutes ago, DrungoHazewood said:

When do you declare the change to be real? ...

In O'Hearn's case, and really everybody, I'd do something like a 4-3-2-1 weighting of his last four seasons. ...

The problem with more specialized models over general ones is that you often over-think yourself and start believing what's not real.

I think the larger point is that we're dealing with predictive models. By definition, they're assigning probabilities to unknowns. Some will perform great one day/season and terrible the next, with no methodological change at all. I do think you're generalizing specialized models a bit as if they're generally worse. The entire point of a specialized model is controlling for more known variables. Now that's often done poorly, but it doesn't necessarily mean they'll be worse. In theory, they should be better if properly constructed. The problem is with people who think they're properly constructing any of these models, but actually don't understand the data like they think.

This year the models have poorly represented the O's. If you bet the money line every game, you'd be rich. Next year it could be the opposite, or there really could be an inherent flaw in their methods as it relates to this group of people. I don't know.
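For reference, the 4-3-2-1 weighting quoted above could be sketched in Python like this. It is only the general model in its simplest form, with no playing-time or age adjustment, and the OPS values are hypothetical, not O'Hearn's actual lines.

```python
def weighted_seasons(values, weights=(4, 3, 2, 1)):
    """Weighted average of season stats, most recent season first.

    If fewer than four seasons are supplied, only the matching weights are used.
    """
    pairs = list(zip(values, weights))
    return sum(v * w for v, w in pairs) / sum(w for _, w in pairs)


# Hypothetical OPS by season, most recent first (illustrative only).
ops_by_season = [0.800, 0.650, 0.600, 0.600]

print(f"Weighted OPS: {weighted_seasons(ops_by_season):.3f}")
```

The breakout year pulls the number up, but three weaker seasons still carry 60% of the weight, which is why a general model stays skeptical for a while.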

 


From the ALDS preview.

Quote

It should be fun to see these two excited fan bases and particularly the Orioles’ youngsters in a playoff setting for the first time. By contrast to our general Playoff Odds, which have undersold Baltimore all year — just ask any Orioles fan — and give them just a 46% chance of winning this series, our ZiPS game-by-game odds with the presumptive pitching matchups give them a 56% chance. I tend to think the series will tilt that way.

https://blogs.fangraphs.com/alds-preview-baltimore-orioles-vs-texas-rangers/


1 hour ago, dystopia said:

Cano would not have lowered his ERA to 3.00 in past years if given a full season. More data probably wouldn’t have helped Cano’s numbers much. I think we can safely say that.

The Orioles worked with him before OD this year and turned him into a good pitcher. 

So his 3.16 ERA in the minors... smoke and mirrors? CGI?


5 minutes ago, DrungoHazewood said:

So his 3.16 ERA in the minors... smoke and mirrors? CGI?

He asked Jobu nicely in the minors.  But he left his Jobu doll on the minor league bus when he was called up to the big club in Minnesota, and Jobu punished him severely.


1 hour ago, Hallas said:

With Ryan O'Hearn you'd probably be right to question his performance early in the season, considering he has almost 1100 plate appearances worth of past data, but obviously as the season goes on and he keeps hitting, you can change your mind.

 

In Texas Hold'em, if you have a runner draw on the flop, you don't count on hitting that, but if you hit the first half of your draw with the next card, it alters your decision making.

I refer back to this page a lot. It's a Fangraphs article that lists stabilization points for various statistics. As Tom Tango has reminded me, "stabilization point" means that the signal is as big as the noise, that talent has just passed random variation as the driver.

Strikeouts, walks and power reach meaningfulness pretty quickly. You can start putting some weight on K rate after a few weeks, walks after maybe a month, HRs maybe a couple months.

What is driving O'Hearn's improvements? His HR rate is pretty close to his career mark, so not that. His K rate is down a little, and that's probably real. His walk rate is actually the lowest of his career, and in 368 PAs we can probably trust that. Actually his LD, GB and FB% are very much in line with the rest of his career. 

So what is it? I go back to the shift/no shift splits. He's actually not having his best MLB season if you throw out the PAs where he was shifted on. When there was no shift against him he actually hit better in 2018 and 2021 than this year. So I think his much higher BABIP is a pretty direct result of the shift ban. That might be pretty sticky, since it looks like the shift isn't coming back any time soon. But BABIP also takes 800+ PAs to reach significance, so we'll have to wait and see what next year brings.
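Here is a small Python sketch of how the stabilization idea turns into a number. It uses the common shortcut of treating the stabilization point as the sample size where you trust the observed rate and the league average equally. The 368 PAs and the 800-PA BABIP figure are from this post; the observed and league BABIP values are hypothetical placeholders.

```python
def reliability(n, stabilization_n):
    """Share of the observed rate credited to talent; exactly 0.5 at the stabilization point."""
    return n / (n + stabilization_n)


def regress_to_mean(observed, n, stabilization_n, league_mean):
    """Blend the observed rate with the league mean according to reliability."""
    r = reliability(n, stabilization_n)
    return r * observed + (1 - r) * league_mean


# 368 PAs and the 800-PA stabilization point come from the post.
r = reliability(368, 800)
# Observed and league BABIP below are hypothetical inputs.
estimate = regress_to_mean(observed=0.340, n=368, stabilization_n=800, league_mean=0.300)

print(f"Weight on this year's BABIP: {r:.2f}")
print(f"Regressed BABIP estimate: {estimate:.3f}")
```

With less than half the stabilization sample, most of the estimate is still the league average, so the shift-ban theory needs another season of data before the BABIP itself can back it up.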


2 hours ago, Can_of_corn said:

I don't get the Berrios move hate. I mean, I get it, the move didn't work so folks are pointing at it, but let's not act like he was dominating out there. Didn't he allow at least one base runner each inning?

The Jays didn't score in the game.  Does anyone think Berrios was going to throw a CG shutout?

To me the big error in the game was Vlad's baserunning.

Exactly.  They scored zero runs.  It doesn’t matter who you pitch, you can’t win scoring zero runs.  Vladdy being picked off was symbolic of their entire season. 
