
Silent James speaks his Mind


weams

Recommended Posts

All I know is, we're 2/3 of the way through the season and we're sustaining the only thing that really matters....WINS!

All this Pythagorean expectation talk is fine and dandy, but if on October 1 we are in the playoffs and still have a -50 run differential, then I think everybody will agree it's a meaningless stat!

Though I would look for serious upgrades in the offseason! :D
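For anyone following along, the "Pythagorean expectation" everyone keeps citing is just runs scored and runs allowed plugged into a simple ratio. A quick Python sketch of the classic Bill James version (the run totals below are invented for illustration, not the O's actual numbers):

```python
def pythag_win_pct(runs_scored: float, runs_allowed: float, exponent: float = 2.0) -> float:
    """Classic Bill James Pythagorean expectation: RS^x / (RS^x + RA^x).
    x = 2 is the textbook version; common variants use ~1.83 or a dynamic exponent.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# Hypothetical team outscored by 50 runs (totals made up for illustration):
pct = pythag_win_pct(450, 500)      # about .448
expected_wins = pct * 162           # about 72.5 wins over a full season
```

So a -50 differential maps to a sub-.500 expected record, even while the actual record can sit comfortably above it.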

And my above post notwithstanding, unfortunately, this isn't good logic. You can perform this exercise with any team. "Hey, if you take out our ten worst games, look how much better we seem!"

You need to go with the biggest sample size you can find. And it is cherry-picking to cut out our worst stretch. It would be just as bad a cherry-pick if I argued that we're really a .350 team if you take out our best 23 games in a row. And the fact that you can do either really demonstrates how facile the argument is.

Dude, this isn't taking out our ten worst games over the course of the year. This is a trend within the season. Look at the chart: that hole, this set of 23 games in a row (roughly 1/5 of our season), has a distinct beginning and ending.

And the numbers during that time are so astronomically bad! The thing is, it gets hidden because the stretch runs from mid-June to mid-July, and those bad numbers end up tainting both months' worth of numbers.

You may say it's cherry-picking, but a negative-72-run crater in an otherwise above-average season? To paraphrase Winston Zeddemore: that's a big cherry.

Sent from my DROID BIONIC using Tapatalk 2


Also: yes, you can take a team's best run out. My argument is that the 23-game run that sank the differential was unsustainable.

On the other end of the spectrum, remember when the Yankees were winning like EVERY game and built up a 10-game lead on the Orioles? No one questioned the sustainability of that, yet here they are, coming back to the pack a bit now.

For one fifth of the schedule this team really, really stunk. It has tainted the numbers for the year, but it was unsustainable.

Sent from my DROID BIONIC using Tapatalk 2


Also: yes, you can take a team's best run out. My argument is that the 23-game run that sank the differential was unsustainable.

On the other end of the spectrum, remember when the Yankees were winning like EVERY game and built up a 10-game lead on the Orioles? No one questioned the sustainability of that, yet here they are, coming back to the pack a bit now.

For one fifth of the schedule this team really, really stunk. It has tainted the numbers for the year, but it was unsustainable.

Sent from my DROID BIONIC using Tapatalk 2

Of course that 23 game run was unsustainable. Isn't that the definition of an extreme sample, or an outlier in a batch of data?

Of course when the Yankees were winning every damn game that wasn't sustainable, either. Again: extreme data points in a larger sample.

But there's an English Channel-sized gap between pointing at that 23-game stretch and actually justifying our pythag, or saying that it has "tainted the numbers for the year". If you want to prove why our pythag is somehow distinct from the last 100 years of data, or shouldn't be considered that worthwhile in our case, you're going to have to do... I dunno, like, 100x better than that.


Could the lack of a true long-man in our bullpen be part of the Run Differential mystery?

If we had no one to stop the bleeding when our starters are giving up a ton of runs (no pitcher stretched out to go 4 or 5 innings), wouldn't that keep our record relatively the same but drive our run differential lower?

Also, couldn't having that one extra specialized arm (instead of a long man) in our bullpen be helping us in all these close games (especially the extra-inning games)?


Of course that 23 game run was unsustainable. Isn't that the definition of an extreme sample, or an outlier in a batch of data?

Of course when the Yankees were winning every damn game that wasn't sustainable, either. Again: extreme data points in a larger sample.

But there's an English Channel-sized gap between pointing at that 23-game stretch and actually justifying our pythag, or saying that it has "tainted the numbers for the year". If you want to prove why our pythag is somehow distinct from the last 100 years of data, or shouldn't be considered that worthwhile in our case, you're going to have to do... I dunno, like, 100x better than that.

The best attempt I've seen is leveraged BP performance. That's really about it. It's certainly valid to some extent, and no shock to anybody. Perhaps he could provide some data on teams with great BPs exceeding their Pythag. That might be interesting to look at.


I really feel this is all ridiculous. At some point, long past, you are what your record says you are. The O's are a winning team.

This reminds me of how everyone reacted to Tebow last year. He can't QB, he can't do this or that, yet all he did was go out, win games, and take his team into the second round of the playoffs. Yes, their defense played well. But yes, our bullpen has been lights out.

Any time a team wins unconventionally, supposedly they can't continue it. Pundits, shut the He## up and deal with the fact that the O's are a pretty good team that finds ways to win games.

Thank you Buck


The argument over the statistics is really silly.

Basically, the purpose of every predictive mathematical model (statistical or otherwise) is to consider past occurrences and, using that data, determine what is most likely to happen in the future. The goal is to be as certain as possible without making assumptions or using influences that are not based on past observation. But since future events can never be entirely certain (unless you hold that the universe is deterministic), there is going to be some chance that any given prediction will turn out incorrect.

But again, in terms of comparing one mathematical model against another, for someone who wants to understand the future with greater certainty, a model which predicts an outcome with a higher level of confidence is superior to a model that predicts the outcome with a lower level of confidence. BUT that doesn't mean that a model which predicts "X" is broken if the opposite of "X" happens.

For example, if a weather forecaster predicts that there's an 80% chance of rain today, and it doesn't rain, is the weather forecaster out of a job for having a useless model? No -- it's just that they're saying that one of two things can happen, and one is more likely than the other (based on some mathematical justification), but it just turns out that, even though prediction favored it raining vs. not raining, the less likely alternative "won".

These occurrences can cause forecasters to try and adjust their predictive model to see if they can incorporate any new information that may have been gained from the recent occurrence. But, regardless of what they do, they're always going to be wrong at some point. The only time they can say anything with 100% certainty is after it's already happened (or is actively in the process of happening).

There is a way to test the strength of a predictive model, but it involves more observation of data. For example, if your weather forecaster says that there's an 80% chance of rain on 100 separate days, and only on two of those days does it actually rain, there might be something wrong with their model... a well-calibrated forecast would imply that out of those 100 separate days on which there was an 80% chance of rain, it should have rained about 80 times. Since 2 is much less than 80 (on a scale of 0 to 100), the predictive ability of that model is pretty awful.
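To make that calibration check concrete, here's a minimal Python sketch (the forecasts and outcomes are invented for illustration):

```python
def calibration_gap(forecast_prob: float, outcomes: list) -> float:
    """Absolute gap between the claimed probability and the observed frequency.
    A well-calibrated forecaster keeps this gap small over many forecasts."""
    observed = sum(outcomes) / len(outcomes)
    return abs(forecast_prob - observed)

# 100 days forecast at an 80% chance of rain, but it only rained twice:
bad = calibration_gap(0.80, [True] * 2 + [False] * 98)    # 0.78 -- badly calibrated
# Had it rained on 78 of those days, the gap would be tiny:
good = calibration_gap(0.80, [True] * 78 + [False] * 22)  # 0.02
```

The same idea scales to any probabilistic model: collect many forecasts at a given claimed probability and compare against how often the event actually happened.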

So what would make this "pythagorean" model wrong?

If the pythagorean model mis-predicted playoff contenders with "discomforting" frequency, it would be a pretty bad model. Another way to put it is: the more counterexamples we find where the actual outcome doesn't match what was predicted, the more we can pile on "Look; that model sucks!" -- as long as we're talking about the ratio of mis-predictions vs. successful predictions. So if there's ever a time where the model is wrong by some very significant margin like 25% or 50% (i.e. it only predicts half of the actual playoff contenders), that would be reason to call it into question.

The problem is, the pythagorean model, or any model that makes such a wide, sweeping prediction, will require many years and a LOT of real data to defeat soundly in the court of statisticians and mathematicians. You only get 30 data points per year, and every year has a lot of changing variables underneath the hood, so you have to stack up a bunch of years.

That said, the pythagorean model is not predicting anything with certainty, and I don't think anyone who holds any stock in it believes that it gives any sort of true certainty.

What I think we'll be able to do at the end of this season is look back on the year and say "the pythagorean model did not predict the outcome of the Orioles' year". That's fine. Taken by itself, it's just one data point, and doesn't sink the whole model. That's what makes this model ridiculously resilient (as well as any other model based on mathematics that takes many years to gather data to prove or disprove): it's going to be a long time before we have enough data to discredit it with hard numbers, since going on historical numbers is less meaningful due to the huge shifts in the balance of power in the game over time.

What this whole discussion makes me conclude is that you really don't want to place any sort of importance on statistical predictions that aren't at, say, the 99.999999% level. I feel pretty good if someone can guarantee me that there's a 99.9999999% chance that the food I'm eating won't give me food poisoning. I feel pretty good if someone can guarantee me that there's a 99.9999999% chance that my data stored in the cloud won't get deleted or corrupted during a calendar year.

What Pythagorean and the weather forecaster have in common is that they're nowhere near this level of rigor. They don't even sniff 90%, let alone 99%, to say nothing of 99.9999999%. They're perfectly "valid" statistical models in the vacuous sense that they make a prediction with a greater-than-0 probability of being right, but that doesn't mean I place any value in them, or that I trust their predictions like I trust Merck telling me that there's a 99.999999999% chance that the Sudafed capsule I'm ingesting isn't an overdose or the wrong chemical or something equally bad.

Things like weather forecasts and the Pythagorean model are basically there for entertainment value. They invite discussion and speculation, but really, they're no better than you making it up as you go along. Compared to probability calculations that do hold some value, like the chance of a plane crashing or the chance of a drug being defective, the probability of these "forecasts" being correct is so low that you'd better not put any monetary value or mental anguish into expecting that their predictions will come true.

One final point: a model that claims higher confidence suffers a much more jarring blow to its usefulness/reliability when a counterexample is found, compared to a model that gives a more moderate confidence.

Back to the weatherman vs. Merck example: if the weatherman says that there's an 80% chance of rain, and it doesn't rain, this is not very surprising to us, because there was that whopping 20% chance that it wouldn't rain. But if Merck says my pill is defect-free with a 1-in-100-billion chance of a defect, imagine my surprise if I take a Sudafed and keel over from an overdose. If that starts happening much more often than once in every 100 billion doses, that company is going to have some big problems and a lot of questions to answer.

Now, just how confident can we be about a baseball model that predicts whether or not a baseball team will end up in the playoffs?

Well, if that model is based on empirical data, then even if it models all past data over the past 100 years with 100% accuracy, the most certain it can be is that out of a sample size of, say, 30 * 100 teams (3000 teams), it was right every time. That means we've observed that we "rolled the dice", per se, 3000 times, and we were right every time. But 3000 samples still only puts us at roughly 99.99%, or "1 in every 10,000", as far as the chance of the model being wrong.

And since the pythagorean model hasn't always been right in the history of baseball, we can start subtracting certainty for each time that it was wrong. And then we can subtract some more for those seasons where we were really comparing "apples and oranges"; i.e., the data has something disharmonious about it (live ball era, dead ball era, short season, etc) which makes it too much of an outlier to consider it as valid data. In the end, I'd be surprised if it's even 99% accurate. And even then, a 1% chance of the playoffs is something we can all relate to: it's definitely something worth rooting for. ;)
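One back-of-the-envelope way to sanity-check that sample-size intuition is the statisticians' "rule of three": after n independent trials with zero observed failures, an approximate 95% upper confidence bound on the true failure rate is 3/n. A sketch (my addition, not anything from the post above):

```python
def rule_of_three_upper_bound(n_trials: int) -> float:
    """Approximate 95% upper confidence bound on a failure rate
    after n_trials consecutive successes with zero observed failures."""
    return 3.0 / n_trials

# Even 3000 clean team-seasons would only bound the model's error
# rate at about 0.1% (1 in 1000) at 95% confidence:
bound = rule_of_three_upper_bound(3000)   # 0.001
```

Either way, the broader point stands: a few thousand samples buy far less certainty than intuition suggests.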


Of course that 23 game run was unsustainable. Isn't that the definition of an extreme sample, or an outlier in a batch of data?

Of course when the Yankees were winning every damn game that wasn't sustainable, either. Again: extreme data points in a larger sample.

But there's an English Channel-sized gap between pointing at that 23-game stretch and actually justifying our pythag, or saying that it has "tainted the numbers for the year". If you want to prove why our pythag is somehow distinct from the last 100 years of data, or shouldn't be considered that worthwhile in our case, you're going to have to do... I dunno, like, 100x better than that.

Did I say anything about that? When did I say that pythag was wrong or that 100 years of data are wrong? All I'm saying is that this 23-game stretch, characterized by our terrible blowout record, is the reason for our crappy run differential, and that it is not the result of a season-long trend of underperformance.

Sent from my DROID BIONIC using Tapatalk 2


Could the lack of a true long-man in our bullpen be part of the Run Differential mystery?

If we had no one to stop the bleeding when our starters are giving up a ton of runs (no pitcher stretched out to go 4 or 5 innings), wouldn't that keep our record relatively the same but drive our run differential lower?

Also, couldn't having that one extra specialized arm (instead of a long man) in our bullpen be helping us in all these close games (especially the extra-inning games)?

I was wondering something sort of similar. Is the gulf between the pitchers who have performed well this season (Chen, Hammel, Gonzalez, Tillman, Strop, O'Day, Patton, Johnson, Ayala, Lindstrom) and the pitchers who have performed horribly (Arrieta, Matusz, Britton, Hunter) historically atypical? If so, it might explain the unprecedented win/loss vs. run differential numbers.


The idea that the run differential became an issue only in mid-June is fiction, btw. This post was from May 23 and prompted a long discussion in which both Drungo and I said that looking at pythag over such a small sample was "borderline useless". Funny how things switch. As of 5/23, the O's were already outperforming their pythag by 5 wins.

I was just looking at the predicted wins for the Orioles. Check out http://www.baseballprospectus.com/standings/ for example. It looks like the O's have won 4.4 games more than the Pythagorean method predicts and 5+ games more than some other methods predict.

Is this something to be concerned about? I vaguely remember discussions of the Pythagorean-based predictions on the board, and I seem to recall that the formula is pretty good at predicting wins. Maybe typical variation between estimated and actual wins at the end of the season is 3-4 wins? I know some of you are much more experienced with this tool, and I'm wondering what you think. Have the O's been "lucky"?

By the way, the formula predicts that the O's "should" have a .536 winning pct and, of course, that's still a fantastic performance for the O's.


Did I say anything about that? When did I say that pythag was wrong or that 100 years of data are wrong? All I'm saying is that this 23-game stretch, characterized by our terrible blowout record, is the reason for our crappy run differential, and that it is not the result of a season-long trend of underperformance.

Sent from my DROID BIONIC using Tapatalk 2

Okay, apologies. It's just that that doesn't have much explanatory value: it doesn't really do anything to say why our pythag isn't a decent measurement of the overall talent of this team, or to show how big a factor some pretty extreme luck has been in our season. As SrMeowMeow pointed out, of course there's a reason why we've outperformed our pythag by so much, or else it wouldn't have happened. That is, the point is pretty axiomatic.

But I'm genuinely sorry if I made a straw man (the phrase "made a straw man" might as well mean "made a faux pas" here on the OH :D) out of you; it wasn't my intention.

Alaniee: I think you have some interesting points in your post, but I also think there are some serious issues. For one, let's be clear here: no one is using our pythag (at least they shouldn't be) to project out to the end of the season and predict our end-of-season record. As I've said in other posts:

"I'd be very interested to see our pythag since we removed Matusz + Arrieta from the rotation and inserted Gonzalez and Tillman. That, for me, would tell me a bit more about what kind of team this is going into next year.

...let's be careful about making things too simplistic. If we finish, say, 83-79 with a pythag of 73-89, I'd say we're closer to being that 73-win team than that 83-win team, but that doesn't mean we're a 73-win team... it does mean we're not an 83-win team--an important awareness, so as not to repeat the mistakes of the cautionary tale that is the Seattle Mariners--but I also don't think you accurately characterize the talent on this team by calling it a 73-win team."

"Your point that our pythag is a bit harsh on us, though, is well-taken. I think--and your analysis seems to lend credence to this--that it is a bit harsh, but not that harsh. In truth, if you played this season over 5-10 times, we'd probably be closer to our pythag on average than to our actual record. (I stay away from saying the record we'd 'deserve' would be closer to our pythag on average, because I think the idea of 'deserving' or of 'merit' in general is an inadequate construct for discussing the relevance of pythag--I can't help getting a bit cogitational! I generally think the construct of what is deserved is one of the most useless and distracting psychological traps that humanity falls into, all too often and all too assuredly. As Felicia on The Wire says: 'Deserve ain't got nuthin to do wit' it.') What pythag tells us is not the record we deserve--no one's saying we don't deserve our current record (or if they are, they're probably using loaded terms they couldn't properly define if pressed)--but rather the record that would most likely happen under more normal conditions of chance, or, as I've kind of gone through up here, the average record we might have over a larger sample of, say, 162x5 games."

I recommend reading CmonOs's analysis here because, for me, it does the best job I've seen so far of making the pythag discussion as nuanced as it needs to be, without losing the force of pythag and why it is important/useful.


The argument over the statistics is really silly.

Basically, the purpose of every predictive mathematical model (statistical or otherwise) is to consider past occurrences and, using that data, determine what is most likely to happen in the future. The goal is to be as certain as possible without making assumptions or using influences that are not based on past observation. But since future events can never be entirely certain (unless you hold that the universe is deterministic), there is going to be some chance that any given prediction will turn out incorrect.

....

Didn't want to quote the entire thing, but this was a great post.

One thing we can predict with certainty when comparing Pythag against real-world results is that the model will routinely fail to accurately predict the results of one or two teams per year within the accepted levels of uncertainty. This just proves that Pythag is not a flawless model and that there are still things it fails to account for.

One of the errors we make on a routine basis is lumping together all results that fall outside the accepted levels of uncertainty as being the result of "luck". While luck may be one factor, it's almost certainly not the only factor in these results--there are other things going on that affect the results in ways we simply don't have effective tools to measure. Attributing these unmeasurable factors to luck is easy, so that's what people tend to do.

What we really should be trying to do is determine whether there is something tangible that this team does better than any other team. On an ESPN podcast, I heard one of the analysts say "the Orioles are better at winning close games than any other team in baseball". I thought that was an interesting way to describe why they've had success so far, and it seems to suggest that winning close games is a skill or ability and not just a product of being lucky. The question then becomes: is there a way to quantify the ability to win close games in some measurable way that is clearly different for teams that can't consistently win close games?

Personally, I have no clue whether this is possible, but I'm intrigued by the concept.
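As a first pass at quantifying it, you could at least split the one-run record straight out of the game log. A toy sketch (the scores below are invented, not actual O's games):

```python
def one_run_record(results):
    """Wins and losses in games decided by exactly one run.
    Each result is a (team_runs, opponent_runs) tuple."""
    wins = sum(1 for us, them in results if us - them == 1)
    losses = sum(1 for us, them in results if them - us == 1)
    return wins, losses

# Toy schedule: two one-run wins, one one-run loss, one blowout loss
record = one_run_record([(3, 2), (5, 4), (1, 2), (2, 10)])   # (2, 1)
```

Comparing that split against the rest of the league over several seasons would at least tell you whether a team's one-run record is stable enough to look like a skill rather than noise.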


So the big question is, why have we won so many one-run games?

The Factor That Dare Not Speak Its Name in a room full of sabermetricians.

If anyone mentions "luck", I will mix a concoction of eye of newt, toe of frog, wool of bat, and tongue of dog during a full moon ritual, and cast a voodoo hex upon you that will make you turn limp at all the strategic moments of your lives.


I'm a math major, I teach high school math, and I wholeheartedly believe in the power of (and problems with) statistics. That being said, the big question of "why do the Orioles win?" might be answered without using stats or math at all.

Could it be that they're so good at one-run games, so great in extra innings, and keep winning so many games because, for once, they actually believe in themselves? They finally believe they're winners. Players probably go to the Yankees expecting to win. I doubt many players who went to the O's in the last 15 years truly expected to win. Now that they are winning, the mentality is spreading like wildfire. Those bullpen arms get called on with a lead, and 25 players think the game's over. They reach the 10th inning and they've already won it in their minds. They're down 7-2 and they've felt what it's like to come back and win. They don't give up.

Perhaps I'm overselling the mental aspect of baseball a bit, but I truly believe that for the first time in many of these players' careers, they think they're on a winning team, and it's motivating the hell out of them to find a way to win.

