Book Review: The Book by Tango, Lichtman, Dolphin


KAZ97


Upon the recommendation of Drungo, I picked up a copy of The Book by Tango, Lichtman and Dolphin (Drungo, per your request, here are my thoughts). I figured we can use this thread to discuss the book for those who have read it or are thinking of getting a copy.

First, it's clear from the appendix that the authors collectively have the statistical chops to write this book. I applaud their effort to communicate some rather technical aspects of statistical analysis in non-technical language. We can agree or disagree about how well they accomplished that goal.

At the risk of sounding like Frank Costanza on Festivus night (Tango, Lichtman, Dolphin, I've got a lot of problems with you people...), one aspect of the book has particularly stuck in my craw. It's right up front in the Toolbox, and it permeates all the presented analyses. In their own words, with respect to expected variance in observed performances ...

There are two factors that influence performance. One is how good a player really is; the other is how lucky (or unlucky) he has been. (page 50)

Let's postpone the discussion of "luck" for a moment and think about where that leaves us. Without luck, the only factor that influences performance is the player's true talent? Clearly that is neither correct nor (and I'm guessing here) what the authors intended. Say we talk about Nick Markakis and we somehow know that his true talent is as a .360 OBP player. If we take Nick out of the major leagues and put him on my beer league softball team, you would certainly expect his performance to be better than .360. That might be an extreme example, but the point is that a lot of things influence performance, including but not limited to the talent of the opposition, the positioning of the fielders, the tendencies of the opposing manager, maybe the time of day, maybe how much he had to drink the night before. The variation in performance is driven by more than just Nick's true talent, even in the absence of any "luck".
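To see how much spread "luck" by itself produces, here's a quick toy simulation (my own sketch, not anything from the book): fix Nick's true talent at exactly .360, hold every context factor constant, and binomial sampling alone still scatters his observed seasonal OBP around that number.

```python
import random

random.seed(0)

TRUE_OBP = 0.360   # Nick's assumed "true talent" from the discussion above
PA = 600           # roughly a full season of plate appearances

# Simulate many seasons where the ONLY source of variation is
# binomial sampling ("luck" in the authors' framework).
seasons = []
for _ in range(2_000):
    times_on_base = sum(random.random() < TRUE_OBP for _ in range(PA))
    seasons.append(times_on_base / PA)

mean_obp = sum(seasons) / len(seasons)
spread = (sum((s - mean_obp) ** 2 for s in seasons) / len(seasons)) ** 0.5
print(f"mean observed OBP across seasons: {mean_obp:.3f}")  # close to .360
print(f"std dev across seasons: {spread:.3f}")              # roughly .020
```

So sampling noise alone is worth about 20 points of OBP per season; the contextual influences listed above would add variation on top of that, which is exactly the point.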

So now let's talk about "luck". I think the authors would agree with the above point about there being many different influences on performance, or at least they pay plenty of lip service to context. So what about the leftover uncertainty? Should we just chalk that up to "luck"? To paraphrase Einstein, "God does not play dice." There are two schools of thought here: (a) we can treat uncertainty as "luck", as the authors do, or (b) we can treat it as a measure of the amount of true causal influence that is unknown or unspecified. The point here is not whether modern-day particle physics has demonstrated an example of stochastic determination in nature (it has); the point is that determinism as a cognitive framework lays the foundation for valid statistical inference. And heck, if it was good enough for Einstein to use in working out the theory of relativity, it should be good enough for us to make some estimates about Nick Markakis' OBP.

OK, pretty esoteric nitpicking here, right? I don't think so. Combine a questionable framework for the true causal influences of observed performance with a couple of other invalid assumptions and we are well on the road to making some dangerous conclusions. Invalid assumptions like ...

Thus, estimating true talent or skill is essentially the same as making a projection. (page 50)

As a stand-alone statement, that is pretty dangerous. This seems to be another example of the authors not taking their own advice about "context". Note here that the authors are not talking about variance in performance, but rather claiming that if we could ever estimate true talent, that estimate would be equivalent to a prediction of future performance (or at least that's how I read their statement). That rests on a rather large, unstated assumption: that we know all of the causal determinants and that they will be the same in the future environment we are predicting (see the importance of the esoteric discussion of "luck" above?). We know that at least one determinant will change and invalidate that assumption: time. The second sample will by definition be later, so both the batter and pitcher will be a different age (perhaps affecting performance) and both will have acquired experience (again affecting performance). Maybe the opposing managerial tendencies have changed, or maybe the batter went on a bender the night before the second sample. To be kind, we can say that estimating true talent is the same as making a prediction only given some serious assumptions about the differences in the true, perhaps unknown, causal determinants.
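To make that unstated assumption concrete, here is a toy sketch (the peak age and aging adjustment are invented for illustration, not numbers from The Book) of why a projection is a talent estimate plus assumptions about how the determinants change:

```python
def project_next_season(talent_estimate: float, age: int,
                        aging_adjustment: float = -0.002) -> float:
    """Toy projection: a true-talent estimate plus a crude age adjustment.

    The aging numbers are purely illustrative. The point is that a
    projection equals a talent estimate only if every causal determinant
    (age, experience, opposition, ...) is assumed unchanged.
    """
    PEAK_AGE = 27  # hypothetical peak age, chosen for illustration
    years_past_peak = max(0, age + 1 - PEAK_AGE)
    return talent_estimate + aging_adjustment * years_past_peak

# A 32-year-old .360 true-talent hitter projects below .360 next year:
print(round(project_next_season(0.360, 32), 3))  # 0.348
# A 25-year-old projects at his talent estimate under this toy model:
print(round(project_next_season(0.360, 25), 3))  # 0.36
```

Even this trivial model shows the two quantities diverging as soon as one known determinant (age) is allowed to change.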

Now we do have a problem. With these two missteps up front, we start talking about the sample size necessary to make estimates "real". I still don't understand how a player can go 3-for-4 and it somehow isn't "real". This confusing (and misleading) language can lead to some "real" errors.
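To put numbers on why 3-for-4 is a perfectly "real" observation but a nearly useless talent estimate, here's a crude sketch using the normal approximation to the binomial (admittedly a poor approximation at n=4, which is itself the point):

```python
from math import sqrt

def wald_interval(successes: int, trials: int, z: float = 1.96):
    """Rough 95% interval for an observed rate (normal approximation)."""
    p = successes / trials
    half = z * sqrt(p * (1 - p) / trials)
    return max(0.0, p - half), min(1.0, p + half)

# The 3-for-4 game really happened; it just tells us almost nothing
# about true talent:
lo, hi = wald_interval(3, 4)
print(f"3-for-4: .750 observed, rough 95% interval ({lo:.3f}, {hi:.3f})")

# The same .300 rate over a full season pins things down much tighter:
lo, hi = wald_interval(180, 600)
print(f"180-for-600: .300 observed, rough 95% interval ({lo:.3f}, {hi:.3f})")
```

Nothing about the 3-for-4 is less "real"; the uncertainty about what it implies is just enormous, and that distinction is what the authors' language blurs.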


OK, it's there, the elephant in the room. Let's put it on the table and talk about it: Regression to the Mean.

By most accounts, the phenomenon the authors describe as Regression to the Mean is not the same phenomenon generally attributed to that term, for example as explained here, here or here. The fundamental lesson of regression to the mean is that when designing a study of some aspect of performance, you should not attempt to draw valid conclusions from a sample selected from the extremes. Yet how many times did the authors do this in the book? They were continually showing tables of the top ten batters by this metric or that metric and then demonstrating that upon sampling them again (next game after a hot streak, next season, etc.) they performed as normally expected. That. That is regression to the mean.
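The selection effect is easy to demonstrate with a toy simulation (the talent distribution and sample sizes are invented, nothing from the book): give 300 players fixed true talents, pick the top 10 by observed first-half average, and watch those same 10 players fall back in the second half even though their talents never changed.

```python
import random

random.seed(1)

# 300 players with fixed true talents spread around .260
talents = [random.gauss(0.260, 0.020) for _ in range(300)]

def observe(p: float, ab: int = 150) -> float:
    """Observed average over ab at-bats for a player of true talent p."""
    return sum(random.random() < p for _ in range(ab)) / ab

# First half: select the extremes (top 10 observed averages)
first_half = [(observe(p), p) for p in talents]
top10 = sorted(first_half, reverse=True)[:10]

# Second half: sample the SAME 10 players again
avg_first = sum(obs for obs, _ in top10) / 10
avg_second = sum(observe(p) for _, p in top10) / 10
print(f"top 10, first-half avg:   {avg_first:.3f}")
print(f"same 10, second-half avg: {avg_second:.3f}")  # typically lower
```

The top 10 were top 10 partly because sampling noise broke their way, so re-sampling them drags the group back toward its true (merely above-average) talent. That is the phenomenon the cited explanations describe.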

What they describe, adding a certain number of "average" performances in an attempt to make estimates more "real", is something else entirely. First, note that this (in my opinion, misleading) suggestion comes directly from the avoidance of a deterministic causal framework. Instead of saying that a guy went 12 for 20 over some stretch because favorable causal determinants were present, the authors chalk it up to "luck" and thus try to remove this "luck" and make the estimate "real". In reality, the authors are not heeding the advice of Regression to the Mean, but rather are performing a flavor of Bayesian inference. In brief, here's the logic behind Bayesian thinking. One starts with a stated prior belief. In our case, absent any information, the authors believe the performance to be "average". One then makes a series of observations, such as observing 100 PA, and uses that information to form a new belief, generally called a posterior. The authors call their posterior beliefs "real".
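For the record, the "add average performances" recipe is exactly the posterior mean of a beta-binomial model. A minimal sketch with an illustrative prior (the .330 prior mean and the 300-PA prior weight are my numbers, not the authors'):

```python
def shrunk_estimate(on_base: int, pa: int,
                    prior_mean: float = 0.330, prior_pa: float = 300) -> float:
    """Beta-binomial posterior mean: equivalent to padding the observed
    sample with prior_pa plate appearances of league-average performance.

    prior_mean and prior_pa are illustrative values, not from The Book.
    """
    return (on_base + prior_mean * prior_pa) / (pa + prior_pa)

# A hot 12-for-20 stretch barely moves the estimate off the prior:
print(round(shrunk_estimate(12, 20), 3))    # 0.347
# A full season at the same .600 rate moves it much further:
print(round(shrunk_estimate(360, 600), 3))  # 0.51
```

The prior_pa parameter is the knob: it is the "certain number of average performances" the authors add, which is why their procedure is Bayesian shrinkage by another name.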

The fact that the authors require such large sample sizes to make "real" estimates is merely a statement of the strength of their prior beliefs. In essence they are saying that we believe so strongly that a performance is average that we will have to see a lot of evidence to the contrary in order to be swayed.

Needless to say, this is not the only approach to statistical inference.


I don't see why this is a bad application of Bayesian inference. Considering that baseball outcomes in small samples fluctuate due to luck, it probably should take a large sample size to overturn the a priori expected value as measured over large sample sizes.


I wouldn't necessarily call it bad, and certainly a lot of very smart people rely on similar techniques for inference (I'm just not one of them, and you can use that as a prior for your own beliefs about me). But I think you stated my objection for me. If we consider fluctuation to be due to luck, then I might even agree with you. But within this framework it's very easy, for example, to make claims that there is no such thing as "real" splits, and from there it is not a far leap to the conclusion that no player consistently performs better in the second half than the first.

An alternative, and in my opinion more productive, approach is to attempt to identify the unknown determinants of performance rather than chalking them up to luck. And who knows, maybe seasonal platoon splits are one of the influences on performance for a subset of players.


You are right that we do not know all of the determinants of performance, but we do know that stochastic processes play a large role in baseball (and all other sample spaces). I don't think that sabermetricians are actively trying to ignore those unknown determinants; on the contrary, I'm sure there are people paid very handsomely by sports teams to search very hard for these types of influences.

But as in all applied statistics, it is important to develop a null hypothesis and determine what the expected range of deviation might be due to chance alone. The human brain is prone to jump to conclusions about causality, often irrationally and/or in contrast to probabilistic evidence.
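As a sketch of that null-hypothesis reasoning (illustrative numbers, not from the book): for a true .300 hitter, chance alone produces a surprisingly wide range of outcomes over a half season of at-bats.

```python
from math import sqrt

# Null hypothesis: the hitter is a true .300 player.
# How far can a 250-AB half-season wander by chance alone?
p, ab = 0.300, 250
sd = sqrt(p * (1 - p) / ab)        # binomial standard deviation of the rate
lo, hi = p - 1.96 * sd, p + 1.96 * sd
print(f"95% chance-alone range over {ab} AB: {lo:.3f} to {hi:.3f}")
```

So a .340 "breakout" or a .260 "slump" over half a season sits comfortably inside what chance alone delivers to a true .300 hitter, which is the expected range one has to beat before concluding a causal story.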

Rather than saying that unknown determinants do not exist, the assumption is that, much like luck, they even out over time. If they didn't, then one would expect a statistical artifact to appear somewhere in a sufficiently large sample. Rather than saying that statistical projections are perfect simulations of human behavior, the authors are, in my opinion, suggesting that the data can be analyzed sufficiently well to be of some use for future predictions.
