
Data driven decisions can be garbage


Baltimorecuse


It's really a no-brainer. In a game like baseball, no data set can possibly consider every possible variable. You see this in political polling all the time.

 

Example in baseball: you've got runners on 1st and 3rd with one out. The percentages in the book say hit away because, checking every instance of this situation the computer can find as far back as the records go, hitting away, ON THE AVERAGE, produces more runs. But that's an average of every time this situation comes up.

So let's add some different variables. The pitcher kills right-handed hitting. The hitter can't hit righties. It's the bottom of the eighth. A run doubles your score. Every time that specific situation has occurred, it has been averaged in with the thousands of other times the first-and-third, one-out situation has come up. That totally waters down the specific situation we're in, and the computer book decision is useless.

The problem with the data is that it's overly generalized. For you Ravens fans, the book says you go for it on fourth and one. How did that work out last year?


2 minutes ago, ArtVanDelay said:

Is there a purpose to this thread or did you just feel the need to crap on analytics?

And I’m not trying to be a jerk here.  This thread just seems totally out of left field.  The O’s just got a big win to avoid a sweep and this is what’s on your mind?


10 minutes ago, Baltimorecuse said:

It's really a no-brainer. In a game like baseball, no data set can possibly consider every possible variable. You see this in political polling all the time.

 

Example in baseball: you've got runners on 1st and 3rd with one out. The percentages in the book say hit away because, checking every instance of this situation the computer can find as far back as the records go, hitting away, ON THE AVERAGE, produces more runs. But that's an average of every time this situation comes up.

So let's add some different variables. The pitcher kills right-handed hitting. The hitter can't hit righties. It's the bottom of the eighth. A run doubles your score. Every time that specific situation has occurred, it has been averaged in with the thousands of other times the first-and-third, one-out situation has come up. That totally waters down the specific situation we're in, and the computer book decision is useless.

The problem with the data is that it's overly generalized. For you Ravens fans, the book says you go for it on fourth and one. How did that work out last year?

You seem to be condemning the notion of data-driven decisions... by suggesting that you could make better decisions if you had more data.

What am I missing?


1 minute ago, ArtVanDelay said:

Is there a purpose to this thread or did you just feel the need to crap on analytics?

Yes, when McKenna failed to get the bunt down a couple of games ago, someone jumped in and said the analytics mean you should hit away in that situation. McKenna ended up hitting into a DP, which allowed the other side to tie the game with one run.

I argued the analytics in specific situations are crap.  Never got around to explaining why.  


1 minute ago, owknows said:

You seem to be condemning the notion of data-driven decisions... by suggesting that you could make better decisions if you had more data.

What am I missing?

Actually it's more than that. Sometimes incomplete data is worse than no data at all because it leads to the wrong decision. Remember "New Coke"?


Just now, Baltimorecuse said:

Actually it's more than that. Sometimes incomplete data is worse than no data at all because it leads to the wrong decision. Remember "New Coke"?

So your beef isn't with data-driven decisions... It's with insufficient data.


1 minute ago, Baltimorecuse said:

Yes, when McKenna failed to get the bunt down a couple of games ago, someone jumped in and said the analytics mean you should hit away in that situation. McKenna ended up hitting into a DP, which allowed the other side to tie the game with one run. I argued the analytics in specific situations are crap. Never got around to explaining why.

OK, BUT the analytics and algorithms are only as good as the data, and with specific situations like the one you described, the dataset is minimal, well into small sample size (SSS) territory. We are only on the cusp of where data-driven decisions will go in the near future. Hyde's gut decisions, however, based on the generalities in the big datasets the Sig-bot chews on combined with his own experience, are a whole nother can of worms. Probably belongs in the Hyde bashing thread.
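
For anyone who wants to see the small-sample point in numbers, here is a rough Python sketch. Everything in it is made up for illustration: the run totals are not real baseball data, and `mean_and_stderr` is just a hypothetical helper name. It compares the average from a big pooled set of first-and-third, one-out situations with the average from a handful of closely matching situations, and shows how much less certain the small-set average is.

```python
import math

def mean_and_stderr(runs):
    """Return (mean, standard error of the mean) for a list of run totals."""
    n = len(runs)
    mean = sum(runs) / n
    if n < 2:
        # With a single observation there is no way to estimate spread.
        return mean, float("inf")
    variance = sum((r - mean) ** 2 for r in runs) / (n - 1)
    return mean, math.sqrt(variance / n)

# Hypothetical numbers only: runs scored after hitting away.
# "pooled" stands in for every first-and-third, one-out situation on record.
pooled = [0, 1, 2, 0, 1, 3, 0, 1, 1, 2] * 200   # 2000 made-up cases
# "specific" stands in for the few cases that match the exact matchup.
specific = [0, 0, 1, 0, 2, 0]                    # 6 made-up cases

pooled_mean, pooled_se = mean_and_stderr(pooled)
specific_mean, specific_se = mean_and_stderr(specific)

print(f"pooled:   n={len(pooled)}, mean={pooled_mean:.2f}, stderr={pooled_se:.3f}")
print(f"specific: n={len(specific)}, mean={specific_mean:.2f}, stderr={specific_se:.3f}")
```

With these invented numbers the specific subset has a different average from the pooled set, but its standard error is many times larger, which is both sides of the argument in this thread: the pooled number can hide the matchup, and the matchup-only number rests on too few cases to trust by itself.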
