Sunday, March 20, 2011

DMB Pitcher Success and the Correlation Coefficient

I am sure I did this wrong, so perhaps some of the math geeks in the league can point me in the right direction.

I wanted to find the correlation coefficient between the ERAs from the 100 sims for our pitcher set and various real life stats. The idea was to see which real life stat I should focus on to rank my players. The farther the number is from zero, the more confidence we have that the two sets of data are strongly related. In our case, the more confidence we would have that a great real life strikeout rate (for instance) is related to a great DMB ERA.

My application of this is extremely rudimentary, and by no means can we truly peg the single real life stat that the DMB engine uses (it likely uses many). However, the results are consistent with both common sense and personal experience, so I think they hit the mark.

-0.31: K's Per 9 Innings
-0.31: K/BB Ratio
-0.10: HR/9

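The correlation coefficient is a number between -1 and 1. The closer it is to 0, the less the two data sets are related. The closer it is to 1 or -1, the more they are related, with the sign showing whether the two move in the same direction or in opposite directions. Since a lower ERA is better, the negative sign on K/9 and K/BB is the expected direction: more strikeouts go with lower ERAs. Still, at -0.31 and -0.10, this says that K's, K/BB, and HR rates have only a weak relationship with DMB success (as determined by the 100 sim ERAs).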
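For anyone who wants to check the math, here is a minimal sketch of how a correlation coefficient can be computed by hand. It assumes the standard Pearson coefficient (the post does not say which variant was used), and the function name `pearson_r` and the sample numbers are made up for illustration; they are not the league data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)

# Made-up numbers for illustration only (not taken from the league data).
real_whip = [1.00, 1.10, 1.20, 1.30, 1.40]
sim_era = [3.10, 3.60, 3.40, 4.20, 4.00]
r = pearson_r(real_whip, sim_era)
print(round(r, 2))
```

A result near 1 or -1 means the two lists move together closely (in the same or opposite direction), and a result near 0 means they barely move together at all.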

This makes sense to me, as over the years I have used strikeout rates less and less in determining the order of my draft board. DMB seems not to care how a pitcher gets a batter out. Of course, a better defense would be critical for the success of non-strikeout pitchers.

Home run rate was a bit surprising, but my instinct tells me this negative relationship is most likely due to my laziness. The real life home run rates, in fact all the stats in this mini-study, are not era adjusted. So many of the dead ball era pitchers have minuscule home run rates, which must throw the correlation equation out of whack. We have dead ball era pitchers who give up no home runs with great ERAs and current pitchers who give up a lot of home runs with great ERAs too.

0.19: ERA

This was the entire goal of my post: to get the point across that owners should not look at the all time leaders in real life ERA and think this is a good indication of ATB success. It plainly isn't, with a correlation coefficient under 0.2.

0.51: WHIP

Whoa! The mother lode. Real life WHIP is far more strongly related to DMB success than anything else here. So much so that one could argue all other stats are meaningless in comparison.

Breaking WHIP down into its underlying components - hit rate and walk rate - we get the following:
  • H/9 innings has a 0.47 coefficient
  • BB/9 has a 0.04 coefficient.
I have no idea if this means anything (correlation coefficients do not add up in general), but it seems like a positive sign that the two numbers add up to the original 0.51 WHIP coefficient.
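To check whether component correlations should be expected to add up, here is a rough sketch with made-up rates for six hypothetical pitchers (not the league data). It assumes the Pearson coefficient and uses the fact that WHIP = (H + BB) / IP, which is the same as (H/9 + BB/9) / 9. The correlation of a sum works out to a weighted combination of the component correlations, where each weight is that component's spread relative to the spread of the total, so a plain sum only matches under particular conditions. The helper `corr` and all variable names are illustrative.

```python
from statistics import mean, pstdev

def corr(xs, ys):
    """Pearson correlation using population standard deviations."""
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (pstdev(xs) * pstdev(ys))

# Made-up rates for six hypothetical pitchers (not the league data).
h9 = [7.5, 8.0, 8.4, 9.1, 9.6, 8.8]
bb9 = [3.4, 2.1, 2.9, 2.5, 3.8, 1.9]
era = [3.2, 3.3, 3.9, 4.1, 4.6, 3.7]

# WHIP = (H + BB) / IP, which equals (H/9 + BB/9) / 9.
whip = [(h + b) / 9 for h, b in zip(h9, bb9)]

r_whip = corr(whip, era)
plain_sum = corr(h9, era) + corr(bb9, era)

# Weighted form: each component's correlation is scaled by its spread
# relative to the spread of the combined H/9 + BB/9.
combined = [h + b for h, b in zip(h9, bb9)]
weighted = (
    pstdev(h9) * corr(h9, era) + pstdev(bb9) * corr(bb9, era)
) / pstdev(combined)

print(round(r_whip, 3), round(weighted, 3), round(plain_sum, 3))
```

In this toy data the weighted form matches the WHIP correlation while the plain sum does not. With the real numbers, walk rates contribute so little (0.04) that WHIP ends up tracking hit rate closely either way.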

The moral of the story? Spend as much time as possible focusing on WHIP.
