Wednesday, September 12, 2012

Not Even the 100 Resims Tell the Whole Story


The 100 and 79 Resims changed drafting strategies forever.  This is nothing new; we learned it the hard way during our draft last spring.  A common complaint among owners, myself included, is that we often fell into what I called the "79 Resim Trap": those resims made many players look better (or worse) than they really were.

Perhaps the best example was Larry Doby.  Averaging 539 plate appearances across the 79 resims, Doby was the 6th best center fielder after posting a .423 on-base percentage and a .799 OPS.  During the season he was a bust, batting .258 with a .325 on-base percentage.  The 20 resims run after the season were much the same:  .247 / .325 / .345.

Of course, there are numerous plausible explanations for the difference: park factors for both Doby's home team and the league, general differences in quality of play between the 79 resims and last season (56 teams in the resims versus 24 in ATB XIV), and so on.  Each of these contributes to the difference in results for every player in the game.

However, part of me always worried about the utter consistency of the 79 resim process.  With injuries turned off and the same schedule used over and over, there are bound to be small but measurable distortions in the results.  Pitchers are especially affected, since they took the mound every 5 days against the same opponents, each season for 79 seasons.  With a schedule like this, many pitchers would never face every team in the league.

Earlier this week, I mentioned my goal of conducting another batch of resims.  We have many new Negro League players, a host of database corrections, and a new ATB era to deal with (the era may end up changing things quite a bit).

For this new round of resims I decided to run at least 3 batches of 100 resims.  Each batch will use a different team schedule, which I'll shuffle between batches.  Random injuries will also be turned on so the rotations and lineups are not 100% predictable from one day to the next; over the course of 300 seasons, the random variation from a player missing half a season with a broken leg should average out.
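To illustrate why shuffling the schedule matters, here is a minimal Python sketch. It is a toy model, not how the sim engine actually builds schedules: the 10-team league, the one-game-at-a-time cycle, the strict five-man rotation, and the names `opponents_faced` and `shuffled_schedule` are all assumptions made up for the example.

```python
import random

ROTATION_SIZE = 5  # assumption: strict five-man rotation, no injuries or off days


def opponents_faced(schedule, slot, rotation_size=ROTATION_SIZE):
    """Opponents seen by the starter who pitches every rotation_size-th game."""
    return {opp for game, opp in enumerate(schedule) if game % rotation_size == slot}


def shuffled_schedule(schedule, seed):
    """Return a reordered copy of the schedule; each batch gets its own seed."""
    rng = random.Random(seed)
    reordered = list(schedule)
    rng.shuffle(reordered)
    return reordered


# Hypothetical toy league: 10 opponents played in a repeating cycle, 150 games.
base_schedule = [f"T{i % 10}" for i in range(150)]

# Same schedule every season: the slot-0 starter keeps drawing the same teams.
fixed = opponents_faced(base_schedule, slot=0)
print("fixed schedule:", sorted(fixed))

# One reshuffled schedule per batch, as in batches A, B and C.
for batch, seed in zip("ABC", (1, 2, 3)):
    faced = opponents_faced(shuffled_schedule(base_schedule, seed), slot=0)
    print(f"batch {batch}: faces {len(faced)} of 10 opponents")
```

In the fixed toy schedule the slot-0 starter only ever meets T0 and T5, no matter how many seasons are replayed, while a reshuffled schedule spreads his starts across far more of the league.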

Some early results are in and they are surprising.

  • Andrew Bailey tossed over 5000 innings in each batch (A and B, I haven't run C) and compiled a 2.87 ERA/1.21 WHIP the first time, and a 2.50 ERA/1.14 WHIP the second.


  • Noodles Hahn pitched over 15,000 innings in each batch, and his ERA/WHIP went from 3.60/1.20 to 3.83/1.27.


  • Walker Cooper saw his OPS drop from .700 to .664 over the course of 15,000 at bats in each batch.


  • With at least 17,000 at bats in each batch, Gavy Cravath hit a respectable .261 / .313 / .444 in one batch, and an almost undraftable .250 / .303 / .420 in another

The schedule won't account for all of the variation.  100 seasons is a heck of a lot of data, and I would expect random variance to account for only a 1-2% difference.  Still, I am confident that our previous resim iterations didn't give a clear view of each player and, in fact, may have led some of us astray.
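As a rough check on how much of a gap pure chance might produce, here is a back-of-the-envelope sketch in Python using Gavy Cravath's batting averages from the two batches (.261 and .250). It assumes every at bat is an independent coin flip with the same hit probability and uses 17,000 at bats per batch, the floor quoted above. Neither assumption is strictly true of the sim, so treat the result as a ballpark figure only; `batting_average_z` is a name made up for this example.

```python
import math


def batting_average_z(avg_a, avg_b, at_bats):
    """z-score for the gap between two batting averages over equal at-bat counts.

    Assumes a simple binomial model: independent at bats, one shared hit rate.
    """
    pooled = (avg_a + avg_b) / 2                      # shared hit-rate estimate
    se_one = math.sqrt(pooled * (1 - pooled) / at_bats)  # std. error of one average
    se_diff = math.sqrt(2) * se_one                   # std. error of the difference
    return (avg_a - avg_b) / se_diff


z = batting_average_z(0.261, 0.250, 17_000)
print(f"z = {z:.2f}")
```

Under these assumptions the 11-point gap works out to a bit more than two standard errors, which is larger than luck alone would usually produce over that many at bats. That fits the idea that something systematic, such as the schedule, is contributing.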
