How OOTP 2007 Models Real Baseball

Elendil · 06-02-2007, 02:14 PM

What follows is a comparison of OOTP and MLB statistical output in certain fundamental areas of the game engine. I was going to do a blog on this during the beta process but never had the time. I've re-run the sim using patch 2.

I ran a 32-year sim on the default MLB-style quickstart, using regular MLB rules except that trading was turned off to speed up sim time. I exported the stats for the last 11 seasons to compare OOTP's output with real-life MLB over the 1995-2005 period.

The first area of comparison is correlations among different aspects of hitter performance. The goal of this comparison is to figure out whether OOTP correctly models real-life hitter types. For instance, if OOTP tends to create fast power hitters with high BABIP and slow contact hitters with low BABIP, that would be wrong. In real life, power hitters tend to strike out and walk a lot, and are usually slow. Fast guys tend to hit fewer home runs and have higher BABIP.

So the first thing I've done is to get Pearson correlations on a bunch of hitting stats for 1995-2005 MLB, each player-season being a single "observation," to use statistical language. I've only included player-seasons with at least 400 plate appearances, to eliminate "cup of coffee" players whose performances will vary widely because of the sample size problem.

Here's a guide to the abbreviations of the individual statistical variables:

avg=batting average
babip=batting average on balls in play
kab=strikeouts per at-bat
hrab=home runs per at-bat
bbpa=walks per plate appearance
doab=doubles per at-bat
trab=triples per at-bat
hppa=hit by pitches per plate appearance
sbattg=stolen base attempts per game played
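As a rough illustration of how these rate stats could be built from ordinary counting stats, here is a minimal Python sketch. The dictionary keys, the BABIP formula, and the use of SB + CS for stolen base attempts are assumptions for the example; the thread does not spell out the exact definitions used.

```python
def rate_stats(s):
    """Convert one player-season of counting stats into the rate stats above.

    `s` is a dict with hypothetical keys:
    g, pa, ab, h, d, t, hr, bb, k, hbp, sb, cs, sf.
    """
    # Assumed BABIP denominator: at-bats that ended with a ball in play.
    balls_in_play = s["ab"] - s["k"] - s["hr"] + s["sf"]
    return {
        "avg": s["h"] / s["ab"],
        "babip": (s["h"] - s["hr"]) / balls_in_play,
        "kab": s["k"] / s["ab"],
        "hrab": s["hr"] / s["ab"],
        "bbpa": s["bb"] / s["pa"],
        "doab": s["d"] / s["ab"],
        "trab": s["t"] / s["ab"],
        "hppa": s["hbp"] / s["pa"],
        # Assumed: attempts = steals + caught stealing.
        "sbattg": (s["sb"] + s["cs"]) / s["g"],
    }


def qualified(seasons, min_pa=400):
    """Keep only player-seasons that meet the plate appearance threshold."""
    return [rate_stats(s) for s in seasons if s["pa"] >= min_pa]


# Made-up player-season, for illustration only.
example = {"g": 150, "pa": 650, "ab": 570, "h": 171, "d": 35, "t": 3,
           "hr": 30, "bb": 70, "k": 120, "hbp": 5, "sb": 8, "cs": 4, "sf": 5}
print(rate_stats(example))
```

Each dictionary returned by `qualified` is one "observation" in the sense used above.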

Here's the table of MLB correlations. A variable's correlation with itself is of course 1. The number underneath each correlation is its significance level (p-value). Numbers under 0.05 indicate that the correlation is significantly different from zero at the 95% confidence level.

Code:

          |      avg    babip      kab     hrab     bbpa     doab     trab
----------+---------------------------------------------------------------
      avg |   1.0000 
          |
          |
    babip |   0.7792   1.0000 
          |   0.0000
          |
      kab |  -0.3503   0.0820   1.0000 
          |   0.0000   0.0001
          |
     hrab |   0.2330  -0.0094   0.4258   1.0000 
          |   0.0000   0.6511   0.0000
          |
     bbpa |   0.1252   0.0966   0.2980   0.4521   1.0000 
          |   0.0000   0.0000   0.0000   0.0000
          |
     doab |   0.4015   0.3271  -0.0304   0.2185   0.1182   1.0000 
          |   0.0000   0.0000   0.1443   0.0000   0.0000
          |
     trab |   0.0706   0.1696  -0.1271  -0.3049  -0.1337  -0.1517   1.0000 
          |   0.0007   0.0000   0.0000   0.0000   0.0000   0.0000
          |
     hppa |   0.0028   0.0232   0.0842   0.0776  -0.0105   0.0789  -0.0525 
          |   0.8910   0.2645   0.0000   0.0002   0.6139   0.0001   0.0116
          |
   sbattg |   0.0820   0.1633  -0.1497  -0.3107  -0.0310  -0.2057   0.4291 
          |   0.0001   0.0000   0.0000   0.0000   0.1364   0.0000   0.0000
          |

          |     hppa   sbattg
----------+------------------
     hppa |   1.0000 
          |
          |
   sbattg |  -0.0161   1.0000 
          |   0.4379
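To show where the significance levels under each correlation come from, here is a small Python sketch. It assumes the usual t-test for a Pearson correlation and uses a normal approximation, which is reasonable only for samples in the thousands. The sample size `n` below is hypothetical, since the number of player-seasons behind the table is not reported.

```python
from math import sqrt
from statistics import NormalDist


def correlation_p_value(r, n):
    """Approximate two-sided p-value for H0: the true correlation is zero.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and treats t as standard normal,
    which is a large-sample approximation to the t distribution.
    """
    t = r * sqrt((n - 2) / (1 - r * r))
    return 2 * (1 - NormalDist().cdf(abs(t)))


n = 2300  # hypothetical sample size, not taken from the thread
for r in (0.0820, 0.0028):
    print(r, correlation_p_value(r, n))
```

With a sample this large, even a correlation of 0.08 is clearly distinguishable from zero, which is why so many small correlations in the table carry tiny significance levels.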

So we find some interesting things here. Of course batting average is strongly positively associated with BABIP, and negatively associated with strikeouts per at-bat (failure to put balls in play). Nothing surprising there.

More interesting findings:

1) Stolen base attempts, a good proxy for speed, are modestly positively associated with BABIP (fast guys get more hits on balls in play, because they can beat out throws more easily), modestly negatively associated with strikeouts, fairly strongly negatively associated with home runs, modestly negatively associated with doubles, and strongly positively associated with triples.

2) Home run power is strongly positively associated with strikeouts and walks (thus, those two are correlated as well, more weakly). Think Ken Phelps. There's also some positive correlation between HR's and doubles.

3) BABIP is somewhat positively associated with doubles, and more weakly, triples. Doubles hitters have apparently learned how to hit it where they ain't.

4) Hit by pitches really aren't very correlated with anything else.

Conclusion: There are, roughly speaking, two types of hitter: slow power hitters who strike out and walk a lot, and fast contact hitters who have high BABIP, presumably mostly with singles, don't strike out or walk much, and hit triples. Of course, most players will not fit clearly into either type, because these are just general tendencies, not strict categories.

How does OOTP compare to real life? Here's the same table based on the years 2028-2038 from the OOTP sim described earlier, same plate appearance threshold:

Code:

          |      avg    babip      kab     hrab     bbpa     doab     trab
----------+---------------------------------------------------------------
      avg |   1.0000 
          |
          |
    babip |   0.8141   1.0000 
          |   0.0000
          |
      kab |  -0.3858   0.0592   1.0000 
          |   0.0000   0.0035
          |
     hrab |   0.1661  -0.2119   0.0615   1.0000 
          |   0.0000   0.0000   0.0024
          |
     bbpa |  -0.0294  -0.0488   0.0081   0.0607   1.0000 
          |   0.1466   0.0160   0.6901   0.0027
          |
     doab |   0.2617   0.2737  -0.1103  -0.1076   0.0049   1.0000 
          |   0.0000   0.0000   0.0000   0.0000   0.8077
          |
     trab |   0.0935   0.1453  -0.0553  -0.1743  -0.1080   0.1405   1.0000 
          |   0.0000   0.0000   0.0063   0.0000   0.0000   0.0000
          |
    hbppa |   0.0162   0.0201   0.0149   0.0089  -0.1190   0.0277   0.0263 
          |   0.4243   0.3224   0.4635   0.6602   0.0000   0.1712   0.1936
          |
   sbattg |   0.0560   0.1407  -0.0125  -0.2044  -0.0551  -0.0864   0.4296 
          |   0.0057   0.0000   0.5376   0.0000   0.0065   0.0000   0.0000
          |

          |    hbppa   sbattg
----------+------------------
    hbppa |   1.0000 
          |
          |
   sbattg |   0.0764   1.0000 
          |   0.0002

OOTP gets some things basically perfect. The relationships of batting average to BABIP and to strikeouts are uncannily close to MLB's. Speed also has the expected relationships with BABIP, homers, and triples. The big deficiency is with the correlation of HR's to walks and strikeouts. This is way too small in OOTP (about 0.06 for each, versus 0.45 and 0.43 in MLB). Also, doubles and triples are modestly positively correlated in OOTP because of the Gap Power concept, but in MLB doubles and triples are negatively correlated. Finally, doubles and homers are weakly negatively correlated in OOTP (-0.11), when they need to be positively correlated (0.22 in MLB).
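The comparison can be tabulated directly from the two correlation tables. The sketch below copies a handful of entries from those tables and sorts them by the size of the OOTP-minus-MLB gap; the choice of which pairs to include is mine, for illustration.

```python
# Correlations copied from the two tables above: (MLB 1995-2005, OOTP 2028-2038).
pairs = {
    ("hrab", "kab"): (0.4258, 0.0615),
    ("hrab", "bbpa"): (0.4521, 0.0607),
    ("kab", "bbpa"): (0.2980, 0.0081),
    ("hrab", "doab"): (0.2185, -0.1076),
    ("doab", "trab"): (-0.1517, 0.1405),
    ("sbattg", "trab"): (0.4291, 0.4296),
    ("avg", "babip"): (0.7792, 0.8141),
}


def gaps(pairs):
    """Return (stat_a, stat_b, mlb, ootp, ootp - mlb), largest gap first."""
    rows = [(a, b, mlb, ootp, ootp - mlb)
            for (a, b), (mlb, ootp) in pairs.items()]
    return sorted(rows, key=lambda row: abs(row[4]), reverse=True)


for a, b, mlb, ootp, diff in gaps(pairs):
    note = "sign flip" if mlb * ootp < 0 else ""
    print(f"{a:>7} vs {b:<5} MLB {mlb:+.4f}  OOTP {ootp:+.4f}  gap {diff:+.4f} {note}")
```

Sorting by absolute gap puts the home run relationships at the top and the speed and BABIP relationships at the bottom.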

How to fix these issues? Player creation seems to be the place. You could set up a player creation algorithm that makes high-power guys also have good eye and bad avoid k's (and lower-power guys have bad eye and good avoid k's). Markus did say that he would work on this for 2008; it's a complex issue.
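One way such a player creation step might look is sketched below in Python. This is purely hypothetical and is not how OOTP creates players: ratings are drawn as standard normal scores, and the target correlations are borrowed loosely from the MLB table (HR with walks, HR with strikeouts), even though correlations among ratings need not equal correlations among the resulting stats.

```python
import random
from math import sqrt


def create_hitters(n, rho_eye=0.45, rho_avoid_k=-0.43, seed=0):
    """Draw (power, eye, avoid_k) ratings as z-scores.

    Eye is positively tied to power and Avoid K's negatively tied to power,
    with the leftover variance filled by independent noise.
    """
    rng = random.Random(seed)
    hitters = []
    for _ in range(n):
        power = rng.gauss(0, 1)
        eye = rho_eye * power + sqrt(1 - rho_eye ** 2) * rng.gauss(0, 1)
        avoid_k = (rho_avoid_k * power
                   + sqrt(1 - rho_avoid_k ** 2) * rng.gauss(0, 1))
        hitters.append((power, eye, avoid_k))
    return hitters


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


power, eye, avoid_k = zip(*create_hitters(20000, seed=1))
print(pearson(power, eye), pearson(power, avoid_k))
```

The z-scores would still need to be rescaled to whatever rating scale the game uses, and most generated players would still land between the two hitter types.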

Next up: correlations among pitcher skills, OOTP vs. MLB.

Elendil · 06-02-2007, 02:30 PM

So now I do the same with pitchers that I just did with hitters. Here are the statistical categories:

babip=Batting Average on Balls in Play Allowed
hrip=home runs allowed per inning pitched
bbip=walks per inning pitched
hbpip=hit by pitches per inning pitched
wpip=wild pitches per inning pitched
kip=strikeouts per inning pitched
gbpct=ground ball percentage (for OOTP sim only)

I included only pitchers with at least 162 innings pitched in both the MLB and OOTP samples. Here are the MLB correlations:

Code:

          |    babip     hrip     bbip    hbpip     wpip      kip
----------+------------------------------------------------------
    babip |   1.0000 
          |
          |
     hrip |   0.0197   1.0000 
          |   0.5459
          |
     bbip |   0.0156   0.0404   1.0000 
          |   0.6327   0.2146
          |
    hbpip |   0.0100   0.0229   0.2002   1.0000 
          |   0.7576   0.4811   0.0000
          |
     wpip |   0.0603   0.0021   0.4357   0.1389   1.0000 
          |   0.0635   0.9480   0.0000   0.0000
          |
      kip |   0.0186  -0.2812   0.0719   0.0633   0.1631   1.0000 
          |   0.5685   0.0000   0.0269   0.0513   0.0000

Only a few aspects of pitcher performance are correlated with each other. Strikeouts are negatively correlated with HR's allowed. Hit by pitches and especially wild pitches are positively correlated with walks (and thus weakly correlated with each other as well). Strikeouts are modestly positively correlated with wild pitches ("effectively wild" pitchers?).

Here's how OOTP compares:

Code:

          |    babip     hrip     bbip    hbpip     wpip      kip    gbpct
----------+---------------------------------------------------------------
    babip |   1.0000 
          |
          |
     hrip |   0.0317   1.0000 
          |   0.3559
          |
     bbip |   0.1080  -0.0396   1.0000 
          |   0.0016   0.2491
          |
    hbpip |   0.0728   0.1053   0.1267   1.0000 
          |   0.0340   0.0021   0.0002
          |
     wpip |   0.1409   0.0695   0.2197   0.1270   1.0000 
          |   0.0000   0.0431   0.0000   0.0002
          |
      kip |   0.1581   0.0359   0.1018   0.0098  -0.0257   1.0000 
          |   0.0000   0.2956   0.0030   0.7747   0.4548
          |
    gbpct |   0.1278  -0.3808  -0.0336  -0.0565  -0.0187   0.1236   1.0000 
          |   0.0002   0.0000   0.3279   0.0999   0.5858   0.0003

For some reason, wilder pitchers (more K's, BB's, WP's, and HBP's) seem to allow a higher BABIP, although this may be a statistical artifact (despite the significance levels). Unfortunately, strikeouts have no relationship to home runs allowed in OOTP (0.04, versus -0.28 in MLB). It just makes sense that if a hitter can't make contact, he can't hit it over the fence, but OOTP doesn't model this yet. On the other hand, control is modeled pretty well, with walks correlating fairly strongly with wild pitches (0.22, though the MLB figure of 0.44 says it could be stronger) and less strongly with HBP's. Strikeouts don't correlate with WP's the way they're supposed to, but this is a small matter, because the real-life correlation is pretty small.

The main area to work on seems to be getting high-stuff pitchers to allow fewer HR's. During the beta process, Ronco, Markus, and others discussed changing the "sequence of play" in OOTP. I think Ronco could give a better account of that conversation than I could, because I didn't follow it closely, but I believe OOTP does not determine whether a ball has been successfully put into play before determining whether it is a home run, and Ronco argued that this should be done - it would probably resolve the issue. But there may be a workaround that does the same thing, as Markus argued. Something to consider for '08.
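To make the "sequence of play" idea concrete, here is a hypothetical Python sketch in which a plate appearance is first resolved as strikeout, walk, or contact, and a home run is only rolled for on contact. This is not OOTP's actual engine logic, which I can't describe; all probabilities are made-up inputs.

```python
import random


def expected_hr_per_pa(p_k, p_bb, p_hr_on_contact):
    """Home runs per PA when a homer requires contact first."""
    return (1 - p_k - p_bb) * p_hr_on_contact


def simulate_hr_per_pa(p_k, p_bb, p_hr_on_contact, n_pa, seed=0):
    """Simulate n_pa plate appearances under the contact-first ordering."""
    rng = random.Random(seed)
    homers = 0
    for _ in range(n_pa):
        if rng.random() < p_k + p_bb:
            continue  # strikeout or walk: ball never put in play
        if rng.random() < p_hr_on_contact:
            homers += 1
    return homers / n_pa


# Same home run rate on contact, different strikeout rates (made-up values).
high_k = simulate_hr_per_pa(0.25, 0.08, 0.04, 200_000)
low_k = simulate_hr_per_pa(0.12, 0.08, 0.04, 200_000)
print(high_k, low_k)
```

Under this ordering a high-strikeout pitcher allows fewer home runs per batter faced even with an identical rate on contact, which is the direction of the real-life correlation.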

AirmenSmith · 06-02-2007, 02:34 PM

I'm sure this is interesting but my attention span is too short to read it all.

Elendil · 06-02-2007, 02:45 PM

Next up, I look at hitter and then pitcher consistency over time. This is a complex area that is affected by both the game engine and player development. For example, if we find that in MLB, pitcher BABIP allowed varies a lot from year to year, in that previous year's BABIP has low predictive value for this year's BABIP, then we could conclude one of two things: 1) Controlling BABIP is a skill, but it varies wildly from year to year (lots of "talent hits" and "boosts"); 2) Most pitchers can't control BABIP at all; it's controlled by hitters and fielders (the DIPS model).
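The two explanations are hard to tell apart from year-to-year numbers alone, and a small worked formula shows why. The Python sketch below assumes a simple model that is my own simplification, not something from OOTP or the thread: observed BABIP is true skill plus independent binomial luck, every pitcher has the same number of balls in play, and true skill has the same spread each year.

```python
def expected_year_to_year_corr(skill_sd, balls_in_play,
                               persistence=1.0, mean_babip=0.300):
    """Expected correlation between consecutive seasons of observed BABIP.

    skill_sd      spread of true BABIP skill across pitchers
    balls_in_play balls in play per season (sets the size of the luck term)
    persistence   correlation of a pitcher's true skill from one year to the next
    """
    noise_var = mean_babip * (1 - mean_babip) / balls_in_play
    skill_var = skill_sd ** 2
    return persistence * skill_var / (skill_var + noise_var)


# Explanation 1: real skill that changes a lot from year to year.
print(expected_year_to_year_corr(0.020, 400, persistence=0.25))
# Explanation 2 (DIPS): almost no skill differences at all.
print(expected_year_to_year_corr(0.005, 400, persistence=1.0))
```

Either a low `persistence` or a small `skill_sd` drives the expected correlation toward zero, so a weak year-to-year relationship is consistent with both stories.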

OOTP models DIPS fairly accurately to account for the BABIP finding. In other areas, OOTP uses player development to get (close to) the right year-to-year consistency in hitters and pitchers. This has caused some recent consternation on the forum, but in a lot of areas, OOTP is accurately modeling MLB consistency levels - or even attributing too much consistency to players.

So what I do now is to take those hitters with at least 400 plate appearances from 1995-2005 MLB and regress their performance this year on their performance last year. The coefficient on last year's performance indicates how well that variable predicts this year's performance. We can do this for both MLB and OOTP, then do something called "Wald tests" to see whether the OOTP coefficients are statistically similar to the MLB coefficients. If we can reject, at the 99% confidence level, the hypothesis that the coefficients are the same, then we can be 99% confident that OOTP is not modeling MLB as it existed from 1995-2005. If we cannot reject that hypothesis, then it is possible that OOTP is accurately modeling MLB from 1995-2005.
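For anyone who wants to try this on their own export, here is a minimal Python sketch of the pairing and regression step. The data layout is hypothetical, and requiring 400 plate appearances in both seasons of a pair is an assumption; the regressions in this thread were run in a stats package, not with this code.

```python
from math import sqrt


def lagged_pairs(seasons, stat, min_pa=400):
    """Build (last_year, this_year) pairs for one stat.

    `seasons` maps (player, year) -> dict holding 'pa' and the stat.
    """
    pairs = []
    for (player, year), cur in seasons.items():
        prev = seasons.get((player, year - 1))
        if prev and prev["pa"] >= min_pa and cur["pa"] >= min_pa:
            pairs.append((prev[stat], cur[stat]))
    return pairs


def ols(pairs):
    """Simple regression of this year on last year.

    Returns (slope, intercept, standard error of the slope).
    """
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in pairs)
    se_slope = sqrt(rss / (n - 2) / sxx)
    return slope, intercept, se_slope


# Made-up seasons, for illustration only.
seasons = {
    ("a", 2001): {"pa": 500, "avg": 0.300},
    ("a", 2002): {"pa": 600, "avg": 0.280},
    ("a", 2003): {"pa": 200, "avg": 0.250},
    ("b", 2002): {"pa": 450, "avg": 0.260},
    ("b", 2003): {"pa": 480, "avg": 0.270},
}
print(lagged_pairs(seasons, "avg"))
```

The slope returned by `ols` plays the role of the coefficient on last year's value in the tables that follow.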

Here are the MLB figures (the "l" at the end of the name of an independent variable indicates that it is last year's value):

Batting average:

Code:

  Source |       SS       df       MS                  Number of obs =    1552
---------+------------------------------               F(  1,  1550) =  416.39
   Model |  .262794238     1  .262794238               Prob > F      =  0.0000
Residual |  .978241836  1550  .000631124               R-squared     =  0.2118
---------+------------------------------               Adj R-squared =  0.2112
   Total |  1.24103607  1551  .000800152               Root MSE      =  .02512

------------------------------------------------------------------------------
     avg |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    avgl |   .4654056   .0228077     20.406   0.000       .4206684    .5101427
   _cons |   .1487189   .0064867     22.927   0.000       .1359954    .1614425
------------------------------------------------------------------------------
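A quick way to read a table like this is to plug in a hitter. The Python sketch below uses the two coefficients from the MLB batting average regression above; the function names are just for illustration.

```python
INTERCEPT = 0.1487189  # _cons from the MLB batting average table
SLOPE = 0.4654056      # avgl coefficient from the same table


def predict_avg(last_year_avg):
    """Predicted batting average this year given last year's average."""
    return INTERCEPT + SLOPE * last_year_avg


def fixed_point():
    """Average at which the prediction equals last year's value."""
    return INTERCEPT / (1 - SLOPE)


for last in (0.250, 0.300, 0.330):
    print(last, round(predict_avg(last), 3))
print("pull is toward", round(fixed_point(), 3))
```

Because the slope is well under 1, a hitter above the fixed point is predicted to drop back toward it and a hitter below it is predicted to rise: regression to the mean.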

BABIP:

Code:

  Source |       SS       df       MS                  Number of obs =    1552
---------+------------------------------               F(  1,  1550) =  268.05
   Model |  .205897946     1  .205897946               Prob > F      =  0.0000
Residual |  1.19058718  1550  .000768121               R-squared     =  0.1474
---------+------------------------------               Adj R-squared =  0.1469
   Total |  1.39648512  1551  .000900377               Root MSE      =  .02771

------------------------------------------------------------------------------
   babip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  babipl |   .3812329   .0232852     16.372   0.000       .3355592    .4269066
   _cons |   .1889099   .0072598     26.021   0.000       .1746699    .2031499
------------------------------------------------------------------------------

Strikeouts:

Code:

  Source |       SS       df       MS                  Number of obs =    1552
---------+------------------------------               F(  1,  1550) = 4162.68
   Model |  3.69607201     1  3.69607201               Prob > F      =  0.0000
Residual |  1.37625702  1550  .000887908               R-squared     =  0.7287
---------+------------------------------               Adj R-squared =  0.7285
   Total |  5.07232903  1551   .00327036               Root MSE      =   .0298

------------------------------------------------------------------------------
     kab |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    kabl |   .8700271   .0134849     64.519   0.000       .8435766    .8964776
   _cons |   .0219328   .0024468      8.964   0.000       .0171333    .0267323
------------------------------------------------------------------------------

Home runs:

Code:

  Source |       SS       df       MS                  Number of obs =    1552
---------+------------------------------               F(  1,  1550) = 2479.21
   Model |  .434785083     1  .434785083               Prob > F      =  0.0000
Residual |  .271827476  1550  .000175373               R-squared     =  0.6153
---------+------------------------------               Adj R-squared =  0.6151
   Total |  .706612559  1551  .000455585               Root MSE      =  .01324

------------------------------------------------------------------------------
    hrab |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   hrabl |   .7860592    .015787     49.792   0.000       .7550932    .8170253
   _cons |   .0077772   .0006912     11.252   0.000       .0064215    .0091329
------------------------------------------------------------------------------

Walks:

Code:

  Source |       SS       df       MS                  Number of obs =    1552
---------+------------------------------               F(  1,  1550) = 2736.22
   Model |  1.34499246     1  1.34499246               Prob > F      =  0.0000
Residual |  .761904306  1550  .000491551               R-squared     =  0.6384
---------+------------------------------               Adj R-squared =  0.6381
   Total |  2.10689676  1551  .001358412               Root MSE      =  .02217

------------------------------------------------------------------------------
    bbpa |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   bbpal |   .8047551   .0153847     52.309   0.000       .7745781     .834932
   _cons |   .0195353   .0015831     12.340   0.000       .0164301    .0226404
------------------------------------------------------------------------------

Doubles:

Code:

  Source |       SS       df       MS                  Number of obs =    1552
---------+------------------------------               F(  1,  1550) =  229.04
   Model |  .034167622     1  .034167622               Prob > F      =  0.0000
Residual |  .231227293  1550  .000149179               R-squared     =  0.1287
---------+------------------------------               Adj R-squared =  0.1282
   Total |  .265394915  1551  .000171112               Root MSE      =  .01221

------------------------------------------------------------------------------
    doab |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   doabl |   .3581552   .0236656     15.134   0.000       .3117352    .4045752
   _cons |   .0359054   .0013683     26.241   0.000       .0332214    .0385893
------------------------------------------------------------------------------

Triples:

Code:

  Source |       SS       df       MS                  Number of obs =    1552
---------+------------------------------               F(  1,  1550) =  459.39
   Model |  .007917137     1  .007917137               Prob > F      =  0.0000
Residual |  .026712629  1550  .000017234               R-squared     =  0.2286
---------+------------------------------               Adj R-squared =  0.2281
   Total |  .034629766  1551  .000022327               Root MSE      =  .00415

------------------------------------------------------------------------------
    trab |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   trabl |   .4575906   .0213494     21.433   0.000       .4157138    .4994673
   _cons |   .0027058   .0001639     16.509   0.000       .0023843    .0030273
------------------------------------------------------------------------------

Hit by pitches:

Code:

  Source |       SS       df       MS                  Number of obs =    1552
---------+------------------------------               F(  1,  1550) = 1148.05
   Model |  .036916585     1  .036916585               Prob > F      =  0.0000
Residual |  .049841623  1550  .000032156               R-squared     =  0.4255
---------+------------------------------               Adj R-squared =  0.4251
   Total |  .086758208  1551  .000055937               Root MSE      =  .00567

------------------------------------------------------------------------------
    hppa |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   hppal |   .6538325   .0192968     33.883   0.000       .6159818    .6916831
   _cons |   .0032995   .0002278     14.485   0.000       .0028527    .0037463
------------------------------------------------------------------------------

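So hitters show some year-to-year consistency in all these areas, and especially in strikeouts, walks, homers, and hit by pitches (coefficients of 0.65 to 0.87, versus 0.36 to 0.47 for average, BABIP, doubles, and triples).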

The OOTP figures now:

Batting average:

Code:

  Source |       SS       df       MS                  Number of obs =    1401
---------+------------------------------               F(  1,  1399) =  540.07
   Model |  .362675521     1  .362675521               Prob > F      =  0.0000
Residual |  .939475576  1399  .000671534               R-squared     =  0.2785
---------+------------------------------               Adj R-squared =  0.2780
   Total |   1.3021511  1400  .000930108               Root MSE      =  .02591

------------------------------------------------------------------------------
     avg |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    avgl |   .5367331   .0230958     23.239   0.000       .4914269    .5820392
   _cons |    .129984   .0065874     19.732   0.000       .1170616    .1429063
------------------------------------------------------------------------------

. test avgl=.465

 ( 1)  avgl = .465

       F(  1,  1399) =    9.65
            Prob > F =    0.0019

Hitters are slightly too consistent in average in OOTP.

BABIP:

Code:

  Source |       SS       df       MS                  Number of obs =    1401
---------+------------------------------               F(  1,  1399) =  641.58
   Model |  .528494232     1  .528494232               Prob > F      =  0.0000
Residual |   1.1524156  1399  .000823742               R-squared     =  0.3144
---------+------------------------------               Adj R-squared =  0.3139
   Total |  1.68090983  1400   .00120065               Root MSE      =   .0287

------------------------------------------------------------------------------
   babip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  babipl |   .5572015   .0219982     25.329   0.000       .5140484    .6003546
   _cons |   .1355438    .006887     19.681   0.000       .1220338    .1490537
------------------------------------------------------------------------------

. test babipl=.381

 ( 1)  babipl = .381

       F(  1,  1399) =   64.16
            Prob > F =    0.0000

Hitters are definitely too consistent in BABIP, which probably accounts for all of the over-consistency in average. There should actually be more talent changes in BABIP (and hence Contact) from year to year - or perhaps fielders should be given more control of ball in play outcomes.

Strikeouts:

Code:

  Source |       SS       df       MS                  Number of obs =    1401
---------+------------------------------               F(  1,  1399) = 3272.80
   Model |  2.10330394     1  2.10330394               Prob > F      =  0.0000
Residual |  .899083785  1399  .000642662               R-squared     =  0.7005
---------+------------------------------               Adj R-squared =  0.7003
   Total |  3.00238773  1400  .002144563               Root MSE      =  .02535

------------------------------------------------------------------------------
     kab |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    kabl |   .8351947   .0145992     57.208   0.000       .8065561    .8638333
   _cons |   .0259216   .0024843     10.434   0.000       .0210481     .030795
------------------------------------------------------------------------------

. test kabl=.870

 ( 1)  kabl = .87

       F(  1,  1399) =    5.68
            Prob > F =    0.0173

OOTP is fine here: the difference is not significant at the 99% level.

Home runs:

Code:

  Source |       SS       df       MS                  Number of obs =    1401
---------+------------------------------               F(  1,  1399) = 2107.96
   Model |  .274643686     1  .274643686               Prob > F      =  0.0000
Residual |  .182273975  1399  .000130289               R-squared     =  0.6011
---------+------------------------------               Adj R-squared =  0.6008
   Total |   .45691766  1400   .00032637               Root MSE      =  .01141

------------------------------------------------------------------------------
    hrab |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   hrabl |   .7650667   .0166636     45.913   0.000       .7323784    .7977549
   _cons |   .0079987   .0006432     12.437   0.000       .0067371    .0092604
------------------------------------------------------------------------------

. test hrabl=.786

 ( 1)  hrabl = .786

       F(  1,  1399) =    1.58
            Prob > F =    0.2092

OOTP is basically perfect here.

Walks:

Code:

  Source |       SS       df       MS                  Number of obs =    1401
---------+------------------------------               F(  1,  1399) = 4259.64
   Model |  1.53287944     1  1.53287944               Prob > F      =  0.0000
Residual |  .503445604  1399  .000359861               R-squared     =  0.7528
---------+------------------------------               Adj R-squared =  0.7526
   Total |  2.03632504  1400  .001454518               Root MSE      =  .01897

------------------------------------------------------------------------------
    bbpa |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   bbpal |   .8690225   .0133151     65.266   0.000       .8429027    .8951422
   _cons |   .0161037   .0014988     10.744   0.000       .0131634    .0190439
------------------------------------------------------------------------------

. test bbpal=.805

 ( 1)  bbpal = .805

       F(  1,  1399) =   23.12
            Prob > F =    0.0000

Hitters are too consistent in walks in OOTP, but the difference is not enough to be worried about IMO.

Doubles:

Code:

  Source |       SS       df       MS                  Number of obs =    1401
---------+------------------------------               F(  1,  1399) =  662.35
   Model |  .107429501     1  .107429501               Prob > F      =  0.0000
Residual |  .226909201  1399  .000162194               R-squared     =  0.3213
---------+------------------------------               Adj R-squared =  0.3208
   Total |  .334338703  1400  .000238813               Root MSE      =  .01274

------------------------------------------------------------------------------
    doab |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   doabl |   .5662438   .0220018     25.736   0.000       .5230837     .609404
   _cons |   .0241442    .001316     18.346   0.000       .0215626    .0267258
------------------------------------------------------------------------------

. test doabl=.358

 ( 1)  doabl = .358

       F(  1,  1399) =   89.58
            Prob > F =    0.0000

Hitters are way too consistent in doubles in OOTP.

Triples:

Code:

  Source |       SS       df       MS                  Number of obs =    1401
---------+------------------------------               F(  1,  1399) =  227.31
   Model |  .004913495     1  .004913495               Prob > F      =  0.0000
Residual |  .030240005  1399  .000021615               R-squared     =  0.1398
---------+------------------------------               Adj R-squared =  0.1392
   Total |    .0351535  1400   .00002511               Root MSE      =  .00465

------------------------------------------------------------------------------
    trab |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   trabl |   .3715826   .0246458     15.077   0.000        .323236    .4199293
   _cons |   .0044159   .0002173     20.321   0.000       .0039896    .0048422
------------------------------------------------------------------------------

. test trabl=.458

 ( 1)  trabl = .458

       F(  1,  1399) =   12.29
            Prob > F =    0.0005

Hitters are actually not consistent enough in triples in OOTP, but the difference again is probably not enough to worry about.

Hit by pitches:

Code:

  Source |       SS       df       MS                  Number of obs =    1401
---------+------------------------------               F(  1,  1399) = 1184.30
   Model |  .035344824     1  .035344824               Prob > F      =  0.0000
Residual |  .041752572  1399  .000029845               R-squared     =  0.4584
---------+------------------------------               Adj R-squared =  0.4581
   Total |  .077097397  1400   .00005507               Root MSE      =  .00546

------------------------------------------------------------------------------
   hbppa |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  hbppal |   .6617633   .0192297     34.414   0.000       .6240412    .6994855
   _cons |   .0035469   .0002467     14.375   0.000       .0030628    .0040309
------------------------------------------------------------------------------

. test hbppal=.654

 ( 1)  hbppal = .654

       F(  1,  1399) =    0.16
            Prob > F =    0.6865

OOTP is perfect here.
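The F statistics in the hitter tests above can be reproduced from the printed coefficients and standard errors, since a Wald test of a single coefficient against a fixed value is just the squared t-ratio. The Python sketch below does that for all eight stats. The 99% cutoff is a large-sample approximation (the squared normal quantile) rather than the exact F(1, 1399) critical value, and like the `test` commands above it treats the MLB coefficient as a fixed number.

```python
from statistics import NormalDist

# (stat, OOTP coefficient, OOTP std. err., MLB value tested, F reported above)
results = [
    ("avg",   0.5367331, 0.0230958, 0.465, 9.65),
    ("babip", 0.5572015, 0.0219982, 0.381, 64.16),
    ("kab",   0.8351947, 0.0145992, 0.870, 5.68),
    ("hrab",  0.7650667, 0.0166636, 0.786, 1.58),
    ("bbpa",  0.8690225, 0.0133151, 0.805, 23.12),
    ("doab",  0.5662438, 0.0220018, 0.358, 89.58),
    ("trab",  0.3715826, 0.0246458, 0.458, 12.29),
    ("hbppa", 0.6617633, 0.0192297, 0.654, 0.16),
]

# Approximate 99% critical value for an F test with one restriction.
CRITICAL_99 = NormalDist().inv_cdf(0.995) ** 2


def wald_f(coef, se, target):
    """Wald F statistic for H0: coefficient == target (one restriction)."""
    return ((coef - target) / se) ** 2


def rejected(results):
    """Stats where equality with the MLB coefficient is rejected at 99%."""
    return [name for name, coef, se, target, _ in results
            if wald_f(coef, se, target) > CRITICAL_99]


for name, coef, se, target, reported in results:
    print(f"{name:>6}  F = {wald_f(coef, se, target):6.2f}  (reported {reported})")
print("rejected at 99%:", rejected(results))
```

The stats that fail the test are the same ones flagged in the comments above; strikeouts, home runs, and hit by pitches pass.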

Elendil · 06-02-2007, 02:58 PM

So now I do the same with pitchers that I just did with hitters. Again, only pitchers with at least 162 innings are considered. First, MLB figures.

Home runs allowed:

Code:

  Source |       SS       df       MS                  Number of obs =     548
---------+------------------------------               F(  1,   546) =  145.49
   Model |  .127441738     1  .127441738               Prob > F      =  0.0000
Residual |  .478278436   546  .000875968               R-squared     =  0.2104
---------+------------------------------               Adj R-squared =  0.2090
   Total |  .605720174   547  .001107349               Root MSE      =   .0296

------------------------------------------------------------------------------
    hrip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   hripl |   .4597914   .0381197     12.062   0.000       .3849123    .5346706
   _cons |   .0646603    .004436     14.576   0.000       .0559465    .0733741
------------------------------------------------------------------------------

Walks allowed:

Code:

  Source |       SS       df       MS                  Number of obs =     548
---------+------------------------------               F(  1,   546) =  469.52
   Model |  2.33592822     1  2.33592822               Prob > F      =  0.0000
Residual |  2.71644331   546  .004975171               R-squared     =  0.4623
---------+------------------------------               Adj R-squared =  0.4614
   Total |  5.05237152   547  .009236511               Root MSE      =  .07053

------------------------------------------------------------------------------
    bbip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   bbipl |   .6940592    .032031     21.668   0.000         .63114    .7569783
   _cons |   .0939304   .0108658      8.645   0.000       .0725866    .1152742
------------------------------------------------------------------------------

Hit by pitches allowed:

Code:

  Source |       SS       df       MS                  Number of obs =     548
---------+------------------------------               F(  1,   546) =  161.59
   Model |  .044273656     1  .044273656               Prob > F      =  0.0000
Residual |  .149594497   546  .000273983               R-squared     =  0.2284
---------+------------------------------               Adj R-squared =  0.2270
   Total |  .193868153   547  .000354421               Root MSE      =  .01655

------------------------------------------------------------------------------
   hbpip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  hbpipl |    .467494    .036776     12.712   0.000       .3952541    .5397338
   _cons |   .0172597   .0013936     12.385   0.000       .0145223    .0199971
------------------------------------------------------------------------------

Wild pitches allowed:

Code:

  Source |       SS       df       MS                  Number of obs =     548
---------+------------------------------               F(  1,   546) =  228.92
   Model |  .055085991     1  .055085991               Prob > F      =  0.0000
Residual |  .131385957   546  .000240634               R-squared     =  0.2954
---------+------------------------------               Adj R-squared =  0.2941
   Total |  .186471949   547  .000340899               Root MSE      =  .01551

------------------------------------------------------------------------------
    wpip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   wpipl |   .5641967   .0372896     15.130   0.000        .490948    .6374454
   _cons |   .0117504   .0012279      9.569   0.000       .0093383    .0141625
------------------------------------------------------------------------------

Strikeouts:

Code:

  Source |       SS       df       MS                  Number of obs =     548
---------+------------------------------               F(  1,   546) =  825.57
   Model |  10.4741974     1  10.4741974               Prob > F      =  0.0000
Residual |  6.92724866   546  .012687269               R-squared     =  0.6019
---------+------------------------------               Adj R-squared =  0.6012
   Total |  17.4014461   547  .031812516               Root MSE      =  .11264

------------------------------------------------------------------------------
     kip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    kipl |   .7905794    .027515     28.733   0.000       .7365312    .8446276
   _cons |   .1415202   .0208551      6.786   0.000        .100554    .1824863
------------------------------------------------------------------------------

BABIP:

Code:

  Source |       SS       df       MS                  Number of obs =     548
---------+------------------------------               F(  1,   546) =   13.76
   Model |  .005061354     1  .005061354               Prob > F      =  0.0002
Residual |  .200865233   546  .000367885               R-squared     =  0.0246
---------+------------------------------               Adj R-squared =  0.0228
   Total |  .205926587   547  .000376465               Root MSE      =  .01918

------------------------------------------------------------------------------
   babip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  babipl |   .1517221   .0409046      3.709   0.000       .0713726    .2320717
   _cons |   .2382242   .0114752     20.760   0.000       .2156833    .2607651
------------------------------------------------------------------------------

Pitchers show some year-to-year consistency in all these areas, but the real outliers are BABIP (little consistency: a coefficient of about 0.15 and an R-squared under 0.03) and strikeouts and walks (a lot of consistency: coefficients of about 0.79 and 0.69).
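For anyone decoding the tables above, here is a minimal sketch of how the summary column on the right follows from the sums of squares on the left. It uses the numbers from the MLB home-runs-allowed table; the variable names are just illustrative, and this is ordinary regression arithmetic rather than anything specific to how the tables were produced.

```python
import math

# Values copied from the MLB home-runs-allowed table.
model_ss = 0.127441738   # "Model" sum of squares
resid_ss = 0.478278436   # "Residual" sum of squares
n_obs = 548              # Number of obs
k = 1                    # one regressor: last year's HR per inning

total_ss = model_ss + resid_ss
resid_df = n_obs - k - 1                      # 546, as in the table

r2 = model_ss / total_ss                      # share of variance explained
adj_r2 = 1 - (1 - r2) * (n_obs - 1) / resid_df
f_stat = (model_ss / k) / (resid_ss / resid_df)
root_mse = math.sqrt(resid_ss / resid_df)     # typical size of a residual

print(f"R-squared     = {r2:.4f}")
print(f"Adj R-squared = {adj_r2:.4f}")
print(f"F(1, {resid_df})   = {f_stat:.2f}")
print(f"Root MSE      = {root_mse:.4f}")
```

Up to rounding, these should match the R-squared, Adj R-squared, F, and Root MSE lines printed in that table.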

Now for the OOTP figures:

Home runs allowed:

Code:

  Source |       SS       df       MS                  Number of obs =     720
---------+------------------------------               F(  1,   718) =  246.81
   Model |  .226501802     1  .226501802               Prob > F      =  0.0000
Residual |  .658909131   718  .000917701               R-squared     =  0.2558
---------+------------------------------               Adj R-squared =  0.2548
   Total |  .885410932   719  .001231448               Root MSE      =  .03029

------------------------------------------------------------------------------
    hrip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   hripl |   .5183204   .0329923     15.710   0.000       .4535474    .5830933
   _cons |   .0536609   .0036754     14.600   0.000       .0464452    .0608767
------------------------------------------------------------------------------

. test hripl=.460

 ( 1)  hripl = .46

       F(  1,   718) =    3.12
            Prob > F =    0.0775

OOTP is basically right.

Walks allowed:

Code:

  Source |       SS       df       MS                  Number of obs =     720
---------+------------------------------               F(  1,   718) =  847.50
   Model |   3.2546525     1   3.2546525               Prob > F      =  0.0000
Residual |  2.75734708   718  .003840316               R-squared     =  0.5414
---------+------------------------------               Adj R-squared =  0.5407
   Total |  6.01199958   719  .008361613               Root MSE      =  .06197

------------------------------------------------------------------------------
    bbip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   bbipl |   .7173578   .0246415     29.112   0.000       .6689798    .7657358
   _cons |   .0993492   .0092342     10.759   0.000         .08122    .1174784
------------------------------------------------------------------------------

. test bbipl=.694

 ( 1)  bbipl = .694

       F(  1,   718) =    0.90
            Prob > F =    0.3435

OOTP is very good.

HBP's:

Code:

  Source |       SS       df       MS                  Number of obs =     720
---------+------------------------------               F(  1,   718) =  126.32
   Model |  .045867933     1  .045867933               Prob > F      =  0.0000
Residual |  .260707399   718  .000363102               R-squared     =  0.1496
---------+------------------------------               Adj R-squared =  0.1484
   Total |  .306575332   719  .000426391               Root MSE      =  .01906

------------------------------------------------------------------------------
   hbpip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  hbpipl |   .3977048   .0353851     11.239   0.000       .3282342    .4671755
   _cons |   .0276887   .0017652     15.686   0.000       .0242231    .0311543
------------------------------------------------------------------------------

. test hbpipl=.467

 ( 1)  hbpipl = .467

       F(  1,   718) =    3.83
            Prob > F =    0.0506

OOTP is close again.

Wild pitches:

Code:

  Source |       SS       df       MS                  Number of obs =     720
---------+------------------------------               F(  1,   718) =   41.02
   Model |  .007585682     1  .007585682               Prob > F      =  0.0000
Residual |  .132780873   718  .000184932               R-squared     =  0.0540
---------+------------------------------               Adj R-squared =  0.0527
   Total |  .140366555   719  .000195225               Root MSE      =   .0136

------------------------------------------------------------------------------
    wpip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   wpipl |     .22923   .0357915      6.405   0.000       .1589615    .2994985
   _cons |    .025604   .0013155     19.463   0.000       .0230212    .0281867
------------------------------------------------------------------------------

. test wpipl=.564

 ( 1)  wpipl = .564

       F(  1,   718) =   87.49
            Prob > F =    0.0000

OOTP is way off here. Pitchers need to be much more consistent in the wild pitches they give up from year to year. Perhaps catchers are being given too much control over whether a wild pitch happens? Or do WP ratings actually change too much from year to year?

Strikeouts:

Code:

  Source |       SS       df       MS                  Number of obs =     720
---------+------------------------------               F(  1,   718) = 2927.45
   Model |  22.1810339     1  22.1810339               Prob > F      =  0.0000
Residual |  5.44022095   718  .007576909               R-squared     =  0.8030
---------+------------------------------               Adj R-squared =  0.8028
   Total |  27.6212548   719   .03841621               Root MSE      =  .08705

------------------------------------------------------------------------------
     kip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    kipl |   .8885372   .0164222     54.106   0.000        .856296    .9207784
   _cons |   .0647741   .0126715      5.112   0.000       .0398964    .0896518
------------------------------------------------------------------------------

. test kipl=.791

 ( 1)  kipl = .791

       F(  1,   718) =   35.28
            Prob > F =    0.0000

OOTP pitchers are a little too consistent, but probably nothing to worry about.
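To put "a little too consistent" in concrete terms, here is a rough sketch using the two fitted strikeout lines from the tables. The 0.20 K/IP gap is a made-up illustration, and the "fixed point" (the rate at which the predicted rate equals last year's) is only a stand-in for the league-typical rate, on the assumption that the average does not drift much from year to year.

```python
# Fitted lines copied from the MLB and OOTP strikeout tables:
# this year's K/IP = cons + slope * last year's K/IP
MLB = {"cons": 0.1415202, "slope": 0.7905794}
OOTP = {"cons": 0.0647741, "slope": 0.8885372}

def predict(fit, last_year):
    """Predicted K/IP this year given K/IP last year."""
    return fit["cons"] + fit["slope"] * last_year

def fixed_point(fit):
    """Rate at which the prediction equals last year's rate."""
    return fit["cons"] / (1 - fit["slope"])

gap = 0.20  # hypothetical: a pitcher 0.20 K/IP above the fixed point last year

for name, fit in (("MLB", MLB), ("OOTP", OOTP)):
    base = fixed_point(fit)
    kept = predict(fit, base + gap) - base   # equals slope * gap
    print(f"{name}: fixed point {base:.3f} K/IP, keeps {kept:.3f} of a {gap:.2f} gap")
```

The slope is the share of last year's edge a pitcher is expected to keep, so the difference between 0.79 and 0.89 works out to about two hundredths of a strikeout per inning on a 0.20 gap.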

BABIP:

Code:

  Source |       SS       df       MS                  Number of obs =     720
---------+------------------------------               F(  1,   718) =   58.50
   Model |  .028920706     1  .028920706               Prob > F      =  0.0000
Residual |  .354969459   718  .000494386               R-squared     =  0.0753
---------+------------------------------               Adj R-squared =  0.0740
   Total |  .383890165   719  .000533922               Root MSE      =  .02223

------------------------------------------------------------------------------
   babip |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  babipl |   .2699635   .0352967      7.648   0.000       .2006665    .3392605
   _cons |   .2183511   .0105917     20.615   0.000       .1975566    .2391455
------------------------------------------------------------------------------

. test babipl=.152

 ( 1)  babipl = .152

       F(  1,   718) =   11.17
            Prob > F =    0.0009

OOTP pitchers are actually too consistent in BABIP.
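One caveat on the tests above: they treat the MLB coefficient as a fixed number, although it is itself an estimate with its own standard error. A hedged alternative, not what was run for this post, is a two-sample z-test that pools both standard errors. The sketch below assumes the MLB and OOTP samples are independent and large enough for a normal approximation; the function and dictionary names are illustrative.

```python
import math
from statistics import NormalDist

# stat: (MLB coef, MLB std. err., OOTP coef, OOTP std. err.), copied from the tables
FITS = {
    "hrip":  (0.4597914, 0.0381197, 0.5183204, 0.0329923),
    "bbip":  (0.6940592, 0.0320310, 0.7173578, 0.0246415),
    "hbpip": (0.4674940, 0.0367760, 0.3977048, 0.0353851),
    "wpip":  (0.5641967, 0.0372896, 0.2292300, 0.0357915),
    "kip":   (0.7905794, 0.0275150, 0.8885372, 0.0164222),
    "babip": (0.1517221, 0.0409046, 0.2699635, 0.0352967),
}

def compare(b_mlb, se_mlb, b_ootp, se_ootp):
    """z statistic and two-sided p-value for the difference of two coefficients."""
    z = (b_ootp - b_mlb) / math.hypot(se_mlb, se_ootp)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

for stat, values in FITS.items():
    z, p = compare(*values)
    print(f"{stat:6s} z = {z:6.2f}   p = {p:.4f}")
```

Allowing for the uncertainty in the MLB estimates makes every gap look somewhat less significant. Wild pitches and strikeouts should still stand out clearly, while the BABIP difference becomes less clear-cut.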

Overall, then, OOTP again leans toward players' being a little too consistent from year to year. Increasing the number of random talent changes might be a solution, although it might damage the gameplay: no one likes losing a superstar prospect to sudden, severe talent hits. Also, in some areas, OOTP players aren't sufficiently consistent, and that would become a worse problem.

Another solution might be to reduce the talent change modifier, which would make players even more consistent than they should be, but to play with talents on a crude scale, like 1-5 or 2-8, and ratings off completely, so that you have to go by stats to figure out who really is a good player. Of course, the AI will have even more of an advantage because it knows the true ratings, and if player development changes are dialed down, those true ratings are even more important for future performance.

Finally, you could increase the injury rate instead of the talent change rate, so that players get more talent changes as a result of injury.

Overall, though, OOTP 2007 does a much better job of modeling real baseball than did OOTP 2006, to say nothing of prior versions. There are still some areas that can be worked on for 2008, but the advances have been tremendous.

Elendil · 06-02-2007, 02:59 PM

Quote:

Originally Posted by AirmenSmith

I'm sure this is interesting but my attention is too short to read it all

Just read the final paragraph of each post.

pstrickert · 06-02-2007, 03:27 PM

Quote:

Originally Posted by Elendil

Overall, though, OOTP 2007 does a much better job of modeling real baseball than did OOTP 2006, to say nothing of prior versions. There are still some areas that can be worked on for 2008, but the advances have been tremendous.

I can't even begin to imagine the time and effort Markus put into OOTP 2007 to achieve these results. Wow.

AirmenSmith · 06-02-2007, 03:34 PM

Quote:

Originally Posted by Elendil

Just read the final paragraph of each post.

Ah so you came prepared for ppl like me

thanks

Eugene Church · 06-02-2007, 04:16 PM

Thank you for sharing your research with us.

Assos · 06-02-2007, 04:28 PM

*Assos is proud to award you a Ph.D in Baseball Stats*

injury log · 06-02-2007, 04:54 PM

Really fascinating stuff. I'm digesting the Pearson correlations now, but I'd love to get a walk-through on one of the Hausman tables- the interweb isn't being very helpful as a stats refresher. Only if you have a chance- I hate to ask, considering all the careful work you've put into preparing and presenting the data.

Elendil · 06-02-2007, 05:26 PM

Quote:

Originally Posted by injury log

Really fascinating stuff. I'm digesting the Pearson correlations now, but I'd love to get a walk-through on one of the Hausman tables- the interweb isn't being very helpful as a stats refresher. Only if you have a chance- I hate to ask, considering all the careful work you've put into preparing and presenting the data.

Well, those tables show simple regressions of the kind y=a+bx, where "y" is "this year's stat," "a" is the constant, "b" is a coefficient, and "x" is "last year's stat." So after I run those regressions for the MLB sample, I run the same regressions on the OOTP sample, and after each one I do a Hausman test to determine whether the coefficient ("b") in each OOTP-sample regression is equal to the coefficient in the corresponding MLB-sample regression. (Hope that's clear enough.)
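As a minimal sketch of what one of those y=a+bx regressions does, here is ordinary least squares written out by hand. The five (last year, this year) pairs are made-up numbers for illustration only, not data from either sample, and fit_line is just an illustrative name.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                  # the coefficient on last year's stat
    a = my - b * mx                # the constant ("_cons" in the tables)
    r2 = sxy ** 2 / (sxx * syy)    # share of variance explained
    return a, b, r2

# Hypothetical K/IP pairs for five made-up pitchers: (last year, this year)
pairs = [(0.55, 0.60), (0.70, 0.68), (0.85, 0.80), (0.95, 0.97), (1.10, 1.00)]
last_year = [p[0] for p in pairs]
this_year = [p[1] for p in pairs]

a, b, r2 = fit_line(last_year, this_year)
print(f"this_year = {a:.3f} + {b:.3f} * last_year   (R-squared {r2:.3f})")
```

A "b" near 1 means last year's number carries over almost entirely; a "b" near 0 means it tells you very little about this year.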

So the Hausman test itself is actually quite simple. You take the coefficient and its standard error and use a t-statistic to determine whether you can reject, at a particular level of confidence, the null hypothesis that the coefficient is equal to some number (in this case, the coefficient derived from the equivalent MLB-sample regression). So a Hausman test is really just a generalization of the more usual test of statistical significance done in a regression, i.e., that b=0.
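The arithmetic described above can be sketched in a few lines. Stata's test command after a regression is, as far as I know, a Wald test of a linear restriction, and with a single restriction the F statistic is just the t-statistic squared. The p-value here uses a normal approximation, which is an assumption that should be close with 718 degrees of freedom but will not match the exact F-distribution figure to every digit.

```python
from statistics import NormalDist

def wald_f(coef, se, target):
    """F statistic for H0: coefficient == target (one restriction, so F = t^2)."""
    t = (coef - target) / se
    return t * t

def approx_p(f_stat):
    """Two-sided p-value from a normal approximation (reasonable for large df)."""
    return 2 * (1 - NormalDist().cdf(f_stat ** 0.5))

# variable: (OOTP coef, OOTP std. err., rounded MLB coef used as the target)
TESTS = {
    "hripl":  (0.5183204, 0.0329923, 0.460),
    "bbipl":  (0.7173578, 0.0246415, 0.694),
    "hbpipl": (0.3977048, 0.0353851, 0.467),
    "wpipl":  (0.2292300, 0.0357915, 0.564),
    "kipl":   (0.8885372, 0.0164222, 0.791),
    "babipl": (0.2699635, 0.0352967, 0.152),
}

for name, (coef, se, target) in TESTS.items():
    f = wald_f(coef, se, target)
    print(f"{name:7s} F(1, 718) = {f:6.2f}   approx p = {approx_p(f):.4f}")
```

The F values should agree with the ones under each OOTP table to within rounding of the displayed coefficients.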

RonCo · 06-02-2007, 11:21 PM

Nice work, as always.

injury log · 06-03-2007, 09:22 AM

Quote:

Originally Posted by Elendil

Well, those tables show simple regressions of the kind y=a+bx, where "y" is "this year's stat," "a" is the constant, "b" is a coefficient, and "x" is "last year's stat." So after I run those regressions for the MLB sample, I run the same regressions on the OOTP sample, and after each one I do a Hausman test to determine whether the coefficient ("b") in each OOTP-sample regression is equal to the coefficient in the corresponding MLB-sample regression. (Hope that's clear enough.)

So the Hausman test itself is actually quite simple. You take the coefficient and its standard error and use a t-statistic to determine whether you can reject, at a particular level of confidence, the null hypothesis that the coefficient is equal to some number (in this case, the coefficient derived from the equivalent MLB-sample regression). So a Hausman test is really just a generalization of the more usual test of statistical significance done in a regression, i.e., that b=0.

Thanks very much- I appreciate the explanation. It seems clear, but I need some coffee before I get into the details!

Mattingly06 · 06-03-2007, 10:58 AM

Thanks Elendil, great job. Any chance you can crunch some numbers on minor leaguers? I always wondered how good the development engine is in this game.

1998 Yankees · 06-03-2007, 11:08 AM

Quote:

Originally Posted by AirmenSmith

I'm sure this is interesting but my attention is too short to read it all

LOL. It's like this comment was squashed in between all that data.

1998 Yankees · 06-03-2007, 11:10 AM

Quote:

Originally Posted by RonCo

Nice work, as always.

I'm sure it is, and I'm glad folks are willing to do this extensive analysis to "keep things honest," but I have a sudden urge to log off now and play OOTP.

Mike Donlin · 06-03-2007, 11:44 AM

Thanks for the work. It's good to see OOTP mostly getting these things right.

Syd Thrift · 06-03-2007, 01:11 PM

Quote:

Originally Posted by RonCo

Nice work, as always.

To you as well. I have to say, the attention to detail that you and Elendil showed during the beta process coupled with Markus's ability to take your data and implement it into the game was nothing short of incredible. One of the things that sets this game apart from every baseball career sim ever created - no, every sports career sim ever created - is that stuff like this wasn't guessed at, it was closely studied, gone over with a fine-toothed comb, and re-studied over and over and over again. Is it 100% accurate? No, but man oh man is it close.

There are times I've played other games and even past versions of OOTP where something has happened and I've said "come on, is that something that would REALLY happen?". I don't believe I've ever said that in this version, and that, folks, is a huge part of why this is the best PC game in the history of PC games (okay, metacritic didn't rate Wasteland or Baldur's Gate, but it does go pretty far back).

RonCo · 06-03-2007, 01:24 PM

I should mention that I think Elendil's commentary about needing more talent changes in certain areas is not quite right. Yes, it could be made to work that way, but in order to get less consistency, we need more _ratings_ changes, not talent changes. In the current OOTP model, ratings follow talent, so perhaps that's saying that ratings need to grow or fade faster than they do in the current OOTP model. "Inconsistency" does not specifically mean the variance must necessarily be both positive and negative.

The consternation about talent changes in other areas of the forum is that OOTP talents go down a whole lot, and up almost never. So this dynamic is not being used to induce inconsistency. We worked hard on the basic growing and fading of each skill in v2007, and I think that work is showing well. But v2008 can be made better by addressing the year-to-year consistency question, and I suggest that fine-tuning the growth process at the rating level, and improving the fielding model are the design items that will be most useful in this area. Oscillating talent will not do much to affect year-to-year stats consistency unless the ratings development rate were made much, much more rapid, which would in turn cause many more issues.
