Thanks again for the quick response.
My software background says to check the integrity of the data first, then the software logic that uses it, to avoid chasing your tail.
I edited my post because I type very poorly and much slower than I think. The point I was trying to make was:
Quote:
|
That formula would not create an inverted positional preference. (e.g. Jackson played 59 games in CF and 35 in LF in real life, but in the neutralized data set he plays 75 in LF and still only plays 59 in CF).
|
There are lots of examples of low real-life game totals being dropped, both outfield and infield, in the neutralization process. (McGraw and Wallace of the 1901 set come to mind; it was their early years, not the 1901 season.) Jackson was the second player I looked at in my neutralized test of the beta patch, so maybe I was very lucky, but in the online leagues I see lots of these examples.
Quote:
|
And yes, for players with very few AB or IP or Defensive Innings, we did pro-rate them based on their career rates so that OOTP would generate proper ratings from these stats. This was done to avoid sample size issues when generating ratings.
|
This was my assumption, but the fact that OOTP does not create ratings needs to be looked at by the OOTP developers. I will go back to the base premise that if a player plays a position in a year, he should be rated at that position for that year. Regardless of what is discovered about Jackson, he should have a CF rating (and I would argue an RF rating as well).
I'll do some more analysis to see what is there. I can relate to the problem of checking 17,000+ players. I am assuming you use a t-test or other statistical method to check accuracy, but that is still not foolproof at the player/season level.
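For what it's worth, the kind of data-integrity check I have in mind could be sketched roughly as below. This is only an illustration: the function name and the dict layout (position mapped to games played) are my own assumptions, not anything taken from OOTP or from the neutralization tool. The only real numbers are the Jackson figures quoted above.

```python
def position_flags(real_games, neutral_games):
    """Compare games-by-position for one player-season.

    Both arguments are dicts mapping position -> games played
    (a hypothetical layout, not an OOTP file format).
    Returns (dropped, inverted):
      dropped  - positions played in real life but absent from the neutralized data
      inverted - position pairs whose ordering by games played flipped
    """
    dropped = sorted(
        pos for pos, games in real_games.items()
        if games > 0 and neutral_games.get(pos, 0) == 0
    )

    inverted = []
    positions = sorted(real_games)
    for i, a in enumerate(positions):
        for b in positions[i + 1:]:
            real_diff = real_games[a] - real_games[b]
            neut_diff = neutral_games.get(a, 0) - neutral_games.get(b, 0)
            # Opposite signs mean the preference between a and b has flipped.
            if real_diff * neut_diff < 0:
                inverted.append((a, b))
    return dropped, inverted


# Jackson, using the figures quoted in this thread.
jackson_real = {"CF": 59, "LF": 35}
jackson_neutral = {"CF": 59, "LF": 75}
dropped, inverted = position_flags(jackson_real, jackson_neutral)
print("dropped:", dropped, "inverted:", inverted)
```

Run over every player-season, a check like this would list the cases worth inspecting by hand instead of relying on spot checks.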