05-20-2004, 02:23 PM   #2
chode
Bat Boy
 
Join Date: Oct 2003
Posts: 13
I am interested to hear any ideas as well, since I am doing the same thing.

Right now my plan is to just randomly select players from history, then import them individually from the Ankit database. The major problem is of course adjusting for different eras (so Derek Jeter isn't better than Honus Wagner just because of when he played). My "solution" is to take the league-average rates for each year and normalize players' stats to those. I only do it for the season I am importing, though.

For example, for each year in MLB history, I calculate the factors needed to normalize the rates of hits, 2B, 3B, HR, BB, and SO to the average over MLB history. For 2003, the factor for HR is probably something like 0.6. So when I import a hitter from 2003, I multiply his stats by those factors. If I import a guy with 19 HRs, I would multiply that number by 0.6 and get about 11, which would represent, in theory, the number of HRs that guy would hit in an average year. I do the same with pitchers as well.
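To make the arithmetic concrete, here is a rough Python sketch of that factor calculation. It is not the actual import process: the league totals are invented so that the HR factor lands near the 0.6 guess above, the choice of plate appearances (PA) as the rate denominator is an assumption, and the function names are made up for illustration.

```python
# Sketch only: all totals below are invented, and using plate
# appearances (PA) as the rate denominator is an assumption.
STATS = ("H", "2B", "3B", "HR", "BB", "SO")

def rates(totals):
    """Return each stat as a per-PA rate."""
    return {s: totals[s] / totals["PA"] for s in STATS}

def year_factors(year_totals, history_totals):
    """Factor per stat = all-time average rate / that season's rate."""
    season = rates(year_totals)
    all_time = rates(history_totals)
    return {s: all_time[s] / season[s] for s in STATS}

def normalize(player, factors):
    """Scale a player's counting stats by the season's factors."""
    return {s: player[s] * factors[s] for s in STATS}

# Invented league totals, chosen so the HR factor comes out near 0.6.
league_2003 = {"PA": 100000, "H": 23000, "2B": 4500, "3B": 500,
               "HR": 3000, "BB": 8500, "SO": 17000}
league_all_time = {"PA": 1000000, "H": 235000, "2B": 40000, "3B": 8000,
                   "HR": 18000, "BB": 85000, "SO": 120000}

factors = year_factors(league_2003, league_all_time)
hitter = {"H": 150, "2B": 30, "3B": 2, "HR": 19, "BB": 50, "SO": 90}
adjusted = normalize(hitter, factors)
print(round(factors["HR"], 2), round(adjusted["HR"], 1))
```

With these made-up totals the HR factor is about 0.6, so the 19 HR hitter comes out at roughly 11.4. A factor above 1 (triples, in this fake data) means the stat was rarer that season than over history, so the player's total gets bumped up instead.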

I only do this for the year that I import the guy - not for any previous or later years. This is due to the amount of time it would take to do all years (I also don't bother with guys who didn't play very much). This is not the optimal solution, but it should help somewhat.

If someone has created a properly normalized database equivalent to what I am doing half-assed, I haven't heard about it. But I am certainly far from an active member of this community.