Old 11-25-2004, 03:18 AM   #14
arod23
Database comparison

Regarding the rookie year issue and ctorg's proposed database, I wanted to see how players import under the different databases. Using Babe Ruth, I compared Lahman 5.2, ctorg's adjusted DB ((season + career avg) / 2), Ankit's DB, and Ankit's CareerAvg DB.

A review of the issues:
- Lahman has some missing data. AnkitDB fills in some of those holes and also cuts down the size of the database by eliminating anyone with a short career, i.e., the nonessential players.
- Players importing in their rookie years from Lahman do not develop as expected, primarily because of the limited number of games they played in their debut years. We know there is a chance that players will not 'perform' like they did in real life, but for many Lahman imports, including Ruth, there is no chance at all.
- AnkitDB solves some of these problems by removing those early years and importing players once they have a sufficient number of games, but for many players that is a few years after their actual debut year.
- However, some players in AnkitDB who have a sufficient number of games import with very low ratings because of poor early careers. Even though their talent figures might be promising, they won't get the opportunity to play.
- Ankit's career avg DB removes the risk of not developing by importing players with their career avg figures, so poor early years have little impact. But as with his regular DB, players do not import in their actual debut year.
- Note, Ankit suggests using his regular DB when starting a historical league, then using his career avg DB to import rookies.
- Ctorg proposes to create a 'compromise-type of DB' in which players debut in their actual debut year but still have a chance at developing.

Ruth Rookie Year Import Analysis:
- See table below
- Remember that Ankit's rookie years are based on when a player has a reasonable number of games, so Ruth imports in 1918 rather than 1914.
- Because Ruth played in only 5 games as a hitter in 1914, Lahman imports him with very low ratings and no talent. He has no chance of progressing to anything close to Ruth the hitter.
- Ctorg's ratings are better, maybe too high for a rookie, but the talent ratings are appropriate.
- AnkitDB's ratings are decent for Ruth as a rookie, not incredibly high contact or power but certainly a very good prospect. Of course, I believe this was Ankit's intent when he eliminated early years in a player's career. Talent ratings are appropriate.
- Note that AnkitDB talent ratings are similar to ctorg's. Keep in mind that if you import a player over and over again, the ratings will not be identical each time; they stay close, within a few points, but they do vary. So ctorg's and Ankit's talent ratings are effectively the same, which makes sense because they are based on roughly the same career totals (there is also a small difference because Ankit cut out Ruth's early years and ctorg did not).
- Ankit's CareerAvgDB imports Ruth at a level higher than his regular DB and also higher than ctorg's, maybe too high for Ruth as a rookie. Talent numbers are appropriate and similar to ctorg's and AnkitDB, again because they are all based on the same career.

Rookie Year Ratings							
DB             	Yr	Age	Contact	Gap	Power	Eye	Avoid K's
lahman      	1914	19	14	21	3	1	10
ctorg        	1914	19	82	79	100	134	61
AnkitDB     	1918	23	65	122	56	101	57
AnkCarAvg	1918	23	92	79	110	145	65

Rookie Year Talent							
DB             	Yr	Age	Contact	Gap	Power	Eye	Avoid K's
lahman      	1914	19	1	1	1	1	105
ctorg        	1914	19	96	86	108	147	61
AnkitDB     	1918	23	96	84	113	149	62
AnkCarAvg	1918	23	97	86	109	154	62
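
To put a number on "the talent ratings are really the same," here is a small Python sketch that takes the three adjusted databases from the talent table above and reports, for each category, the gap between the highest and lowest rating. The function name and layout are just my own illustration; the numbers are copied straight from the table.

```python
# Talent ratings copied from the "Rookie Year Talent" table above
# (Lahman is left out because its talent import is clearly broken).
CATEGORIES = ["Contact", "Gap", "Power", "Eye", "Avoid K's"]
TALENT = {
    "ctorg": [96, 86, 108, 147, 61],
    "AnkitDB": [96, 84, 113, 149, 62],
    "AnkCarAvg": [97, 86, 109, 154, 62],
}


def spread_by_category(ratings):
    """Return {category: highest - lowest} across the given databases."""
    columns = zip(*ratings.values())
    return {cat: max(col) - min(col) for cat, col in zip(CATEGORIES, columns)}


if __name__ == "__main__":
    for cat, spread in spread_by_category(TALENT).items():
        print(f"{cat:<10} spread = {spread}")
```

The largest gap is 7 points (Eye), which is roughly in line with the few points of variation you see when re-importing the same player.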

Conclusions:
- Lahman unadjusted will not produce good results. This is NOT because the database is bad (or, as some posts say, "Lahman sucks"). The database is great, accurate, and useful (and free, remember). However, the way OOTP uses it may create historical sim inaccuracies beyond what we would expect from random fluctuations.
- Ankit's DB works well for most players, especially if you adjust the development modifiers so players have a chance at reaching their potential. The only negatives are that players do not debut in their actual debut years, and that solid career players who started out slow (while having sufficient playing time) may not import with good enough ratings. This is not Ankit's fault, just a reality, and Ankit minimizes a lot of it.
- Ankit's career avg DB as an import for rookies allows solid players with poor early careers to have a chance. The negatives are that the debut year is not accurate and players may import at too high a level.
- ctorg's proposed DB seems like a reasonable solution: it combines the good qualities of Ankit's DBs and reduces the negatives (by the way, Ankit's pros far outweigh the cons). Players import in their actual debut years. Players should import with reasonable ratings, maybe high in some cases, but better overall since the ratings will fall between those of Ankit's DB and his career avg DB. It will also differentiate players who had subpar rookie years but solid careers. Players with a lot of playing time in their rookie years will look the same as Ankit's career avg DB imports, but players with limited playing time or below-career-average performance in their rookie year will import at more appropriate levels and so take a little longer to develop (but at least have a chance to), similar to real life. And since OOTP seems to factor in career averages rather than peak years, talent ratings will not be affected. That is, ctorg's 'smoothing' adjustment does not change the way talent ratings are calculated: for a player like Ruth, his 60 HRs in 1927 would be adjusted down to 46, while his 4 HRs in 1915 would be adjusted up to 18. In other words, Ruth's career totals and averages in ctorg's DB would be the same as Ruth's actual career totals and averages.

- The only negative with ctorg's is that it's not done yet
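
For anyone who wants to check the smoothing arithmetic, here is a rough Python sketch. It assumes ctorg's adjustment is simply (season + career avg) / 2, applied per season with a plain per-season career average; that is my reading of the proposal, not a confirmed description of how his DB is built. The season-by-season HR figures are Ruth's standard record (714 total), not numbers taken from the tables in this post.

```python
# Ruth's home runs by season, 1914-1935, from the standard record
# (not from the tables in this post); they total 714.
RUTH_HR = [0, 4, 3, 2, 11, 29, 54, 59, 35, 41, 46,
           25, 47, 60, 54, 46, 49, 46, 41, 34, 22, 6]


def smooth(seasons):
    """Assumed reading of ctorg's adjustment: (season + career avg) / 2."""
    career_avg = sum(seasons) / len(seasons)
    return [(s + career_avg) / 2 for s in seasons]


if __name__ == "__main__":
    adjusted = smooth(RUTH_HR)
    print("1927:", round(adjusted[13]))  # index 13 is the 60 HR season
    print("1915:", round(adjusted[1]))   # index 1 is the 4 HR season
    print("career total:", round(sum(adjusted)), "vs", sum(RUTH_HR))
```

Every season moves halfway toward the career average, so the amounts added to the weak years exactly offset the amounts taken from the big years, and the career total is unchanged. If the real DB stores rounded whole numbers, the total could drift by a homer or two.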