#1 |
|
All Star Reserve
Join Date: Feb 2002
Location: Singapore
Posts: 603
|
Extra! Extra! Read all about it!
My latest DB, based on Lahman 5.2, has now been released. You can download it here: http://www.baseballmaelstrom.com/ankit/AnkitDB52.zip

A lot of things have been changed, including players debuting when they originally debuted. The changes are listed below.

Changes I made to Lahman DB 5.2 to create AnkitDB 5.2:
======================================================

There are a lot of changes listed below, but keep in mind that when you do historical sims, the results will be accurate, with one difference: the ERAs of pitchers in the Dead Ball era will be slightly higher than they were in real life, but not by much. Everything else will be completely in line, including stolen bases.

Instructions for using this DB are in a text file called "How To Use This.txt", included in the zip file. Enjoy!

Note: Fielding Range, Arm, and PB ratings in OOTP are assigned randomly. They have nothing to do with the DB, either this one or the original Lahman DB.

Also, if you need additional help in figuring out how to do historical sims accurately with this DB, let me know. If you have any comments, suggestions, or criticisms, I can be reached at ankitpayal@hotmail.com

Ankit Desai
01/23/2005

MINOR CHANGES
-------------
01. Removed entries for those who never played.
02. Added debut dates for players missing them.
03. Removed players who did not play a position and just had a few pinch-hit appearances.
04. Removed all batters who debuted prior to 2003 and had fewer than 100 career at-bats.
05. Removed all pitchers who debuted prior to 2003 and had fewer than 33.1 career IP.
06. Removed all players who never played in the NL or AL, post-1900.
07. Removed all players who debuted in 2003 or 2004 and had fewer than 40 career at-bats.
08. Removed all pitchers who debuted in 2003 or 2004 and had fewer than 10 career IP.
09. Removed all non-pitcher fielding entries for pitchers, provided they played fewer than 60 games total at other positions.
10. Removed pitching-related entries for position players from both the pitching and fielding files, if they pitched fewer than 300 innings total. Eight players were still left; see Change #14 for more.

MAJOR CHANGES
-------------
11. Added strikeouts for the following players: anyone who played in the AL from 1901 to 1912 and anyone who played in the NL from 1897 to 1909. Previously, these players had 0 for strikeouts because the data was unavailable. I analyzed strikeouts by batters from 1913 to 1920 and have made it so that good hitters strike out less often while poor hitters strike out a lot more often.
12. Deleted all stats prior to 1895 and also all Federal League stats. The reason for this is that the level of competition in the Federal League was not up to par with the NL/AL, and if anything, stats from that league need to be discounted. Prior to 1894, there were a lot of rules that were vastly different from the modern game, for example, 50 feet between the mound and the plate, batters calling for where the pitch should be thrown, etc.
13. Caught Stealing data for players from 1895 to 1950 was available for some players but not for most. I recalculated and added CS numbers based on available data from 1951 - 1992. This will lead to better assignment of speed and stealing ability ratings.
14. Separated the careers of eight players who pitched and were also position players for a substantial amount of time. A list of those 8 players is at the end of this file. The reason for the separation is that OOTP 6 does not assign talent ratings on the hitting side if you are a pitcher, and vice versa. As a result, if Babe Ruth is imported in 1914, 1915, 1916, or 1917, he will import as a Pitcher and will have talent ratings for his pitching skills. However, his batting talents will be what is assigned to all pitchers: 1 for everything, except avoiding Ks, which is 105. Obviously this problem had to be solved somehow.
I know there are some who don't like this solution, but this is the only way OOTP 6 will make this situation work.
15. Standardized batting stats to a 790-run environment using a method devised by Bill James. He uses this in "The New Bill James Historical Abstract" to understand a player's batting ability in a more normal context. For example, suppose a player hits .238 with 10 HRs and 57 RBI while playing in a pitcher's park and in a league where runs are hard to come by. Is this player any better than another player who hits .265 with 12 HRs and 72 RBI? The player in question is Willie Davis in 1965, when runs were at a premium and he played in pitcher-friendly Dodger Stadium. The .265/12/72 line is Willie's stats standardized as if he were playing in an average park where the league as a whole would score about 750 runs per season. I moved this to 790 to more closely reflect the scoring rate over the past 10 years and also because OOTP 6's engine understands this better. The reason I chose to do this is that OOTP's engine does not compensate for stats from different eras and, as a result, will assign ratings that are somewhat misleading. If you want to read more about it, see page 740 (Willie Davis comment) in James' book. If you don't understand why I did this or what effect this has on the game, then feel free to e-mail me or post questions to me in the OOTP forums. I did this to everyone, including pitchers.
16. The fielding information for all players is now based on career averages. If a player played at least 1% of his career games at a position, he will be rated for that position.
17. The FieldingOF.csv file now uses career averages for games played at an OF position.
18. Batting stats for pitchers are now based on career totals, per 500 at-bats. All seasons will have these stats, as OOTP 6 does not model pitchers' hitting abilities, so different numbers are useless.
19. Batting stats for position players are now based on career totals, per 500 at-bats.
All seasons, except the rookie season, will have these stats. For their rookie season, they keep their original stats if they had over 500 at-bats. If they had fewer, then the numbers are 75% of their per-500 at-bats ratios. For example:
    Barry Bonds' career averages per 500 at-bats:
        159 H, 33 2B, 4 3B, 41 HR, 30 SB, 8 CS, 134 BB, 76 K, .318 Avg
    His rookie season, when he had fewer than 500 ABs:
        119 H, 25 2B, 3 3B, 31 HR, 30 SB, 8 CS, 101 BB, 95 K, .238 Avg
    His H, 2B, 3B, HR, and BB are 75% of his career average. His SB and CS remain the same. His K's are 125% of his career average.
20. Pitching stats are now based on a pitcher's performance relative to the league and the year, taking their park factor into account. They were then totalled for their career and then based on a standard universe in which pitchers struck out 6.5 per 9 IP, walked 3.5 per 9 IP, etc. These ratios are the MLB ratios for the past 10 years and thus what OOTP's engine is best designed for. In short, I did something similar for the pitchers to what I did for the batters, which is explained above in #15.
21. Batting and pitching stats are now a player's career average except for their rookie season, in which case it is this:
    For a pitcher: if he pitched at least 150 IP, he retains those stats; otherwise they are 15% worse than his career average.
    For a position player: if he had at least 400 AB, he keeps his stats; otherwise they are 15% worse than his career average.
    This allows for player development yet at the same time limits some of the problems inherent in OOTP's development model (young players not developing their power or contact rating fast enough).
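The rookie-season rule from change #19 can be sketched in a few lines of Python. This is only an illustration, not the process actually used to build the DB: the function and variable names are made up, the 75% / 100% / 125% multipliers are read off the Barry Bonds example (including HR, which the example numbers scale even though the sentence does not list it), and rounding halves upward is an assumption chosen because it reproduces the 101 BB in the example.

```python
import math

# Multipliers inferred from the Barry Bonds example in change #19.
# HR is included at 75% because 41 -> 31 in the example (assumption).
ROOKIE_FACTORS = {
    "H": 0.75, "2B": 0.75, "3B": 0.75, "HR": 0.75, "BB": 0.75,
    "SB": 1.0, "CS": 1.0,   # stolen bases and caught stealing are unchanged
    "K": 1.25,              # strikeouts go up by 25%
}


def round_half_up(x):
    """Round to the nearest integer, with .5 going up (assumed rounding rule)."""
    return math.floor(x + 0.5)


def rookie_line(career_per_500):
    """Build a rookie-season line from career per-500-AB averages.

    Intended for the case where the player had fewer than 500 AB as a rookie.
    """
    line = {
        stat: round_half_up(value * ROOKIE_FACTORS[stat])
        for stat, value in career_per_500.items()
    }
    # The line is expressed per 500 at-bats, so the average is H / 500.
    line["AVG"] = round(line["H"] / 500, 3)
    return line


bonds_career = {"H": 159, "2B": 33, "3B": 4, "HR": 41,
                "SB": 30, "CS": 8, "BB": 134, "K": 76}
bonds_rookie = rookie_line(bonds_career)
print(bonds_rookie)
```

Run against the career line quoted above, this should give back the rookie line from the example (119 H, 25 2B, 3 3B, 31 HR, 101 BB, 95 K, .238).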
=======================================================================
The duplicate players:

playerID   Pos  G    Name              Career
-----------------------------------------------------------------------
bressru01  OF        Rube Bressler     1919,1921-1932
bressra01  P    107  Ray Bressler      1914-1920
callani01  OF        Nixey Callahan    1897,1902-1913
callaja01  P    195  James Callahan    1897-1902
coonejo01  OF        Johnny Cooney     1935-1944
coonejo02  P    159  John Cooney       1921-1930
jeffcha01  OF        Hal Jeffcoat      1948-1953
jeffcha02  P    245  Harold Jeffcoat   1954-1959
russere01  P         Reb Russell       1913-1919
russeew01  OF   138  Ewell Russell     1922-1923
ruthba01   OF        Babe Ruth         1918-1935
ruthge01   P    163  George Ruth       1914-1919
woodjo02   OF        Joe Wood          1918-1922
woodsm01   P    225  Smokey Joe Wood   1908-1917
yeagejo01  3B        Joe Yeager        1899-1908
yeagejo02  P    94   Joseph Yeager     1898-1902
__________________
|
|
|
|
|
#3 |
|
All Star Reserve
Join Date: Apr 2004
Posts: 706
|
Very Interesting. Good job and thanks for creating this!
Although I don't usually do historical replays, I think I will give this a try sometime. Before I ask a few questions about point #15, I have to admit I don't own "The New Bill James Historical Abstract" (I should definitely cough up the money for a hardcover version), so I don't know the content of the book well. Please don't flame me for asking the questions here without reading the book first. Anyway, the questions are the following: Is the method meant to neutralize the effect of park factor? If so, can you explain how the method manages to achieve this? Also, is the method meant to normalize the scoring environment between different years? So, by using this method, the league average scoring will always be (after adjustment) 790 runs per season, right? Thanks in advance for answering my questions.
|
|
|
|
#4 |
|
All Star Reserve
Join Date: Apr 2002
Posts: 595
|
Thanks Ankit! Your outstanding work greatly benefits this community.
|
|
|
|
|
#5 |
|
Hall Of Famer
Join Date: Sep 2003
Location: Los Angeles
Posts: 3,417
|
thanks Ankit
Do we still use the career average DB that you made with your new AnkitDB 5.2 when importing rookies? And what years does AnkitDB 5.2 cover? Thanks.
Last edited by jbmagic; 01-24-2005 at 02:38 AM.
|
|
|
|
#6 | |
|
All Star Reserve
Join Date: Feb 2002
Location: Singapore
Posts: 603
|
Quote:
__________________
|
|
|
|
|
|
#7 | |
|
All Star Reserve
Join Date: Feb 2002
Location: Singapore
Posts: 603
|
Quote:
In the Willie Davis example, the park effects are neutralized this way: 790 runs is the target league scoring average. Willie Davis in 1965 created 49.6 runs, where Runs Created = ( (H + BB) * TB ) / ( AB + BB ). The Dodgers in 1965 played in a league where the average was 4.03 runs per team per game. The park factor for the Dodgers was 0.90 (0.76 at home). Thus 162 games times 4.03 runs per game times 0.90 (park factor) = 588 runs. That is the Dodgers' run scoring environment. Since Davis created about 50 runs in that environment, we find out how many he would have created in a 790-run environment this way: 790 / 588 * 49.6 = about 66.7, or 67 after rounding. Now we go back to his Runs Created formula and solve for 67 RC while representing his BB, TB, and AB in terms of H, i.e. BB = 14, or in H terms: 14/133 = .105H. Solve the whole thing algebraically for each player, every team, and every season, and it's done. If you need a more complete example, let me know, and pick a player and the season yourself so I can explain this using another player.
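For anyone who wants to check the arithmetic, here is a small Python sketch of the environment-scaling step described above. It is only an illustration under stated assumptions: the function names are invented for this example, the Runs Created function uses the basic form (H + BB) * TB / (AB + BB), and the 162-game season and 790-run target are taken from the post. It stops at the target Runs Created figure and does not attempt the final algebraic step of converting that back into hits, walks, and total bases.

```python
def runs_created(h, bb, tb, ab):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


def run_environment(league_runs_per_game, park_factor, games=162):
    """Runs per team-season in a given league and park."""
    return games * league_runs_per_game * park_factor


def scale_runs_created(rc, environment, target_environment=790):
    """Translate runs created from one run environment to another."""
    return target_environment / environment * rc


# Numbers quoted in the post for Willie Davis and the 1965 Dodgers.
dodgers_1965 = run_environment(league_runs_per_game=4.03, park_factor=0.90)
davis_target_rc = scale_runs_created(rc=49.6, environment=dodgers_1965)

print(round(dodgers_1965))     # run environment, rounded to whole runs
print(round(davis_target_rc))  # runs created in a 790-run environment
```

The sketch keeps the run environment unrounded until the end, so the intermediate value can differ slightly from the hand calculation in the post, but it rounds to the same 588 and 67.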
__________________
|
|
|
|
|
|
#8 | |
|
Minors (Rookie Ball)
Join Date: Jan 2005
Posts: 35
|
Quote:
Couple of questions:
1) I'm doing a one-year historical sim with some friends, and we drafted 20 teams of historical players from the beginning of baseball to 2004. We chose to use the 3rd best season based on the salaries assigned to the players at What If Sports. I've been using the Lahman 5.2 and importing each player manually by the corresponding season. I'm a newbie to OOTP, so I don't know exactly how everything is generated in terms of ratings of players from different eras, and after reading point 15 I was a bit discouraged. Do you see any major problem with the results of our sim based on the current method? Should we switch over to your database?
2) How does OOTP generate fielding ratings? During the importing of the players, I've seen some pretty low defensive ratings for highly regarded defensive players. Will your 5.2 version be more accurate?
Last edited by Metsui; 01-24-2005 at 10:43 AM.
|
|
|
|
|
#9 |
|
Minors (Triple A)
Join Date: Sep 2004
Posts: 209
|
Ankit -
What do you suggest for league totals and era settings? Does recalc. for hist. accuracy work in your opinion? Thanks
|
|
|
|
#10 |
|
All Star Starter
Join Date: Sep 2003
Posts: 1,571
|
And the only Database in OOTP worth playing just keeps getting better! Great work as always!!!
|
|
|
|
|
#11 | |
|
All Star Reserve
Join Date: Feb 2002
Location: Singapore
Posts: 603
|
Quote:
2. Arm Strength and Range are generated randomly. They are not based on any DB at all. The same goes for PB; I think they default to 4 for all catchers. Fielding percentage is imported from the DB and uses the player's percentage for the year he was imported in. My DB uses a career average, so it won't matter which year you import a player in. The same goes for the positions a player can play. For example, if Jimmie Foxx is imported in his rookie season using the Lahman DB, he will import as a Catcher only. If you import him using my DB, he will be a First Baseman with the ability to play C.
__________________
|
|
|
|
|
|
#12 | |
|
All Star Reserve
Join Date: Feb 2002
Location: Singapore
Posts: 603
|
Quote:
__________________
|
|
|
|
|
|
#13 | |
|
Hall Of Famer
Join Date: Sep 2003
Location: Los Angeles
Posts: 3,417
|
Quote:
Do we use AnkitDB 5.2 when we import rookies after the season is over? And I thought the DB covers 1901-2004?
Last edited by jbmagic; 01-24-2005 at 01:22 PM.
|
|
|
|
|
#14 |
|
All Star Reserve
Join Date: Feb 2002
Location: Singapore
Posts: 603
|
Yes, use AnkitDB 5.2 for importing rookies. The DB has stats for players from 1901 to 2004, but don't create a historical league if you want to sim 2004; use one of the 2004 leagues made by others.
__________________
|
|
|
|
|
#15 |
|
Hall Of Famer
Join Date: Mar 2002
Posts: 3,765
|
Ankit...glad to see you back!
You should have PMed me; I would have helped with a readme on fielding range, arm, and passed balls. As for recalc for historical accuracy...I disagree...I think it works very well. A little editing of your DB in regard to players' ratings really helps with that...the talent is great for their career averages, but some players come in better and then fade, while others come in a little lower and rise to their career averages. That, in combination with a little OOTP randomness, lets any player have a career year or a slump year.
__________________
"I am at that stage of my life where I keep myself out of arguments. I am 100% self sufficient spiritually, emotionally & financially. Even if you say 1+1=5, you are ABSOLUTELY CORRECT. Enjoy!" |
|
|
|
|
#16 |
|
Hall Of Famer
Join Date: Feb 2002
Location: Effingham, IL
Posts: 5,725
|
Ankit,
You mention in the readme that it is OK to use, say, SoxMan's historical stadiums, including the dat files. I'm just wondering if you are talking about the dat files that make everything 100 or the files that are actually varied.
|
|
|
|
#17 | |
|
All Star Reserve
Join Date: Feb 2002
Location: Singapore
Posts: 603
|
Quote:
__________________
|
|
|
|
|
|
#18 | |
|
All Star Reserve
Join Date: Feb 2002
Location: Singapore
Posts: 603
|
Quote:
__________________
|
|
|
|
|
|
#19 |
|
Hall Of Famer
Join Date: Sep 2003
Location: Los Angeles
Posts: 3,417
|
Ankit and others
When using the DB, be sure to go to League Setup, Misc, and say Yes to "allow cpu teams to sign and release players". It's set to No by default when creating a historical league with the Ankit DB or Lahman DB. If you don't, the sim will stop a lot and you won't be able to continue until you apply the fix for the CPU teams. Also be sure you go to League Settings, Basic Setup, and change League Mode from Replay to Career if you want to continue to the next year.
Last edited by jbmagic; 01-24-2005 at 08:29 PM.
|
|
|
|
#20 |
|
All Star Reserve
Join Date: Apr 2004
Posts: 706
|
Ankit,
Thanks for the explanation. I think I got it. Although I couldn't help but wonder: couldn't you use a more accurate player run estimator (XR, or Extrapolated Runs, comes to mind) rather than basic RC (basic RC is more for team run estimation)? Well, I guess those minor details are not that important here.