Old 06-10-2009, 09:39 PM   #1
swampdragon
Hall Of Famer
 
Join Date: May 2002
Location: The Lonely Mountain
Posts: 2,509
DIPS, Defense, and 1974: A Case Study

Taking this outside as suggested by Steve P. I think the issues pstrickert and others are discussing are important. Let me theorize for a moment and see whether there's general agreement.

1. OOTP is very good at modeling the attributes it is designed to model, provided the correct totals modifiers are being used. Garlon has convinced me of that multiple times.
2. Nonetheless, OOTP is not very good at recreating the dynamics of a pennant race replay. If accuracy means the results will be close to real life, it's not terribly accurate, at least as far as runs allowed are concerned.
3. The key stat driving that differential is hits allowed, both for individual pitchers and for teams as a whole. DIPs takes that stat away from the individual pitchers.
4. The available stats in Lahman are limited as far as defense is concerned.
5. Unless we're going to pretend that the original results were mostly luck and that there's no problem, the task for historical OOTP has to be improving the importing or modifying of defensive ability. Everyone with me so far?
__________________
“Of all tyrannies, a tyranny sincerely exercised for the good of its victims may be the most oppressive. It would be better to live under robber barons than under omnipotent moral busybodies.” -- C.S. Lewis
Old 06-11-2009, 07:10 AM   #2
old timer
Hall Of Famer
 
Join Date: May 2002
Posts: 2,278
Quote:
Originally Posted by swampdragon View Post
4. The available stats in Lahman are limited as far as defense is concerned.
5. Unless we're going to pretend that the original results were mostly luck and that there's no problem, the task for historical OOTP has to be improving the importing or modifying of defensive ability. Everyone with me so far?
I just did some tests where I tweaked the A's and Dodgers' defensive ratings for a few of their players that I thought were off, and both teams' BABIP and overall performances were similar to real life. This happened in multiple tests. Without these adjustments, both teams' BABIPs were much higher than in real life and both generally performed far worse too.

In other words, defensive ratings do seem to be a big problem, but how to fix? Does anyone aside from Markus even know how the ratings are calculated upon importing?

If we had direct access to the game database, we could make a utility that could modify the ratings based on our own algorithm. I don't think he's going to give us such access, however. So short of asking Markus to improve his algorithm and hoping he looks at it, is there anything practical that we can do that would help him improve the defensive ratings?
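For anyone who wants to repeat this kind of comparison, here is a minimal sketch of the BABIP calculation applied to the totals a pitching staff allowed. The formula is the standard one, (H - HR) / (AB - K - HR + SF); the function name and the sample totals are made up for illustration and are not real 1974 figures.

```python
def babip_allowed(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play against a pitcher or a team."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Made-up totals, for illustration only (not real 1974 numbers).
real = babip_allowed(hits=1300, home_runs=100, at_bats=5400,
                     strikeouts=800, sac_flies=40)
sim = babip_allowed(hits=1400, home_runs=100, at_bats=5450,
                    strikeouts=800, sac_flies=40)
print(f"real {real:.3f}  sim {sim:.3f}  gap {sim - real:+.3f}")
```

Sacrifice flies against are missing for many older seasons, so leaving `sac_flies` at 0 is a reasonable fallback as long as the real and simulated figures are computed the same way.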

Last edited by old timer; 06-11-2009 at 07:12 AM.
Old 06-11-2009, 08:48 AM   #3
swampdragon
Hall Of Famer
 
Join Date: May 2002
Location: The Lonely Mountain
Posts: 2,509
Quote:
Originally Posted by old timer View Post
I just did some tests where I tweaked the A's and Dodgers' defensive ratings for a few of their players that I thought were off, and both teams' BABIP and overall performances were similar to real life. This happened in multiple tests. Without these adjustments, both teams' BABIPs were much higher than in real life and both generally performed far worse too.

In other words, defensive ratings do seem to be a big problem, but how to fix? Does anyone aside from Markus even know how the ratings are calculated upon importing?

If we had direct access to the game database, we could make a utility that could modify the ratings based on our own algorithm. I don't think he's going to give us such access, however. So short of asking Markus to improve his algorithm and hoping he looks at it, is there anything practical that we can do that would help him improve the defensive ratings?
First things first. My suggestion would be to see if an accurate "season disk" can be created for 1974 that would touch nothing but defensive ratings, and would do that objectively, using the metrics on baseball reference, or fielding win shares, or other available advanced stats. If it could be done for that season, then the methods could be generally applied to other seasons for which those advanced stats were available.

It's also possible that the way to go, which would be doable within the Lahman database, is to develop a "team defense" metric, within which all of the defensive ratings would be modified by a set percentage according to team runs allowed. Or, if you've read the Bill James book on Win Shares, you'll remember his efforts to break those down between pitching and fielding. He has multiple formulae that he used to do that. Rather than reinventing the wheel, I suspect we could find a place on the net that broke those down for every team in baseball history. We could correlate those with the existing defensive ratings. I'm open to suggestions from interested parties. As I said, the first test would be to see whether we could use objective criteria to get 1974 right.
Old 06-11-2009, 10:24 AM   #4
knockahoma
All Star Reserve
 
Join Date: Dec 2002
Posts: 792
Putting on my future cap, I suspect big changes in attitude toward BABIP are coming.

TOT/YR is a stat that works poorly in conjunction with BABIP. Bill James, in a 2006 article, admitted scratching his head over defensive stats that had strange variance. We're "missing" something, he said.

If you examine the TOT/YR, you'll see strange fluctuations that leave only a few inferences regarding the cause of those wide and sudden variances in good fielders:

1. Injury
2. Chance

Or the 3rd inference, which challenges current BABIP philosophy-- the pitchers are exerting much more influence than currently believed on balls in play.

Last edited by knockahoma; 06-11-2009 at 10:39 AM.
Old 06-11-2009, 10:27 AM   #5
knockahoma
All Star Reserve
 
Join Date: Dec 2002
Posts: 792
Quote:
First things first. My suggestion would be to see if an accurate "season disk" can be created for 1974 that would touch nothing but defensive ratings, and would do that objectively, using the metrics on baseball reference, or fielding win shares, or other available advanced stats.
I'd love to see a "season disk". In fact, I re-edit 74 all the time for that. I'd suggest using the views of actual scouts and coaches as part of the equation, too.

STRAT-O-MATIC has had an excellent rep with professional baseball players over the decades. They dig into stats, but temper that with scouts, or TV commentators on their payroll.

I think that's important. Bill James writes about the shadow of the monster, how much is missing from the fielding math that we currently have. He says what's missing is important.

In other words, Math without Eyes may be as bad as Eyes without Math.

Last edited by knockahoma; 06-11-2009 at 10:31 AM.
Old 06-11-2009, 11:58 AM   #6
swampdragon
Hall Of Famer
 
Join Date: May 2002
Location: The Lonely Mountain
Posts: 2,509
Quote:
Originally Posted by knockahoma View Post
I'd love to see a "season disk". In fact, I re-edit 74 all the time for that. I'd suggest using the views of actual scouts and coaches as part of the equation, too.

STRAT-O-MATIC has had an excellent rep with professional baseball players over the decades. They dig into stats, but temper that with scouts, or TV commentators on their payroll.

I think that's important. Bill James writes about the shadow of the monster, how much is missing from the fielding math that we currently have. He says what's missing is important.

In other words, Math without Eyes may be as bad as Eyes without Math.
We don't have eyes for many of these seasons, and a season disk for 1974 would only be valuable beyond that season if it was based on math so that it could have a wider application. Markus is committed to DIPs, and it seems to have majority support in this community. If we're going to improve the OOTP experience for historical play, it's going to have to be within that framework. That's a high-level decision above our pay grade.

Individualized season quickstarts might be fun, and they'd probably be easier to do than what I have in mind, but you'd lose the career play. Still, I can see the advantages in the approach. Do you have a quickstart that works for 1974?
Old 06-11-2009, 12:07 PM   #7
pstrickert
Hall Of Famer
 
Join Date: Dec 2005
Posts: 15,726
Markus said (as of today) that he'll work on the fielding problem for Patch #2. It would definitely help if we had some specific, detailed recommendations for him.
Old 06-11-2009, 01:35 PM   #8
swampdragon
Hall Of Famer
 
Join Date: May 2002
Location: The Lonely Mountain
Posts: 2,509
Quote:
Originally Posted by pstrickert View Post
Markus said (as of today) that he'll work on the fielding problem for Patch #2. It would definitely help if we had some specific, detailed recommendations for him.
The easiest thing to do (not that it's all that easy) would be a team defense concept that adjusted (or recalced) the individual defenders by whatever percentage was necessary to get a team's BABIP to what it should be. Obviously you'd have to adjust for park, and possibly some other things as well. But you wouldn't need a new database.
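The team defense idea can be sketched as a one-parameter search: multiply every fielder's rating by the same factor and keep adjusting that factor until the estimated team BABIP reaches the target. Everything below is an assumption made for illustration: the 1 to 200 rating bounds, the function names, and above all `toy_babip`, which is a stand-in linear model and not OOTP's engine. In practice the estimate would have to come from repeated simulated seasons, and the target should be park-adjusted before it is passed in.

```python
from typing import Callable, Dict

Ratings = Dict[str, float]


def scale_ratings(ratings: Ratings, factor: float,
                  lo: float = 1.0, hi: float = 200.0) -> Ratings:
    """Multiply every fielder's rating by one factor, clamped to [lo, hi]."""
    return {name: min(hi, max(lo, r * factor)) for name, r in ratings.items()}


def fit_team_factor(ratings: Ratings, target_babip: float,
                    estimate_babip: Callable[[Ratings], float],
                    low: float = 0.5, high: float = 1.5,
                    tol: float = 1e-4, max_iter: int = 60) -> float:
    """Bisection on the factor; assumes BABIP falls as ratings rise."""
    for _ in range(max_iter):
        mid = (low + high) / 2
        babip = estimate_babip(scale_ratings(ratings, mid))
        if abs(babip - target_babip) < tol:
            return mid
        if babip > target_babip:
            low = mid    # defense still too weak: try a larger factor
        else:
            high = mid   # defense too strong: try a smaller factor
    return (low + high) / 2


def toy_babip(ratings: Ratings) -> float:
    """Hypothetical stand-in for the game engine (NOT OOTP's formula)."""
    avg = sum(ratings.values()) / len(ratings)
    return 0.300 - 0.0004 * (avg - 100.0)


ratings = {"SS": 100.0, "2B": 90.0, "CF": 110.0}
factor = fit_team_factor(ratings, target_babip=0.290, estimate_babip=toy_babip)
print(round(factor, 3), scale_ratings(ratings, factor))
```

A single shared factor keeps the relative ordering of the fielders intact, which is the appeal of the approach; it cannot fix a case where one specific player is badly misrated.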
Old 06-11-2009, 01:45 PM   #9
magnet
Hall Of Famer
 
Join Date: Jun 2003
Posts: 5,029
Quote:
Originally Posted by swampdragon View Post
The easiest thing to do (not that it's all that easy) would be a team defense concept that adjusted (or recalced) the individual defenders by whatever percentage was necessary to get a team's BABIP to what it should be. Obviously you'd have to adjust for park, and possibly some other things as well. But you wouldn't need a new database.
I guess my follow-up to this would be: will this system only work on leagues that import players to their real-life teams? If the team is only half of who was really there in 1974, wouldn't the real-life team BABIP be essentially useless?
Old 06-11-2009, 02:12 PM   #10
RonCo
Hall Of Famer
 
Join Date: Aug 2003
Posts: 9,499
To support historicals the way many want to play them, OOTP really needs a season-by-season set of rosters that sets defensive ratings. Someone could do that in their "spare time" by loading up the game, then doing a roster export/import (I think defense can be adjusted that way, anyway ... it works in v9, so I assume it works in X). Then save the game as a quickstart and post that.

It's a lot of effort, but could be worth it to the community if a few folks were to undertake it.
Old 06-11-2009, 03:07 PM   #11
swampdragon
Hall Of Famer
 
Join Date: May 2002
Location: The Lonely Mountain
Posts: 2,509
Quote:
Originally Posted by magnet View Post
I guess my follow-up to this would be: will this system only work on leagues that import players to their real-life teams? If the team is only half of who was really there in 1974, wouldn't the real-life team BABIP be essentially useless?
That would be correct.
Old 06-11-2009, 03:36 PM   #12
swampdragon
Hall Of Famer
 
Join Date: May 2002
Location: The Lonely Mountain
Posts: 2,509
Quote:
Originally Posted by swampdragon View Post
That would be correct.
Which means that the easiest way to do this probably doesn't work for the majority of players. Which gets us back to needing better defensive imports and the limitations of working within the Lahman database. I'm getting discouraged.
Old 06-11-2009, 03:56 PM   #13
magnet
Hall Of Famer
 
Join Date: Jun 2003
Posts: 5,029
Quote:
Originally Posted by swampdragon View Post
Which means that the easiest way to do this probably doesn't work for the majority of players. Which gets us back to needing better defensive imports and the limitations of working within the Lahman database. I'm getting discouraged.
I hope I didn't discourage anything; if this project works it would be a great addition, and make a lot of players' experience that much more enjoyable.
Old 06-11-2009, 04:15 PM   #14
thehef
Hall Of Famer
 
Join Date: Jun 2006
Posts: 4,838
Hey Oldtimer, I'm curious as to which '74 Dodgers players you tweaked. I'm guessing Russell and Garvey were two... Also, did you find that you needed to do anything to the bullpen since Marshall was used far more often (and for more innings) than any AI would be likely to use him?
Old 06-11-2009, 06:13 PM   #15
StyxNCa
Hall Of Famer
 
Join Date: Dec 2004
Location: Victoria, Texas
Posts: 3,136
Quote:
Originally Posted by RonCo View Post
To support historicals the way many want to play them, OOTP really needs a season-by-season set of rosters that sets defensive ratings. Someone could do that in their "spare time" by loading up the game, then doing a roster export/import (I think defense can be adjusted that way, anyway ... it works in v9, so I assume it works in X). Then save the game as a quickstart and post that.

It's a lot of effort, but could be worth it to the community if a few folks were to undertake it.
The question is how to adjust them. I have asked for, but never seen, any kind of guide telling me how much adjustment is needed to get such and such a result, especially for errors. I wouldn't mind adjusting my league if I had some kind of guideline to use.
Old 06-11-2009, 06:24 PM   #16
old timer
Hall Of Famer
 
Join Date: May 2002
Posts: 2,278
It just occurred to me that a program could be made to experiment with defensive ratings. If you export the team rosters as a text file, the program could then read the Lahman database that comes with the game, come up with the new ratings and then modify the roster file for reimportation. That would remove the tedium of hand inputting the ratings.

Of course, the hard part would then be coming up with an algorithm for using the Lahman stats to come up with ratings that are consistently superior to what the game comes up with.
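The "easy part" described above might look something like the sketch below: read the exported roster, ask a pluggable rule for new ratings, and write a file ready for reimport. The layout of OOTP's roster export is not given in this thread, so the CSV format and the column names (`player_id`, `position`, `range`, `arm`) are purely hypothetical, as is the placeholder rule `bump_shortstops`. The rating algorithm itself, the hard part, is left as a function argument.

```python
import csv
import io
from typing import Callable, Dict, TextIO


def rewrite_roster(src: TextIO, dst: TextIO,
                   new_ratings: Callable[[Dict[str, str]], Dict[str, object]]) -> int:
    """Copy roster rows from src to dst, overwriting only the fields that
    new_ratings returns. Returns the number of rows actually changed."""
    reader = csv.DictReader(src)
    if reader.fieldnames is None:      # empty file: nothing to do
        return 0
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    changed = 0
    for row in reader:
        updates = new_ratings(row)
        unknown = set(updates) - set(reader.fieldnames)
        if unknown:
            raise KeyError(f"columns not in roster file: {sorted(unknown)}")
        if any(str(v) != row[k] for k, v in updates.items()):
            changed += 1
        row.update({k: str(v) for k, v in updates.items()})
        writer.writerow(row)
    return changed


def bump_shortstops(row: Dict[str, str]) -> Dict[str, object]:
    """Placeholder rule: +10 range for every shortstop. A real rule would
    look the player up in the Lahman fielding data instead."""
    if row["position"] == "SS":
        return {"range": int(row["range"]) + 10}
    return {}


# Hypothetical export contents; IDs and ratings are invented.
sample = ("player_id,position,range,arm\n"
          "player_a,SS,120,110\n"
          "player_b,2B,115,100\n")
out = io.StringIO()
n = rewrite_roster(io.StringIO(sample), out, bump_shortstops)
print(n, "row(s) changed")
print(out.getvalue())
```

Keeping the rating rule separate from the file handling means different algorithms can be tried against the same export without touching the import/export code.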
Old 06-11-2009, 06:28 PM   #17
old timer
Hall Of Famer
 
Join Date: May 2002
Posts: 2,278
I could write such a program, but probably not the algorithm for deriving the superior ratings. In other words, I could do the easy part.

If someone without programming skills can figure out the hard part, I could program it. Of course, if someone can program and has ideas on how to better rate the players, that would be even better.
Old 06-11-2009, 10:41 PM   #18
old timer
Hall Of Famer
 
Join Date: May 2002
Posts: 2,278
I'm still testing things out in game to get a feel for how changes to defensive ratings can impact a team and I thought I'd share the results.

I made changes (upward) to Campaneris' range and arm ratings and to Green's as well. In RL, Green didn't even play full time that year, but the AI uses him as the starter all year. Even changing just Green's or Campaneris' defensive ratings, but not both, made a noticeable improvement in the A's pitching outcomes.

Those were the only changes made in the whole league. The A's team pitching stats more closely resemble RL and the team consistently wins the division (all 6 times - small sample size, I know). Except for usage, the individual pitchers (on the A's) also performed closer to RL. However, I'm not suggesting these are the changes needed to make the A's play more like RL. I was merely interested in how "little" changes could affect certain outcomes.

I wonder how much better the game can do with rating players. Short of people doing the ratings manually (like roster sets in other games), does anyone believe the in-game ratings can be made much better?

Last edited by old timer; 06-12-2009 at 02:49 AM. Reason: GB% data was invalid due to small sample size.
Old 06-12-2009, 12:26 AM   #19
swampdragon
Hall Of Famer
 
swampdragon's Avatar
 
Join Date: May 2002
Location: The Lonely Mountain
Posts: 2,509
Quote:
Originally Posted by old timer View Post
I'm still testing things out in game to get a feel for how changes to defensive ratings can impact a team and I thought I'd share the results.

I made changes (upward) to Campaneris' range and arm ratings and to Green's as well. I noticed that Holtzman's HA were always better than RL and Hunter's were worse, so I switched their GB% (where do those numbers come from anyway?). Holtzman went from 51 to 47 and Hunter from 47 to 51. I figured since I improved the middle infield defensive ratings, this would lower Hunter's HA and raise Holtzman's HA and that's what has happened so far in 6 tests.

In RL, Green didn't even play full time that year, but the AI uses him as the starter all year. Even changing just Green's or Campaneris' defensive ratings, but not both, made a noticeable improvement in the A's pitching outcomes. Also, those small changes in GB% made more of an impact than I expected, but maybe it wouldn't have if I had run more tests. So MANY variables, each of which can have a significant impact on the outcome.

Those were the only 4 changes made in the whole league. The A's team pitching stats more closely resemble RL and the team consistently wins the division (all 6 times - small sample size, I know). Except for usage, the individual pitchers (on the A's) also performed closer to RL. However, I'm not suggesting these are the changes needed to make the A's play more like RL. I was merely interested in how "little" changes could affect certain outcomes.

I wonder how much better the game can do with rating players. Short of people doing the ratings manually (like roster sets in other games), does anyone believe the in-game ratings can be made much better?
Your experience does suggest that if they could be made better, the game would indeed come closer to replicating real life. So we would be tinkering with a real change rather than with a cosmetic one. It also suggests that relatively minor changes in a position or two might do the job. Now, if we only knew which ones...

I think we should give the upcoming patch a try, since Markus has said he will try to improve defensive imports. Then we can analyze the imports vs. what we know about the players.
Old 06-12-2009, 02:58 AM   #20
old timer
Hall Of Famer
 
Join Date: May 2002
Posts: 2,278
Just wanted to note that I edited my post above regarding GB%. The GB% didn't seem to matter at all (at least I couldn't see any effect) after running many more tests. I should know better than to post results from so few tests.

Nevertheless, the two defensive changes did make a big difference and I'm hoping that a few such adjustments on each team (if necessary) will improve the replay. I'm going to see how much improvement can be made in the '74 replay without touching hitting and pitching ratings, but will hopefully resist posting results before sufficient testing has been done.
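As a rough guide to what "sufficient testing" might mean, the sketch below treats each ball in play as an independent coin flip and asks how noisy a team's average BABIP is after a given number of replays. This binomial model is a simplifying assumption, and the 4,500 balls in play per season is a round illustrative number, not a 1974 figure. Both function names are invented for this example.

```python
import math


def babip_standard_error(babip: float, balls_in_play: int,
                         seasons: int = 1) -> float:
    """Standard error of the mean BABIP over `seasons` simulated replays,
    under a simple binomial model of balls in play."""
    return math.sqrt(babip * (1 - babip) / (balls_in_play * seasons))


def seasons_needed(babip: float, balls_in_play: int,
                   gap: float, z: float = 2.0) -> int:
    """Smallest number of replays for which z standard errors of the
    mean BABIP fit inside `gap`."""
    per_season_var = babip * (1 - babip) / balls_in_play
    return math.ceil(z * z * per_season_var / (gap * gap))


one = babip_standard_error(0.280, 4500)
six = babip_standard_error(0.280, 4500, seasons=6)
print(f"SE after 1 replay:  {one:.4f}")
print(f"SE after 6 replays: {six:.4f}")
print("replays to pin down a .005 gap:", seasons_needed(0.280, 4500, 0.005))
```

Under these assumptions a single replay is uncertain by several points of BABIP, so a rating tweak can look like a clear effect, or no effect at all, purely by chance. Comparing two rating setups against each other, rather than one setup against a known real-life figure, needs more replays still.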
