Pitcher usage in the pre-reliever era, part III

joefromchicago · 09-22-2020, 10:09 PM

REPLAY 1: THE 1977 EXPOS

This is the third installment in my continuing series on how OOTP uses pitchers in the pre-reliever era. If you haven't seen my previous threads, you can find them here and here. In this current iteration I used OOTP 21 - for the previous threads I used versions 19 and 20 respectively. In part, therefore, the intention here was to see if anything has changed in the latest version.

My initial idea was to look past the pre-reliever era and see how OOTP handles the years when baseball was transitioning from stoppers to closers. For those unfamiliar with how OOTP uses those terms, a "stopper" is a reliever who is brought into high-leverage situations late in the game, while a "closer" is a reliever who is specifically brought into a game, usually in the ninth inning, in a save situation. A stopper, therefore, might be brought into a game that is "on the line," even if it's not necessarily a save situation.

I used the game's "out of the box" settings. For 1977, this meant: (a) five-man rotations and five-man bullpens; (b) starting rotation mode is "start highest rested;" (c) starters not allowed in relief; (d) use of relievers set to "often;" (e) use of closers set to "sometimes;" (f) pitcher stamina set to "normal."

For this test, I chose the 1977 Expos. Coming out of spring training, manager Dick Williams said that his pitching staff consisted of Steve Rogers "and a bunch of other guys." He wasn't kidding. Rogers was clearly the standout on the staff. The rest ranged from mediocre to awful. I was particularly interested in one pitcher: Gerry Hannahs. In real life, Hannahs went straight from double-A to the big club in late 1976 and came out of spring training in 1977 having won a spot in the starting rotation. His OOTP ratings, however, are pretty bad, and the AI had Dan Schatzeder and Dan Warthen in the starting rotation instead. Schatzeder, in fact, didn't join the team until the rosters expanded in September while Warthen started the season in the bullpen.

I wanted to see if the game would use Hannahs as an emergency starter, which is the role he would fall into when it became apparent that he didn't "have the goods." OOTP doesn't do very well at finding a place for pitchers who straddle the line between the rotation and the bullpen. The game allows relievers to be assigned "emergency SP" as their primary or secondary roles, but I've never seen the AI actually do that, and the AI certainly didn't do it in my 1977 replay. And since the league setting prevented starters from appearing in relief, there was very little opportunity for a pitcher to cross over between starting and relieving.

Hannahs, in real life, had eight appearances, with seven coming in starts, and by late May he was back in the minors. In my replay, I had Hannahs in the starting rotation at the start of the season, and he was, predictably, not good. I then shifted him to the bullpen, hoping that, as a pitcher with a 60 stamina (20-80 scale), the AI might use him as an emergency starter. That didn't happen. I then designated his primary role as "emergency SP." That didn't help either. In all, Hannahs compiled three starts, with none coming after June 12. He had 29 appearances after that date, all in relief, even though, for a large part of that time, he was designated as an "emergency SP." I guess when much of your rotation is a disaster, "emergency" is sort of relative.

The other interesting pitcher on the Expos staff was Don Stanhouse. Some of you may remember him as a reliever with Baltimore in the late '70s, but he's another guy who, earlier in his career, switched back and forth between the starting rotation and the bullpen. In 1977, he started 16 games for the Expos and relieved in 31. His first 12 appearances were all starts, but by the end of May Williams had seen enough of Stanhouse and his 4.79 ERA and demoted him to the bullpen. He picked up a few starts in June and July but by August he was a full-time reliever and by September he was the team's closer. That's the sort of progression that I wanted to see if OOTP could replicate. It didn't. Stanhouse ended up being the number 2 starter all season long and didn't log a single relief appearance.

In large part, that's due to the fact that no one else had better ratings than he did. And that, in fact, turned out to be the second biggest takeaway from this replay: the ratings are, at times, bafflingly inexplicable. I can't argue with the game's basic pitching ratings - those all work together in some mysterious fashion to produce realistic results. But the stamina ratings and the "current role" designations, at times, make no apparent sense. In general, stamina ratings were way too low for pitchers who actually started games - and this is true not just for 1977 but for earlier years that I've looked at. Hannahs, for instance, has a 60 stamina rating with a current role as "strictly bullpen," while Stanhouse has a 35 stamina rating and a current role as "starter." That despite the fact that Hannahs was primarily a starter in 1977 and Stanhouse was, by the end of the season, exclusively a reliever.

In my experience, it appears that OOTP downgrades the stamina ratings of starters who don't pitch very many innings. I suppose the reasoning is that giving them higher stamina ratings would make them better than they actually were. To me, though, that doesn't make much sense. If they started in the big leagues, their managers must have believed that they had enough gas in the tank to last at least five or six innings and possibly more. Hal Dues, for instance, a pitcher who started four games for the Expos in 1977 as a September call-up, gets a 25 stamina rating in OOTP. But Dues wasn't pulled from games because he was tired, he was pulled because he was bad. In my replay, he barely had a cup of coffee, pitching only 3.1 innings, all in relief. There's really no reason to start Dues, but a guy like that should nevertheless be able to pitch into the late innings. In OOTP, therefore, Dues's low stamina results in him being a bad pitcher, when, in reality, it was his bad pitching that caused him to log so few innings. His stamina rating, in other words, doesn't reflect his stamina so much as it reflects his overall pitching ability. That, to my thinking, unfairly dings him twice: he gets bad pitching ratings and a low stamina for what is really just his bad pitching.

But, as I mentioned above, that is only the second biggest takeaway from this replay. The biggest is that, at least for 1977, the game does not determine who will start based on how rested the pitcher is the day before the game. That was a big revelation to me in my second set of replays with OOTP 20, as it became clear when playing the 1920 White Sox that a pitcher who may be only slightly tired today will not get the start tomorrow, even if he will be 100% rested. This meant that starters who could easily start with three days of rest would, in OOTP, still only pitch on four days' rest. That, in turn, meant that the top starters in the rotation would end up getting too few starts while starters at the bottom of the rotation would get too many.

In my 1977 replay, however, I found to my surprise that this rule didn't seem to apply. Although it would normally take Rogers, with an 80 stamina rating, four days of rest to recover from a start, he would actually pitch on the fifth day rather than being held back until the sixth. In other words, even if Rogers was slightly tired today, he would still get the start tomorrow when he would be 100% rested. That would be a big step forward for the game if the same thing applied to earlier eras, where starters commonly started on two or three days of rest but where OOTP consistently forces them to take an extra day or two of rest between starts. That merited a closer look ....

joefromchicago · 09-22-2020, 10:13 PM

REPLAY 2: THE 1912 SENATORS

For this replay, I chose the 1912 Washington Senators. Like the 1977 Expos, the Senators had a pitching staff composed of Walter Johnson and "a bunch of other guys." As was common in those days, Johnson was used both as the top starter and as a high-leverage reliever. In fact, Johnson's 37 starts were second only to teammate Bob Groom's 40 starts. That's because Johnson also made 13 appearances out of the bullpen. Together, then, Johnson and Groom accounted for 50% of the team's starts. The rest were parcelled out among nine other hurlers. In part, that churn among the bottom half of the staff was the result of front-office moves to bolster the staff, as it became clear during the season that the Nats had a shot at the league title. As it turned out, the club finished second, but they were still a distant 14 games behind the pennant-winning Red Sox.

The OOTP settings for 1912 are: (a) four-man rotations and three-man bullpens; (b) starting rotation mode is "strict order;" (c) starters allowed in relief; (d) use of relievers set to "very rarely;" (e) use of closers set to "very rarely;" (f) pitcher stamina set to "high." As I've explained in my previous threads, OOTP's rotation settings are unrealistic for the pre-reliever era. I use six-man rotations and, for 1912, I set the number of relievers to zero. I did, however, set the number of position players at 13 and the roster size at 20, which gives the game one "open" position that can be filled with either a position player or a pitcher. In my experience, when given this kind of choice, the AI almost always chooses an extra pitcher, and this replay proved no exception.

The rotation setting is also unrealistic. "Strict order" would give far too many starts to the bottom half of the staff. As noted above, two pitchers accounted for half of all the team's starts. I suppose Johnson and Groom would duplicate that result if the club had a four-man rotation that was set to a strict order, but those settings would yield grossly unrealistic totals for the rest of the staff.

As I mentioned, there was some turnover in personnel as the club sought to bolster the pitching staff down the pennant stretch. As a result, I decided to play with historical transactions on. There's a trade-off here. On one hand, using historical transactions should have helped distribute innings among the staff on a more historical basis. On the other hand, having historical transactions on meant that no games were cancelled or postponed because of weather and there were no random injuries. As I've mentioned in my previous threads, one reason why starters at the bottom of a team's staff would get the occasional start was because of the large number of double-headers that teams played in that era, and those double-headers, in turn, were largely the result of rainouts. That's a trade-off I was willing to make, given that my primary aim was to see how the game handled pitchers' rest.

My 1977 replay revealed that the game wasn't forcing starters to take an extra day of rest. That, however, was with a league stamina setting of "normal" and pitchers with high stamina needing four days of rest between starts. How would the game handle pitchers with high stamina who only needed three days of rest? The answer: the same as before. It quickly became apparent to me that OOTP was still making starting assignments the day before the game. Walter Johnson, an absolute beast who had an 80 stamina and who completed 32 of his 36 starts in my replay, recovered from starts more-or-less according to the following schedule:

Day 0 (day of game): 0%
Day 1: 15%
Day 2: 45%
Day 3: 70%
Day 4: 100%
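
The effect of judging rest the day before the game can be sketched in a few lines of Python. This is only a toy model built from the recovery schedule above: the function names, the 100% threshold, and the "check the previous day's rest level" rule are assumptions inferred from what the replay showed, not OOTP's actual logic.

```python
# Hypothetical model of the rest rule described above; not OOTP's real code.
# Recovery percentages by days since the last start (Day 0 = day of game).
RECOVERY = [0, 15, 45, 70, 100]


def rested_pct(days_since_start):
    """Rest level on a given day, capped at the last entry in the table."""
    return RECOVERY[min(days_since_start, len(RECOVERY) - 1)]


def next_start_day(decide_day_before, threshold=100):
    """First day (counted from the last start) on which the pitcher starts again.

    decide_day_before=True  -> eligibility is judged on the previous day's rest
    decide_day_before=False -> eligibility is judged on the game-day rest
    threshold is an assumed cutoff: the pitcher must be fully rested.
    """
    day = 1
    while True:
        check_day = day - 1 if decide_day_before else day
        if rested_pct(check_day) >= threshold:
            return day
        day += 1


game_day_rule = next_start_day(decide_day_before=False)
day_before_rule = next_start_day(decide_day_before=True)

# Days of rest = days between starts, not counting either game day.
print(game_day_rule - 1, "days of rest if judged on game day")
print(day_before_rule - 1, "days of rest if judged the day before")
```

Under these assumptions the game-day check lets the pitcher come back on Day 4 (three days of rest), while the day-before check pushes him to Day 5 (four days of rest), which is the extra day described in this post.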

The other regular starters had similarly high stamina ratings and recovered at about the same rate (the biggest difference was on the first day after a start). As can be seen, Johnson could easily start with three days of rest, even if he threw over 125 pitches in a game. That didn't happen. Instead, the game was still forcing the Washington starters to take an extra day of rest between starts, which yielded the same result that I had seen in my previous replays: the top half of the rotation started too few games and the bottom half started too many. Here is the breakdown showing the difference between the real life (RL) stats and the stats from my replay (sim), with pitchers listed in roughly the order they appeared in the rotation (* denotes that the player was on the team for only part of the season):

Walter Johnson
(RL) 50 G, 37 GS, 369.0 IP
(sim) 36 G, 36 GS, 324.2 IP

Bob Groom
(RL) 43 G, 40 GS, 316.0 IP
(sim) 34 G, 34 GS, 288.1 IP

Tom Hughes
(RL) 31 G, 26 GS, 196.0 IP
(sim) 33 G, 30 GS, 269.0 IP

Carl Cashion
(RL) 26 G, 17 GS, 170.1 IP
(sim) 31 G, 25 GS, 220.0 IP

Joe Engel*
(RL) 17 G, 10 GS, 75.0 IP
(sim) 10 G, 10 GS, 67.1 IP

Hippo Vaughn*
(RL) 12 G, 8 GS, 81.0 IP
(sim) 10 G, 9 GS, 83.0 IP

Dixie Walker*
(RL) 9 G, 8 GS, 60.0 IP
(sim) 8 G, 5 GS, 55.1 IP

Jerry Akers*
(RL) 5 G, 1 GS, 20.1 IP
(sim) 3 G, 3 GS, 14.2 IP

Barney Pelty*
(RL) 11 G, 4 GS, 43.2 IP
(sim) 13 G, 1 GS, 30.2 IP

Paul Musser*
(RL) 7 G, 2 GS, 20.2 IP
(sim) 5 G, 1 GS, 14.0 IP

I had the bench coach set the staff. Johnson, Groom, and Hughes were the top three starters from beginning to end. Cashion and Engel both fluctuated between the fourth and fifth spots, depending on whether someone better was around. For instance, when the team acquired Vaughn in June, Cashion and Engel both moved down a rung on the ladder to fifth and sixth respectively. So the fourth spot was basically Cashion-Vaughn-Engel, while the fifth spot was take-your-pick Cashion-Engel-Pelty-Walker-Musser.

There were several times during the season when I was down to a five-man or even four-man staff, but there was never an occasion when the game had to start someone who wasn't fully rested. Undoubtedly that wouldn't have been the case if I was playing with rainouts and injuries enabled, but, like I said, that was a conscious decision on my part. Still, the results were in line with my previous replays. The top two starters combined for seven fewer starts than they did in real life while the next three starters (Hughes, Cashion, Vaughn) gained thirteen. Cashion, who spent a large part of his time as the number 5 starter, was the biggest gainer with eight more starts.
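
For anyone who wants to check that arithmetic, here is a small Python sketch that recomputes the differences from the games-started figures listed above. The variable names are only illustrative.

```python
# Games started as (real life, replay), copied from the RL/sim list above.
starts = {
    "Johnson": (37, 36),
    "Groom": (40, 34),
    "Hughes": (26, 30),
    "Cashion": (17, 25),
    "Engel": (10, 10),
    "Vaughn": (8, 9),
    "Walker": (8, 5),
    "Akers": (1, 3),
    "Pelty": (4, 1),
    "Musser": (2, 1),
}

# Positive delta = more starts in the replay than in real life.
delta = {name: sim - rl for name, (rl, sim) in starts.items()}

top_two = delta["Johnson"] + delta["Groom"]
next_three = delta["Hughes"] + delta["Cashion"] + delta["Vaughn"]
biggest_gainer = max(delta, key=delta.get)

print("Top two starters:", top_two)
print("Hughes/Cashion/Vaughn:", next_three)
print("Biggest gainer:", biggest_gainer, delta[biggest_gainer])
```

The top two come out at -7, the next three at +13, and Cashion is the biggest gainer at +8, matching the figures quoted above.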

continued....

joefromchicago · 09-22-2020, 10:15 PM

1912 SENATORS REPLAY (continued)

How did this play out during the season? The following chart shows the starting assignments in my replay:

1 - Johnson
2 - Groom
3 - Hughes
4 - Cashion
5 - Vaughn
6 - any other starter
0 - off-day
(xx) - double-header

APRIL
- - - - 1 2 3
0 4 1 2 3 6 1
0 2 3 6 1 6 2
0 3 1

MAY
- - - 4 2 6 3
0 1 2 4 3 6 1
0 2 3 4 1 6 2
0 3 1 6 2 6 3
0 1 2 4 (36) 0

JUNE
- - - - - - 1
2 4 6 1 2 3 4
5 1 2 3 4 6 1
2 0 3 6 1 2 4
0 3 1 2 5 4 3
0

JULY
- 1 2 5 (34) 6 1
0 0 2 3 1 5 4
0 2 1 3 5 3 2
0 1 3 4 2 0 1
3 5 2 6

AUGUST
- - - - 1 3 5
2 4 1 3 5 2 4
1 0 3 2 5 1 4
0 2 3 1 5 4 2
0 1 3 4 2 6 1

SEPTEMBER
0 (34) 2 6 1 3 4
0 0 1 2 3 4 6
1 0 2 4 6 1 6
2 0 0 1 3 2 4
0 1

OCTOBER
- - 3 0 2 4 1

Focusing on Johnson's starts, there's only one instance where he started with three days of rest. That was between his second and third starts in April. I think that was because the team used a couple of starters in relief during that span, so there may have been some rest issues with the fifth and sixth starters. But, as I mentioned, Johnson was fully rested when he started with three days of rest. The only reason that we should have expected him to take an extra day off is because of the way OOTP handles starters in this era.

How, then, would the starts shake out if the game allowed the top starters to start with three days of rest? Calculating for Johnson, Groom, and Hughes, all of whom have 75-80 stamina ratings, the chart would look like this:

1 - Johnson
2 - Groom
3 - Hughes
4 - generic fourth starter
5 - generic fifth starter
0 - off-day
(xx) - double-header

APRIL
- - - - 1 2 3
0 1 2 3 4 1 2
0 3 1 2 4 3 1
0 2 3

MAY
- - - 1 4 2 3
0 1 2 3 4 1 2
0 3 1 2 4 3 1
0 2 1 3 4 2 1
0 3 2 1 (45) 0

JUNE
- - - - - - 2
1 4 3 2 1 4 3
2 1 4 3 2 1 4
3 0 1 2 3 4 1
0 2 3 1 4 2 3
0

JULY
- 1 2 3 (45) 1 2
0 0 1 2 3 4 1
0 2 3 1 4 2 3
0 1 2 3 4 0 1
2 3 4 1

AUGUST
- - - - 2 3 4
1 2 3 4 1 2 3
4 0 1 2 3 4 1
0 2 3 1 4 2 3
0 1 2 3 4 1 2

SEPTEMBER
0 (34) 1 2 5 3 1
0 0 2 1 3 4 2
1 0 3 2 1 4 3
2 0 0 1 2 3 4
0 1

OCTOBER
- - 2 0 1 3 2

This yields the following totals for game starts:

Johnson - 43
Groom - 42
Hughes - 38
4th starter - 28
5th starter - 3
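
Those totals can be re-derived by counting the chart directly. The Python sketch below is just a convenience for checking the tally: the parsing rules ("-" is a blank calendar cell, "0" is an off-day, a parenthesized pair is a double-header with one start for each digit) follow the legend above, and the function name is made up for illustration.

```python
from collections import Counter

# The hypothetical three-days-rest chart, copied from above.
CHART = """
APRIL
- - - - 1 2 3
0 1 2 3 4 1 2
0 3 1 2 4 3 1
0 2 3

MAY
- - - 1 4 2 3
0 1 2 3 4 1 2
0 3 1 2 4 3 1
0 2 1 3 4 2 1
0 3 2 1 (45) 0

JUNE
- - - - - - 2
1 4 3 2 1 4 3
2 1 4 3 2 1 4
3 0 1 2 3 4 1
0 2 3 1 4 2 3
0

JULY
- 1 2 3 (45) 1 2
0 0 1 2 3 4 1
0 2 3 1 4 2 3
0 1 2 3 4 0 1
2 3 4 1

AUGUST
- - - - 2 3 4
1 2 3 4 1 2 3
4 0 1 2 3 4 1
0 2 3 1 4 2 3
0 1 2 3 4 1 2

SEPTEMBER
0 (34) 1 2 5 3 1
0 0 2 1 3 4 2
1 0 3 2 1 4 3
2 0 0 1 2 3 4
0 1

OCTOBER
- - 2 0 1 3 2
"""


def tally_starts(chart):
    """Count starts per rotation slot in a calendar-style chart."""
    counts = Counter()
    for line in chart.splitlines():
        for token in line.split():
            digits = token.strip("()")  # "(45)" -> "45", a double-header
            if not digits.isdigit():
                continue  # month names and "-" placeholders
            for slot in digits:
                if slot != "0":  # "0" marks an off-day
                    counts[int(slot)] += 1
    return counts


totals = tally_starts(CHART)
for slot in sorted(totals):
    print(slot, totals[slot])
print("total starts:", sum(totals.values()))
```

Run against the chart as posted, the count agrees with the totals listed above and adds up to 154 starts.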

At first glance, this chart looks a lot like the four-man strict rotation that I rejected at the start of this test. That's true, but then that's because it's an artificial construct. It doesn't take into account real-life factors like injuries, rainouts/double-headers, and, perhaps most importantly, the use of starters as relievers.

That last point merits some closer analysis....

joefromchicago · 09-22-2020, 10:20 PM

1912 SENATORS REPLAY (continued)

As noted previously, Walter Johnson was used as a high-leverage reliever in addition to his starting duties. In my replay, however, Johnson never appeared in relief. His stamina should allow him not only to start with three days of rest but also to pitch in relief occasionally, either when he's fully rested or when he is slightly tired. In 1912, for instance, the Big Train started four games when he had pitched the day before in relief. I don't think that's even possible in OOTP, but it should be. Nevertheless, Johnson's use as a reliever probably meant that he lost about three or four starts over the season, and those starts went to pitchers lower in the rotation like Akers and Engel. The schedule and the way pitchers were used in dual roles, therefore, had a way of spreading out starts among the staff that can't be duplicated using a four- or five-man rotation.

In my replay, not only did Johnson not pitch in relief, but the other top starters (Groom, Hughes, and Cashion) combined for only nine relief appearances. Compare that to their real-life counterparts who together made forty trips from the bullpen in 1912. Instead of filling in as relievers in my replay, those pitchers were busy throwing complete games. That tracks my previous experience in which pitcher stamina paradoxically proved to be both too high and too low. Too high because pitchers were completing too many games and too low because the top starters weren't starting as often as they should. My Senators completed 111 games - an impressive 72% completion rate. The real-life Senators managed only 98 CGs (64%), which itself was higher than the MLB rate of 58% (which translates to 89.5 CGs).
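
The completion rates above are simple ratios, sketched here in Python. The 154-game denominator is an assumption taken from the 154 starts in the replay charts, and the helper name is illustrative only.

```python
# Assumption: 154 team games, one start per game, as in the replay charts.
GAMES = 154


def cg_rate(complete_games, games=GAMES):
    """Share of games that ended as complete games."""
    return complete_games / games


sim_rate = cg_rate(111)      # replay Senators
real_rate = cg_rate(98)      # real-life Senators
league_rate = cg_rate(89.5)  # league-average CGs quoted above

print(f"replay: {sim_rate:.0%}")
print(f"real life: {real_rate:.0%}")
print(f"league equivalent: {league_rate:.0%}")
```

On that basis the 13 extra complete games in the replay work out to roughly eight percentage points above the real club's rate.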

Part of the problem may be the game's pinch-hitter setting. The out-of-the-box setting for "pinch hit for pitchers" is "very rarely." I'm not sure if that's realistic. Pinch-hitting stats are not broken out for 1912, so I looked at the box scores for each of the 56 games where the Senators used two or more pitchers. Of those, manager Clark Griffith replaced a pitcher with either a pinch-hitter or a pinch-runner in 30 games (including Walter Johnson, who pinch-hit three times - that guy was a machine!). Now, in some of those games the replacement came in the Senators' final at-bat, so pinch-hitting for the pitcher didn't result in a pitching change. Still, this gives some sense as to how often a team in this era might use pinch-hitters.

By modern standards, Washington definitely used pinch-hitters sparingly. Part of the reason for this, however, may simply be due to the fact that the club didn't have a very deep bench. One of the advantages to using historical transactions is that it gives a pretty accurate picture of how many players were on a team's roster on any given day. My league settings gave each team a 20-man roster, but Washington rarely suited up 20 players for a game. Usually it was 18 or 19, and sometimes it dipped below 17. That not only restricted the availability of pinch-hitters, but I suspect that the AI kept starters in the game because it didn't think that there were a lot of pitchers available to relieve either.

From a modern perspective, that makes sense. But at a time when pitchers were expected to start and relieve, the AI should deem every pitcher on the bench to be potentially available as a reliever. A starter with an 80 stamina who is "slightly tired" is likely still a better choice to relieve than a has-been with a 35 stamina who has been relegated to the bullpen because he's too flammable to start. That the AI always prefers the has-been is reflective of the AI's modern-day mindset. If the AI thought more like Clark Griffith and less like Joe Maddon, it might make sense to boost the pinch-hitter setting from "very rarely" to "rarely." That, at least, might go a ways toward shaving a few complete games off the team's totals.

Another point that needs mentioning is the "Hal Dues" problem that I brought up in connection with my 1977 Expos replay. As I noted, OOTP seems to penalize starters who didn't log a lot of innings by giving them low stamina ratings, even though this has the effect of penalizing them twice for being, in effect, bad pitchers. This problem is even worse in the pre-reliever era, where just about every pitcher was expected to be a starter.

A good example of this is Paul Musser, who appeared in seven games for Washington in 1912, twice as a starter. His first appearance and start was on June 6. He pitched well for five innings but got into a bases-loaded jam in the sixth and Griffith brought in Johnson to finish the game. The Washington Evening Star gushed over his performance, saying that "Musser proved beyond question that he is entitled to be worked regularly from this out, and that he will win a majority of his games." That proved overly optimistic, as Musser was roughed up in his next start and Griffith exiled him thereafter to the bullpen. Musser, it's true, wasn't very good, but the point here is that the Senators expected him to be a starter. As such, he should have the stamina to be a regular member of the rotation. Indeed, Musser had a long career as a starter in the minor leagues. OOTP, however, gives him a 35 stamina. But Musser didn't have low stamina - at least not in the eyes of his contemporaries. He just couldn't get the ball consistently over the plate, giving up 16 hits and 16 (!) walks in 20.2 innings. In short, he wasn't tired, he was just lousy.

The problem here is two-fold: not only does OOTP penalize pitchers like Musser twice for being bad, but giving low stamina ratings to these types of pitchers creates an unrealistic split between starters and relievers that simply didn't exist in the pre-reliever era. The AI looks at a guy like Musser and his 35 stamina and says "he's only good as a reliever, I'll put him into the bullpen." Then the AI proceeds to use that reliever like it's 2020 rather than 1912. Far better to give a guy like Musser a decent stamina so that he can fill in, when necessary, as either a starter or reliever - just as he did in real life.

luckymann · 09-23-2020, 05:08 AM

Fascinating stuff, brilliant!

I'm currently doing my first "historical" sim, with the quotes invoked because it is a random debut league. There are certainly some strange paradoxes I am seeing with regard to each of the things you mention. Starters will often go 10-15 innings despite there being a bullpen full of relievers, most of whom see very little action throughout the course of a season. And at the start of each season I have to go through and edit injury proneness settings to reduce fragility among SP, of which close enough to 50% are given a FRAGILE rating.

We fall into the trap of thinking that just because OOTP is such a fantastic game it should / can do EVERYTHING perfectly. That is unrealistic. Still, it would be nice to see some improvements toward a more accurate while still workable replication of IRL scenarios.

joefromchicago · 09-23-2020, 10:52 AM

Quote:

Originally Posted by joefromchicago

By modern standards, Washington definitely used pinch-hitters sparingly. Part of the reason for this, however, may simply be due to the fact that the club didn't have a very deep bench. One of the advantages to using historical transactions is that it gives a pretty accurate picture of how many players were on a team's roster on any given day. My league settings gave each team a 20-man roster, but Washington rarely suited up 20 players for a game. Usually it was 18 or 19, and sometimes it dipped below 17. That not only restricted the availability of pinch-hitters, but I suspect that the AI kept starters in the game because it didn't think that there were a lot of pitchers available to relieve either.

I want to return to this point because I don't think I explained it very well. For 1912, the standard OOTP setting for "pinch hit for pitchers" is "very rarely." In my replay, though, I think that the pinch-hitter setting, combined with the team's shallow bench, resulted in an effective setting of "almost never." I don't know if this is indeed what was happening - the interaction between the pinch-hitter setting and the roster sizes was not on my radar when I went into this replay - but it seems logical. If the pitcher's spot in the lineup is coming up and there's only one guy on the bench available to pinch hit, I would imagine that the AI takes that into consideration when determining whether to pinch hit for the pitcher or not. Bumping up the league setting to "rarely" or even "normal" might be effective in counterbalancing this situation. I don't know, but that's something I will definitely look at in the future.

Another way to approach this problem would be through manager strategies. Again, that was not something that I was focused on for this replay, but I discussed it at some length in connection with my 1922 Tigers replay. Since that replay OOTP has done nothing that I'm aware of with regard to manager strategies, even though OOTP, in the recent past, radically revised strategy setting options, going from 36 game-situation sliders to 16. It seems to me that this is an area of the game that needs far more attention than it is receiving, as it could be a key component in making OOTP teams from the pre-reliever era actually behave like teams in that era did.

luckymann · 09-24-2020, 02:07 AM

Joe, any chance you can re-up Part 1 of this series, as the link seems to be broken, perhaps because it was on the old board?

joefromchicago · 09-24-2020, 08:43 AM

Quote:

Originally Posted by luckymann

Joe, any chance you can re-up Part 1 of this series, as the link seems to be broken, perhaps because it was on the old board?

Hmmmm, the links I put in my first post work for me, and they both have the "forums" address rather than the old "www" address. Try this:

https://forums.ootpdevelopments.com/...d.php?t=288830

luckymann · 09-24-2020, 10:55 AM

Quote:

Originally Posted by joefromchicago

Hmmmm, the links I put in my first post work for me, and they both have the "forums" address rather than the old "www" address. Try this:

https://forums.ootpdevelopments.com/...d.php?t=288830

got them thanks, although the first one went to the www address. I just tinkered with it and got there eventually!

Bub13 · 09-25-2020, 06:13 PM

Very interesting, JfC. I don't usually play historical, but if I do, this is food for thought.

Also: the "Hal Dues Problem" made me go look up his career stats, as I somehow remembered him having a solid "2-8" rating in Statis-Pro back in '78. And sure enough ... a 2.36 ERA in 99 IP for Mr. Dues. Where I pulled that info from I'll never know...

09-22-2020, 10:09 PM	#1
joefromchicago Hall Of Famer Join Date: Jun 2011 Posts: 3,726	Pitcher usage in the pre-reliever era, part III REPLAY 1: THE 1977 EXPOS This is the third installment in my continuing series on how OOTP uses pitchers in the pre-reliever era. If you haven't seen my previous threads, you can find them here and here. In this current iteration I used OOTP 21 - for the previous threads I used versions 19 and 20 respectively. In part, therefore, the intention here was to see if anything has changed in the latest version. My initial idea was to look past the pre-reliever era and see how OOTP handles the years when baseball was transitioning from stoppers to closers. For those unfamiliar with how OOTP uses those terms, a "stopper" is a reliever who is brought into high-leverage situations late in the game, while a "closer" is a reliever who is specifically brought into a game, usually in the ninth inning, in a save situation. A stopper, therefore, might be brought into a game that is "on the line," even if it's not necessarily a save situation. I used the game's "out of the box" settings. For 1977, this meant: (a) five-man rotations and five-man bullpens; (b) starting rotation mode is "start highest rested;" (c) starters not allowed in relief; (d) use of relievers set to "often;" (e) use of closers set to "sometimes;" (f) pitcher stamina set to "normal." For this test, I chose the 1977 Expos. Coming out of spring training, manager Dick Williams said that his pitching staff consisted of Steve Rogers "and a bunch of other guys." He wasn't kidding. Rogers was clearly the standout on the staff. The rest ranged from mediocre to awful. I was particularly interested in one pitcher: Gerry Hannahs. In real life, Hannahs went straight from double-A to the big club in late 1976 and came out of spring training in 1977 having won a spot in the starting rotation. His OOTP ratings, however, are pretty bad, and the AI had Dan Schatzeder and Dan Warthen in the starting rotation instead. 
Schatzeder, in fact, didn't join the team until the rosters expanded in September while Warthen started the season in the bullpen. I wanted to see if the game would use Hannahs as an emergency starter, which is the role he would fall into when it became apparent that he didn't "have the goods." OOTP doesn't do very well at finding a place for pitchers who straddle the line between the rotation and the bullpen. The game allows relievers to be assigned "emergency SP" as their primary or secondary roles, but I've never seen the AI actually do that, and the AI certainly didn't do it in my 1977 replay. And since the league setting prevented starters from appearing in relief, there was very little opportunity for a pitcher to cross over between starting and relieving. Hannahs, in real life, had eight appearances, with seven coming in starts, and by late May he was back in the minors. In my replay, I had Hannanhs in the starting rotation at the start of the season, and he was, predictably, not good. I then shifted him to the bullpen, hoping that, as a pitcher with a 60 stamina (20-80 scale), the AI might use him as an emergency starter. That didn't happen. I then designated his primary role as "emergency SP." That didn't help either. In all, Hannahs compiled three starts, with none coming after June 12. He had 29 appearances after that date, all in relief, even though, for a large part of that time, he was designated as an "emergency SP." I guess when much of your rotation is a disaster, "emergency" is sort of relative. The other interesting pitcher on the Expos staff was Don Stanhouse. Some of you may remember him as a reliever with Baltimore in the late '70s, but he's another guy who, earlier in his career, switched back and forth between the starting rotation and the bullpen. In 1977, he started 16 games for the Expos and relieved in 31. 
His first 12 appearances were all starts, but by the end of May Williams had seen enough of Stanhouse and his 4.79 ERA and demoted him to the bullpen. He picked up a few starts in June and July but by August he was a full-time reliever and by September he was the team's closer. That's the sort of progression that I wanted to see if OOTP could replicate. It didn't. Stanhouse ended up being the number 2 starter all season long and didn't log a single relief appearance. In large part, that's due to the fact that no one else had better ratings than he did. And that, in fact, turned out to be the second biggest takeaway from this replay: the ratings are, at times, bafflingly inexplicable. I can't argue with the game's basic pitching ratings - those all work together in some mysterious fashion to produce realistic results. But the stamina ratings and the "current role" designations, at times, make no apparent sense. In general, stamina ratings were way too low for pitchers who actually started games - and this is true not just for 1977 but for earlier years that I've looked at. Hannahs, for instance, has a 60 stamina rating with a current role as "strictly bullpen," while Stanhouse has a 35 stamina rating and a current role as "starter." That despite the fact that Hannahs was primarily a starter in 1977 and Stanhouse was, by the end of the season, exclusively a reliever. In my experience, it appears that OOTP downgrades the stamina ratings of starters who don't pitch very many innings. I suppose the reasoning is that giving them higher stamina ratings would make them better than they actually were. To me, though, that doesn't make much sense. If they started in the big leagues, their managers must have believed that they had enough gas in the tank to last at least five or six innings and possibly more. Hal Dues, for instance, a pitcher who started four games for the Expos in 1977 as a September call-up, gets a 25 stamina rating in OOTP. 
But Dues wasn't pulled from games because he was tired, he was pulled because he was bad. In my replay, he barely had a cup of coffee, pitching only 3.1 innings, all in relief. There's really no reason to start Dues, but a guy like that should nevertheless be able to pitch into the late innings. In OOTP, therefore, Dues's low stamina results in him being a bad pitcher, when, in reality, it was his bad pitching that caused him to log so few innings. His stamina rating, in other words, doesn't reflect his stamina so much as it reflects his overall pitching ability. That, to my thinking, unfairly dings him twice: he gets bad pitching ratings and a low stamina for what is really just his bad pitching.

But, as I mentioned above, that is only the second biggest takeaway from this replay. The biggest is that, at least for 1977, the game does not determine who will start based on how rested the pitcher is the day before the game. That was a big revelation to me in my second set of replays with OOTP 20, as it became clear when playing the 1920 White Sox that a pitcher who may be only slightly tired today will not get the start tomorrow, even if he will be 100% rested by then. This meant that starters who could easily start with three days of rest would, in OOTP, still only pitch on four days of rest. That, in turn, meant that the top starters in the rotation would end up getting too few starts while starters at the bottom of the rotation would get too many.

In my 1977 replay, however, I found to my surprise that this rule didn't seem to apply. Although it would normally take Rogers, with an 80 stamina rating, four days of rest to recover from a start, he would actually pitch on the fifth day. In other words, even if Rogers was slightly tired today, he would still get the start tomorrow when he would be 100% rested.
That would be a big step forward for the game if the same thing applied to earlier eras, where starters commonly started on two or three days of rest but where OOTP consistently forces them to take an extra day or two of rest between starts.

That merited a closer look....

__________________
American-Ethnic (and Canadian) Namesets
Historical Minor League Schedules
1870s City/Team Nickname Randomizers
"It's Usually Sunny in Philadelphia" weather mod
Negro League Schedules

09-22-2020, 10:13 PM	#2
joefromchicago Hall Of Famer Join Date: Jun 2011 Posts: 3,726	REPLAY 2: THE 1912 SENATORS

For this replay, I chose the 1912 Washington Senators. Like the 1977 Expos, the Senators had a pitching staff composed of Walter Johnson and "a bunch of other guys." As was common in those days, Johnson was used both as the top starter and as a high-leverage reliever. In fact, Johnson's 37 starts were second only to teammate Bob Groom's 40 starts. That's because Johnson also made 13 appearances out of the bullpen. Together, then, Johnson and Groom accounted for 50% of the team's starts. The rest were parcelled out among nine other hurlers. In part, that churn among the bottom half of the staff was the result of front-office moves to bolster the staff, as it became clear during the season that the Nats had a shot at the league title. As it turned out, the club finished second, but they were still a distant 14 games behind the pennant-winning Red Sox.

The OOTP settings for 1912 are: (a) four-man rotations and three-man bullpens; (b) starting rotation mode is "strict order;" (c) starters allowed in relief; (d) use of relievers set to "very rarely;" (e) use of closers set to "very rarely;" (f) pitcher stamina set to "high."

As I've explained in my previous threads, OOTP's rotation settings are unrealistic for the pre-reliever era. I use six-man rotations and, for 1912, I set the number of relievers to zero. I did, however, set the number of position players at 13 and the roster size at 20, which gives the game one "open" position that can be filled with either a position player or a pitcher. In my experience, when given this kind of choice, the AI almost always chooses an extra pitcher, and this replay proved no exception.

The rotation setting is also unrealistic. "Strict order" would give far too many starts to the bottom half of the staff. As noted above, two pitchers accounted for half of all the team's starts.
I suppose Johnson and Groom would duplicate that result if the club had a four-man rotation that was set to a strict order, but those settings would yield grossly unrealistic totals for the rest of the staff.

As I mentioned, there was some turnover in personnel as the club sought to bolster the pitching staff down the pennant stretch. As a result, I decided to play with historical transactions on. There's a trade-off here. On one hand, using historical transactions should have helped distribute innings among the staff on a more historical basis. On the other hand, having historical transactions on meant that no games were cancelled or postponed because of weather and there were no random injuries. As I've mentioned in my previous threads, one reason why starters at the bottom of a team's staff would get the occasional start was because of the large number of double-headers that teams played in that era, and those double-headers, in turn, were largely the result of rainouts. That's a trade-off I was willing to make, given that my primary aim was to see how the game handled pitchers' rest.

My 1977 replay revealed that the game wasn't forcing starters to take an extra day of rest. That, however, was with a league stamina setting of "normal" and pitchers with high stamina needing four days of rest between starts. How would the game handle pitchers with high stamina who only needed three days of rest?

The answer: the same as before. It quickly became apparent to me that OOTP was still making starting assignments the day before the game. Walter Johnson, an absolute beast who had an 80 stamina and who completed 32 of his 36 starts in my replay, recovered from starts more-or-less according to the following schedule:

Day 0 (day of game): 0%
Day 1: 15%
Day 2: 45%
Day 3: 70%
Day 4: 100%

The other regular starters had similarly high stamina ratings and recovered at about the same rate (the biggest difference was on the first day after a start).
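For anyone who wants to see why a day-before check costs a starter a day, here is a toy sketch in Python. It is only a model of the behavior I'm inferring from the replay, not OOTP's actual logic: the function names are made up for illustration, and the only real data is the recovery schedule above.

```python
# Toy model (an assumption, not OOTP's real code): percent rested by days
# since the last start, copied from Johnson's recovery schedule above.
RECOVERY = [0, 15, 45, 70, 100]  # index = days since last start


def rested_pct(days_since_start: int) -> int:
    """Percent rested; anything past the end of the schedule is 100%."""
    return RECOVERY[min(days_since_start, len(RECOVERY) - 1)]


def first_start_day(check_day_before: bool) -> int:
    """Earliest day (last start = day 0) on which the pitcher may start again.

    If check_day_before is True, the pitcher must already be 100% rested on
    the day BEFORE the game, which is the hypothesized behavior.
    """
    day = 1
    while True:
        eval_day = day - 1 if check_day_before else day
        if rested_pct(eval_day) == 100:
            return day
        day += 1


# Days of rest = start day minus one (the days strictly between starts).
print("checked on game day:   ", first_start_day(False) - 1, "days of rest")
print("checked the day before:", first_start_day(True) - 1, "days of rest")
```

Under these assumptions, a game-day check lets Johnson start on day 4 (three days of rest), while a day-before check pushes him to day 5 (four days of rest).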
As can be seen, Johnson could easily start with three days of rest, even if he threw over 125 pitches in a game. That didn't happen. Instead, the game was still forcing the Washington starters to take an extra day of rest between starts, which yielded the same result that I had seen in my previous replays: the top half of the rotation started too few games and the bottom half started too many.

Here is the breakdown showing the difference between the real life (RL) stats and the stats from my replay (sim), with pitchers listed in roughly the order they appeared in the rotation (* denotes that the player was on the team for only part of the season):

Walter Johnson: (RL) 50 G, 37 GS, 369.0 IP | (sim) 36 G, 36 GS, 324.2 IP
Bob Groom: (RL) 43 G, 40 GS, 316.0 IP | (sim) 34 G, 34 GS, 288.1 IP
Tom Hughes: (RL) 31 G, 26 GS, 196.0 IP | (sim) 33 G, 30 GS, 269.0 IP
Carl Cashion: (RL) 26 G, 17 GS, 170.1 IP | (sim) 31 G, 25 GS, 220.0 IP
Joe Engel*: (RL) 17 G, 10 GS, 75.0 IP | (sim) 10 G, 10 GS, 67.1 IP
Hippo Vaughn*: (RL) 12 G, 8 GS, 81.0 IP | (sim) 10 G, 9 GS, 83.0 IP
Dixie Walker*: (RL) 9 G, 8 GS, 60.0 IP | (sim) 8 G, 5 GS, 55.1 IP
Jerry Akers*: (RL) 5 G, 1 GS, 20.1 IP | (sim) 3 G, 3 GS, 14.2 IP
Barney Pelty*: (RL) 11 G, 4 GS, 43.2 IP | (sim) 13 G, 1 GS, 30.2 IP
Paul Musser*: (RL) 7 G, 2 GS, 20.2 IP | (sim) 5 G, 1 GS, 14.0 IP

I had the bench coach set the staff. Johnson, Groom, and Hughes were the top three starters from beginning to end. Cashion and Engel both fluctuated between the fourth and fifth spots, depending on whether someone better was around. For instance, when the team acquired Vaughn in June, Cashion and Engel both moved down a rung on the ladder to fifth and sixth respectively. So the fourth spot was basically Cashion-Vaughn-Engel, while the fifth spot was take-your-pick Cashion-Engel-Pelty-Walker-Musser. There were several times during the season when I was down to a five-man or even four-man staff, but there was never an occasion when the game had to start someone who wasn't fully rested.
Undoubtedly that wouldn't have been the case if I was playing with rainouts and injuries enabled, but, like I said, that was a conscious decision on my part. Still, the results were in line with my previous replays. The top two starters combined for seven fewer starts than they did in real life while the next three starters (Hughes, Cashion, Vaughn) gained thirteen. Cashion, who spent a large part of his time as the number 5 starter, was the biggest gainer with eight more starts.

continued....
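The start differentials quoted in this post can be double-checked with a few lines of Python. This is just a bookkeeping sketch: the numbers are the GS figures from the RL/sim breakdown, and the dictionary and function names are made up for illustration.

```python
# Games started for the top five starters, from the breakdown in this post:
# (real life, replay).
STARTS = {
    "Johnson": (37, 36),
    "Groom": (40, 34),
    "Hughes": (26, 30),
    "Cashion": (17, 25),
    "Vaughn": (8, 9),
}


def start_diff(names):
    """Replay starts minus real-life starts, summed over the given pitchers."""
    return sum(STARTS[n][1] - STARTS[n][0] for n in names)


top_two = start_diff(["Johnson", "Groom"])
next_three = start_diff(["Hughes", "Cashion", "Vaughn"])
cashion = start_diff(["Cashion"])

print(f"Top two: {top_two:+d}, next three: {next_three:+d}, Cashion: {cashion:+d}")
```

A negative number means the replay gave those pitchers fewer starts than they had in real life.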

09-22-2020, 10:15 PM	#3
joefromchicago Hall Of Famer Join Date: Jun 2011 Posts: 3,726	1912 SENATORS REPLAY (continued)

How did this play out during the season? The following chart shows the starting assignments in my replay:

1 - Johnson
2 - Groom
3 - Hughes
4 - Cashion
5 - Vaughn
6 - any other starter
0 - off-day
(xx) - double-header

APRIL
Su   Mo   Tu   We   Th   Fr   Sa
-    -    -    -    1    2    3
0    4    1    2    3    6    1
0    2    3    6    1    6    2
0    3    1

MAY
Su   Mo   Tu   We   Th   Fr   Sa
-    -    -    4    2    6    3
0    1    2    4    3    6    1
0    2    3    4    1    6    2
0    3    1    6    2    6    3
0    1    2    4    (36) 0

JUNE
Su   Mo   Tu   We   Th   Fr   Sa
-    -    -    -    -    -    1
2    4    6    1    2    3    4
5    1    2    3    4    6    1
2    0    3    6    1    2    4
0    3    1    2    5    4    3
0

JULY
Su   Mo   Tu   We   Th   Fr   Sa
-    1    2    5    (34) 6    1
0    0    2    3    1    5    4
0    2    1    3    5    3    2
0    1    3    4    2    0    1
3    5    2    6

AUGUST
Su   Mo   Tu   We   Th   Fr   Sa
-    -    -    -    1    3    5
2    4    1    3    5    2    4
1    0    3    2    5    1    4
0    2    3    1    5    4    2
0    1    3    4    2    6    1

SEPTEMBER
Su   Mo   Tu   We   Th   Fr   Sa
0    (34) 2    6    1    3    4
0    0    1    2    3    4    6
1    0    2    4    6    1    6
2    0    0    1    3    2    4
0    1

OCTOBER
Su   Mo   Tu   We   Th   Fr   Sa
-    -    3    0    2    4    1

Focusing on Johnson's starts, there's only one instance where he started with three days of rest. That was between his second and third starts in April. I think that was because the team used a couple of starters in relief during that span, so there may have been some rest issues with the fifth and sixth starters. But, as I mentioned, Johnson was fully rested when he started with three days of rest. The only reason that we should have expected him to take an extra day off is because of the way OOTP handles starters in this era.

How, then, would the starts shake out if the game allowed the top starters to start with three days of rest?
Calculating for Johnson, Groom, and Hughes, all of whom have 75-80 stamina ratings, the chart would look like this:

1 - Johnson
2 - Groom
3 - Hughes
4 - generic fourth starter
5 - generic fifth starter
0 - off-day
(xx) - double-header

APRIL
Su   Mo   Tu   We   Th   Fr   Sa
-    -    -    -    1    2    3
0    1    2    3    4    1    2
0    3    1    2    4    3    1
0    2    3

MAY
Su   Mo   Tu   We   Th   Fr   Sa
-    -    -    1    4    2    3
0    1    2    3    4    1    2
0    3    1    2    4    3    1
0    2    1    3    4    2    1
0    3    2    1    (45) 0

JUNE
Su   Mo   Tu   We   Th   Fr   Sa
-    -    -    -    -    -    2
1    4    3    2    1    4    3
2    1    4    3    2    1    4
3    0    1    2    3    4    1
0    2    3    1    4    2    3
0

JULY
Su   Mo   Tu   We   Th   Fr   Sa
-    1    2    3    (45) 1    2
0    0    1    2    3    4    1
0    2    3    1    4    2    3
0    1    2    3    4    0    1
2    3    4    1

AUGUST
Su   Mo   Tu   We   Th   Fr   Sa
-    -    -    -    2    3    4
1    2    3    4    1    2    3
4    0    1    2    3    4    1
0    2    3    1    4    2    3
0    1    2    3    4    1    2

SEPTEMBER
Su   Mo   Tu   We   Th   Fr   Sa
0    (34) 1    2    5    3    1
0    0    2    1    3    4    2
1    0    3    2    1    4    3
2    0    0    1    2    3    4
0    1

OCTOBER
Su   Mo   Tu   We   Th   Fr   Sa
-    -    2    0    1    3    2

This yields the following totals for game starts:

Johnson - 43
Groom - 42
Hughes - 38
4th starter - 28
5th starter - 3

At first glance, this chart looks a lot like the four-man strict rotation that I rejected at the start of this test. That's true, but then that's because it's an artificial construct. It doesn't take into account real-life factors like injuries, rainouts/double-headers, and, perhaps most importantly, the use of starters as relievers.

That last point merits some closer analysis....

Last edited by joefromchicago; 09-22-2020 at 10:48 PM.

09-22-2020, 10:20 PM	#4
joefromchicago Hall Of Famer Join Date: Jun 2011 Posts: 3,726	1912 SENATORS REPLAY (continued)

As noted previously, Walter Johnson was used as a high-leverage reliever in addition to his starting duties. In my replay, however, Johnson never appeared in relief. His stamina, however, should allow him not only to start with three days of rest, but it should also allow him to pitch in relief occasionally, either when he's fully rested or when he is slightly tired. In 1912, for instance, the Big Train started four games when he had pitched the day before in relief. I don't think that's even possible in OOTP, but it should be. Nevertheless, Johnson's use as a reliever probably meant that he lost about three or four starts over the season, and those starts went to pitchers lower in the rotation like Akers and Engel. The schedule and the way pitchers were used in dual roles, therefore, had a way of spreading out starts among the staff that can't be duplicated using a four- or five-man rotation.

In my replay, not only did Johnson not pitch in relief, but the other top starters (Groom, Hughes, and Cashion) combined for only nine relief appearances. Compare that to their real-life counterparts who together made forty trips from the bullpen in 1912. Instead of filling in as relievers in my replay, those pitchers were busy throwing complete games. That tracks my previous experience in which pitcher stamina paradoxically proved to be both too high and too low. Too high because pitchers were completing too many games and too low because the top starters weren't starting as often as they should. My Senators completed 111 games - an impressive 72% completion rate. The real-life Senators managed only 98 CGs (64%), which itself was higher than the MLB rate of 58% (which translates to 89.5 CGs).

Part of the problem may be the game's pinch-hitter setting. The out-of-the-box setting for "pinch hit for pitchers" is "very rarely." I'm not sure if that's realistic.
Pinch-hitting stats are not broken out for 1912, so I looked at the box scores for each of the 56 games where the Senators used two or more pitchers. Of those, manager Clark Griffith replaced a pitcher with either a pinch-hitter or a pinch-runner in 30 games (including Walter Johnson, who pinch-hit three times - that guy was a machine!). Now, in some of those games the replacement came in the Senators' final at-bat, so pinch-hitting for the pitcher didn't result in a pitching change. Still, this gives some sense as to how often a team in this era might use pinch-hitters.

By modern standards, Washington definitely used pinch-hitters sparingly. Part of the reason for this, however, may simply be due to the fact that the club didn't have a very deep bench. One of the advantages to using historical transactions is that it gives a pretty accurate picture of how many players were on a team's roster on any given day. My league settings gave each team a 20-man roster, but Washington rarely suited up 20 players for a game. Usually it was 18 or 19, and sometimes it dipped below 17. That not only restricted the availability of pinch-hitters, but I suspect that the AI kept starters in the game because it didn't think that there were a lot of pitchers available to relieve either.

From a modern perspective, that makes sense. But at a time when pitchers were expected to start and relieve, the AI should deem every pitcher on the bench to be potentially available as a reliever. A starter with an 80 stamina who is "slightly tired" is likely still a better choice to relieve than a has-been with a 35 stamina who has been relegated to the bullpen because he's too flammable to start. That the AI always prefers the has-been is reflective of the AI's modern-day mindset. If the AI thought more like Clark Griffith and less like Joe Maddon, it might make sense to boost the pinch-hitter setting from "very rarely" to "rarely."
That, at least, might go a ways toward shaving a few complete games off the team's totals.

Another point that needs mentioning is the "Hal Dues" problem that I brought up in connection with my 1977 Expos replay. As I noted, OOTP seems to penalize starters who didn't log a lot of innings by giving them low stamina ratings, even though this has the effect of penalizing them twice for being, in effect, bad pitchers. This problem is even worse in the pre-reliever era, where just about every pitcher was expected to be a starter.

A good example of this is Paul Musser, who appeared in seven games for Washington in 1912, twice as a starter. His first appearance and start was on June 6. He pitched well for five innings but got into a bases-loaded jam in the sixth and Griffith brought in Johnson to finish the game. The Washington Evening Star gushed over his performance, saying that "Musser proved beyond question that he is entitled to be worked regularly from this out, and that he will win a majority of his games." That proved overly optimistic, as Musser was roughed up in his next start and Griffith exiled him thereafter to the bullpen.

Musser, it's true, wasn't very good, but the point here is that the Senators expected him to be a starter. As such, he should have the stamina to be a regular member of the rotation. Indeed, Musser had a long career as a starter in the minor leagues. OOTP, however, gives him a 35 stamina. But Musser didn't have low stamina - at least not in the eyes of his contemporaries. He just couldn't get the ball consistently over the plate, giving up 16 hits and 16 (!) walks in 20.2 innings. In short, he wasn't tired, he was just lousy.

The problem here is two-fold: not only does OOTP penalize pitchers like Musser twice for being bad, but giving low stamina ratings to these types of pitchers creates an unrealistic split between starters and relievers that simply didn't exist in the pre-reliever era.
The AI looks at a guy like Musser and his 35 stamina and says "he's only good as a reliever, I'll put him into the bullpen." Then the AI proceeds to use that reliever like it's 2020 rather than 1912. Far better to give a guy like Musser a decent stamina so that he can fill in, when necessary, as either a starter or reliever - just as he did in real life.

09-25-2020, 06:13 PM	#10
Bub13 All Star Reserve Join Date: Apr 2014 Location: Maine Posts: 748	Very interesting, JfC. I don't usually play historical, but if I do, this is food for thought. Also: the "Hal Dues Problem" made me go look up his career stats, as I somehow remembered him having a solid "2-8" rating in Statis-Pro back in '78. And sure enough ... a 2.36 ERA in 99 IP for Mr. Dues. Where I pulled that info from I'll never know... __________________ Introducing Your Hawaii Islanders!

09-23-2020, 05:08 AM	#5
luckymann Hall Of Famer Join Date: Nov 2019 Posts: 14,214	Fascinating stuff, brilliant! I'm currently doing my first "historical" sim, with the quotes invoked because it is a random debut league. There are certainly some strange paradoxes I am seeing with regard to each of the things you mention. Starters will often go 10-15 innings despite there being a bullpen full of relievers, most of whom see very little action throughout the course of a season. And at the start of each season I have to go through and edit injury proneness settings to reduce fragility among SP, of which close enough to 50% are given a FRAGILE rating. We fall into the trap of thinking that just because OOTP is such a fantastic game it should / can do EVERYTHING perfectly. That is unrealistic. Still, it would be nice to see some improvements toward a more accurate while still workable replication of IRL scenarios.

09-24-2020, 02:07 AM	#7
luckymann Hall Of Famer Join Date: Nov 2019 Posts: 14,214	Joe, any chance you can re-up Part 1 of this series, as the link seems to be broken, perhaps because it was on the old board?