Potential Issue with reversed RHB/LHB performance against RHP

thegoldengod · 04-02-2021, 01:40 AM

When doing a bunch of statistical analysis looking at ratings and impact on the game engine, I noticed that something that is thought to be conventional wisdom is out of whack when it comes to the batter vs pitcher match-up and what side of the plate the batter is on. The conventional wisdom is that opposite side of the plate matchups benefit the hitter since they can see the ball come out of the pitcher's hand better. However, when looking at 22, that does not appear to be the case at the moment.

I fired up an MLB quickstart in both build 30 and 34. I retired every LHP in the world, and ran the AI on every team. Every team was set to play in the same ballpark, with all park factors set to 1.000. So there's only RHP in this entire world, and park factors are not an issue. I then ran a ton of simulation module stuff (every team versus every other team in the league for 2,430 games), with some of these players getting more than 70,000 games in this test.

When I do some linear regression on wOBA to hitting stats, the effect of the side of the plate the batter is on seems reversed from what would be expected. Namely, LHB perform worse than RHB against RHP when all other things are controlled.

This screenshot contains the coefficients for each variable in the model, with bats_1 being RHB and bats_2 being LHB. All things considered, LHB would have a wOBA eight points lower compared to a RHB against RHP.
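A minimal sketch of the kind of regression described here, in Python. The data is synthetic, and the column names (`contact`, `power`, `eye`, `is_lhb`) and the -0.008 effect are assumptions chosen for illustration, not values from the OOTP database or the notebook. It also uses a single handedness indicator with RHB as the baseline, rather than separate bats_1/bats_2 columns.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Synthetic ratings on a 1-200 scale (made-up data, not OOTP output).
contact = rng.uniform(40, 160, n)
power = rng.uniform(40, 160, n)
eye = rng.uniform(40, 160, n)
is_lhb = rng.integers(0, 2, n)  # 1 = LHB, 0 = RHB (baseline)

# Assumed "true" model, used only to generate the fake wOBA values.
TRUE_LHB_EFFECT = -0.008
woba = (0.150 + 0.0006 * contact + 0.0005 * power + 0.0003 * eye
        + TRUE_LHB_EFFECT * is_lhb + rng.normal(0, 0.010, n))

# Ordinary least squares: intercept, three ratings, handedness indicator.
X = np.column_stack([np.ones(n), contact, power, eye, is_lhb])
coefs, *_ = np.linalg.lstsq(X, woba, rcond=None)

names = ["intercept", "contact", "power", "eye", "is_lhb"]
fit = dict(zip(names, coefs))
for name in names:
    print(f"{name:>10}: {fit[name]:+.5f}")
```

The `is_lhb` coefficient is the estimated wOBA gap between an LHB and an RHB with identical ratings, which is the quantity being compared in this thread.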

I thought I had messed up the test, and created a new MLB quickstart and tried it again, only to see the same results.

I thought this might be a sim module thing, so I used that same world, and simmed an entire season using the game engine (not sure if Sim Module uses the same engine), and the same results appeared. When looking at OPS, RHB would expect to have a 30 point bump to OPS compared to a LHB when all their ratings are the same.

I haven't looked at LHPs and whether RHB would get a boost or suffer a penalty in that scenario. But I thought I would bring it up.

Lukas Berger · 04-02-2021, 02:13 AM

Doing anything as unexpected as retiring all the pitchers of a specific handedness is very likely to throw the results you're getting off, as the engine isn't really designed to deal with that sort of situation.

So I'm not really sure if there's too much we can take from this.

I'm sure Matt will see this and might take a look if he sees anything that triggers any alarms for him, but I suspect this is likely just a byproduct of the testing environment.

thegoldengod · 04-02-2021, 02:19 AM

Quote:

Originally Posted by Lukas Berger

Doing anything as unexpected as retiring all the pitchers of a specific handedness is very likely to throw the results you're getting off, as the engine isn't really designed to deal with that sort of situation.

So I'm not really sure if there's too much we can take from this.

I'm sure Matt will see this and might take a look if he sees anything that triggers any alarms for him, but I suspect this is likely just a byproduct of the testing environment.

This was not how things worked in 21, and as an additional measure I simply changed every pitcher that was left-handed to right-handed, and the same thing happened.

Lukas Berger · 04-02-2021, 02:35 AM

Quote:

Originally Posted by thegoldengod

This was not how things worked in 21, and as an additional measure I simply changed every pitcher that was left-handed to right-handed, and the same thing happened.

Ok, thanks. Appreciate the clarification.

thegoldengod · 04-02-2021, 02:43 AM

Just to confirm I'm not crazy. Created an MLB quickstart. Changed the park factors to 1.000 and did nothing to what hand the pitchers throw. Disabled injuries, position player fatigue, and suspensions. Ran computer AI at the start, and then disabled the AI from making any changes to rosters.

Simulated the entire season in the engine (not sim module). Pulled the splits versus RHP and compared them to ratings in the database and LHB are still expected to perform worse against RHP than RHB.

thegoldengod · 04-02-2021, 03:01 AM

Quote:

Originally Posted by thegoldengod

Just to confirm I'm not crazy. Created an MLB quickstart. Changed the park factors to 1.000 and did nothing to what hand the pitchers throw. Disabled injuries, position player fatigue, and suspensions. Ran computer AI at the start, and then disabled the AI from making any changes to rosters.

Simulated the entire season in the engine (not sim module). Pulled the splits versus RHP and compared them to ratings in the database and LHB are still expected to perform worse against RHP than RHB.

Same test as this, except looking at LHP splits and the same thing holds true: opposite side of the plate hitters are expected to perform worse.

Took the same league, same simulation. Looked at LHP splits and the "vsl" hitting ratings in the database. Ran some regressions against it, and LHB perform better against LHP than RHB do.

bats_1 = RHB
bats_2 = LHB

thegoldengod · 04-05-2021, 03:43 PM

I uploaded a jupyter notebook walking through the steps to reach the conclusion that I have posted here. Unless someone can show me otherwise, I am convinced that lefty/right matchups, and vice versa, are not working as expected in OOTP22.

austin101123 · 04-05-2021, 04:51 PM

I think the difference in performance is dictated by the ratings, not the handedness. The handedness affects ratings. It looks like he didn't give everyone the same ratings, and the distribution of ratings, together with the assumption that there are no interaction terms (there should be), makes the results not that useful. Multicollinearity wouldn't matter much here I think, but that should also be checked for.

Checking the interaction of pitcher and batter ratings would be important too, and I don't think they are all linear either. I want to see some residuals. The difference between 40 and 65 control will reduce walks a lot more than the difference between 65 and 90 will, for example. How does pitcher control and batter eye interact with walk rate? And consider that the pitcher control is probably not linearly related to batter performance on its own, either.

thegoldengod · 04-05-2021, 05:42 PM

Quote:

Originally Posted by austin101123

I think the difference in performance is dictated by the ratings, not the handedness. The handedness affects ratings.

Of course ratings will dictate the performance of a player, but they aren't the only thing. It's important to remember the pitcher also has ratings against both RHB and LHB. Just like batters will tend to have better splits against one hand than the other, pitchers do too. The side of the plate they bat from is important to judge what impact those pitcher ratings have on a hitter's expected outcomes.

Quote:

Originally Posted by austin101123

It looks like he didn't give everyone the same ratings, and the distribution of ratings, together with the assumption that there are no interaction terms (there should be), makes the results not that useful. Multicollinearity wouldn't matter much here I think, but that should also be checked for.

You're going to have to explain why it would be important to give everyone the same ratings when you're trying to test this. The whole point of linear regression is to extract the importance of each variable within the model.

Quote:

Originally Posted by austin101123

Checking the interaction of pitcher and batter ratings would be important too, and I don't think they are all linear either.

I am not trying to model the relative impact of a single rating on the rate at which a single event happens. Yes, there are some events and ratings that are not linear (Eye's impact on Walk Rate, Power's impact on HR Rate, Avoid Ks' impact on K Rate). This is not that study.

Quote:

Originally Posted by austin101123

I want to see some residuals.

I went back and added residual plots and r2 scores for each model that was produced. The r2 scores for tests 1 and 4 are all higher than .90, whereas the scores for tests 2 and 3 are much lower, largely because each was a single season with a significantly lower number of plate appearances for every player, which makes the outcomes much more varied.
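As a rough illustration of why fewer plate appearances would pull r2 down, here is a sketch on synthetic data. The single `rating` predictor, the 0.5 per-PA standard deviation, and the PA counts are all assumptions made for the example, not figures from the notebook.

```python
import numpy as np

rng = np.random.default_rng(2)
n_players = 600

# Made-up "true talent": wOBA rises linearly with one rating.
rating = rng.uniform(40, 160, n_players)
true_woba = 0.220 + 0.0008 * rating

def observed(pa):
    """Observed wOBA = talent + sampling noise that shrinks as 1/sqrt(PA)."""
    return true_woba + rng.normal(0, 0.5 / np.sqrt(pa), n_players)

def fit_and_score(x, y):
    """Fit y ~ 1 + x by least squares; return residuals and R^2."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return residuals, 1.0 - ss_res / ss_tot

_, r2_few_pa = fit_and_score(rating, observed(pa=500))      # one season
_, r2_many_pa = fit_and_score(rating, observed(pa=50_000))  # huge sample
print(f"R^2 at 500 PA:    {r2_few_pa:.3f}")
print(f"R^2 at 50,000 PA: {r2_many_pa:.3f}")
```

The underlying relationship is the same in both fits; only the sampling noise differs, so a lower r2 in a single-season test does not by itself mean the model is wrong.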

Quote:

Originally Posted by austin101123

The difference between 40 and 65 control will reduce walks a lot more than the difference between 65 and 90 will, for example. How does pitcher control and batter eye interact with walk rate? And consider that the pitcher control is probably not linearly related to batter performance on its own, either.

What does that have to do with anything I have been writing about? The first and fourth tests had every batter in MLB playing the same number of games against the same pool of pitchers in MLB. A pitcher's ratings are controlled for in such an experiment. Obviously running a single season in tests two and three would not have as strict controls, but as long as every pitcher is pitching from the same side the pitcher's ratings don't matter. Every model from 22 shows that opposite side of the plate batters suffer a penalty. Going back to 21 confirms this, because it shows THE EXACT OPPOSITE of what is happening in 22. So either 22 is right and the engine is behaving as expected and 21 was wrong and it just got fixed, or 22 is not behaving as expected.

austin101123 · 04-05-2021, 09:29 PM

> You're going to have to explain why it would be important to give everyone the same ratings when you're trying to test this. The whole point of linear regression is to extract the importance of each variable within the model.

> Yes, there are some events and ratings that are not linear

If the distribution of batting ratings isn't the same for LHBs and RHBs (and I don't see any reason to assume they have the same distribution), then you need to consider nonlinearity as well as interaction terms to say if LHB or RHB is actually overperforming or not.
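A sketch of the nonlinearity half of this concern, on synthetic data. Every number here is an assumption for illustration: wOBA is generated as a quadratic function of one `rating`, with no handedness effect at all, and LHB are drawn from a narrower, higher rating range than RHB.

```python
import numpy as np

rng = np.random.default_rng(1)
n_per_side = 2000

# Made-up rating distributions that differ by handedness.
rhb_rating = rng.uniform(40, 160, n_per_side)
lhb_rating = rng.uniform(100, 160, n_per_side)
rating = np.concatenate([rhb_rating, lhb_rating])
is_lhb = np.concatenate([np.zeros(n_per_side), np.ones(n_per_side)])

# wOBA depends only on rating (nonlinearly); handedness has zero effect.
woba = 0.100 + 0.00001 * rating ** 2 + rng.normal(0, 0.005, rating.size)

def handedness_coef(columns):
    """Fit woba ~ 1 + columns + is_lhb and return the is_lhb coefficient."""
    X = np.column_stack([np.ones(rating.size), *columns, is_lhb])
    beta, *_ = np.linalg.lstsq(X, woba, rcond=None)
    return float(beta[-1])

linear_only = handedness_coef([rating])
with_square = handedness_coef([rating, rating ** 2])
print(f"is_lhb coefficient, linear model:    {linear_only:+.4f}")
print(f"is_lhb coefficient, with rating^2:   {with_square:+.4f}")
```

In this toy setup the purely linear model reports a handedness penalty that is really leftover curvature, and it largely disappears once the squared term is included. Whether anything similar happens with the actual OOTP data is not something this sketch can show.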

thegoldengod · 04-05-2021, 09:49 PM

Quote:

Originally Posted by austin101123

> You're going to have to explain why it would be important to give everyone the same ratings when you're trying to test this. The whole point of linear regression is to extract the importance of each variable within the model.

> Yes, there are some events and ratings that are not linear

If the distribution of batting ratings isn't the same for LHBs and RHBs (and I don't see any reason to assume they have the same distribution), then you need to consider nonlinearity as well as interaction terms to say if LHB or RHB is actually overperforming or not.

Again, individual events like walk rate, home run rate and strikeout rate are non-linear. However, for those events it's just better to create a low and high model based on whether the player in question has a low/high rating in whatever tool mainly governs that event. I am not attempting to predict any of those. Simple linear regression is more than adequate for what I am attempting to do.

Also, if 22 is working as intended, why is 21 showing the exact opposite results?

thegoldengod · 04-05-2021, 10:32 PM

Updated the notebook to include t-tests and histograms for the two ratings that were shown to not have the same distribution.
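For reference, a minimal sketch of a two-sample Welch t-test of the sort that could compare a rating's distribution between LHB and RHB. The sample values are randomly generated stand-ins, and the function name `welch_t` is hypothetical; the notebook may well use a library routine instead.

```python
import random
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up rating samples: LHB drawn with a higher mean than RHB.
rng = random.Random(0)
lhb_contact = [rng.gauss(110, 20) for _ in range(300)]
rhb_contact = [rng.gauss(100, 20) for _ in range(700)]

t_stat, dof = welch_t(lhb_contact, rhb_contact)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A large absolute t value suggests the two groups do not share the same mean for that rating, which is the situation where the distribution concern raised earlier in the thread becomes relevant.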

Matt Arnold · 04-05-2021, 10:37 PM

One thing related to this to keep in mind is that the game tries for the right balance as a whole. Without any adjustments, then things get double-penalized, since most RHB are rated better vs L than vs R, and similarly for pitchers. We did make sure that the league balance came out closer to real balance.

So, depending on what you are looking at, yeah, you might see cases which may end up being backwards when you look at it one way, but without that then the league devolves too much into platoons.

thegoldengod · 04-05-2021, 10:55 PM

Quote:

Originally Posted by Matt Arnold

One thing related to this to keep in mind is that the game tries for the right balance as a whole. Without any adjustments, then things get double-penalized, since most RHB are rated better vs L than vs R, and similarly for pitchers. We did make sure that the league balance came out closer to real balance.

So, depending on what you are looking at, yeah, you might see cases which may end up being backwards when you look at it one way, but without that then the league devolves too much into platoons.

Matt, are you saying the dev team looked at the code and it's working as intended?

Matt Arnold · 04-05-2021, 11:00 PM

Quote:

Originally Posted by thegoldengod

Matt, are you saying the dev team looked at the code and it's working as intended?

Yes

thegoldengod · 04-05-2021, 11:15 PM

Quote:

Originally Posted by Matt Arnold

Yes

Appreciate the effort to look into it. I brought up MLB splits from 2020, and it appears that LHB indeed either perform a few points worse than, or about even with, RHB against RHP, and suffer heavily against LHP. So if the desire was to replicate last year, that seems to be the case then.

austin101123 · 04-05-2021, 11:16 PM

Since you are getting different results in ootp21 than ootp22, there is something that changed, and I imagine the findings are good enough without considering nonlinearity and interaction of terms.

Quote:

Originally Posted by Matt Arnold

One thing related to this to keep in mind is that the game tries for the right balance as a whole. Without any adjustments, then things get double-penalized, since most RHB are rated better vs L than vs R, and similarly for pitchers. We did make sure that the league balance came out closer to real balance.

So, depending on what you are looking at, yeah, you might see cases which may end up being backwards when you look at it one way, but without that then the league devolves too much into platoons.

I made a post about this problem last year, and was told it was getting worked on. I checked performance, which seemed to be partly but not fully fixed, and I didn't look at ratings. I may look into this again.

Matt Arnold · 04-06-2021, 12:00 AM

Quote:

Originally Posted by thegoldengod

Appreciate the effort to look into it. I brought up MLB splits from 2020, and it appears that LHB indeed either perform a few points worse than, or about even with, RHB against RHP, and suffer heavily against LHP. So if the desire was to replicate last year, that seems to be the case then.

Again, keep in mind MLB stats are for the universe, whereas if you are running regressions, that digs to a different set. The average LHB contact vs R generally is higher than RHB vs R, as is the average LHP stuff vs L compared to RHP stuff vs L, which is why the balance can shift. So the change was mostly about balancing that and not double-penalizing a certain overall talent level.

thegoldengod · 04-06-2021, 12:26 AM

Quote:

Originally Posted by Matt Arnold

Again, keep in mind MLB stats are for the universe, whereas if you are running regressions, that digs to a different set. The average LHB contact vs R generally is higher than RHB vs R, as is the average LHP stuff vs L compared to RHP stuff vs L, which is why the balance can shift. So the change was mostly about balancing that and not double-penalizing a certain overall talent level.

I get that. This test controls for every pitcher rating and gets into that difference between RHB and LHB split for pitchers that you mention. That's the whole point of looking at bats_1, and bats_2 in the models I posted.

What I am saying is that a LHB has a .010 "hole" to dig out of simply because they're a LHB against RHP when you look at wOBA. That's the equivalent to about 30 points in BABIP, 18 points in Power, and 30 points in Eye against RHP (1 - 200 scale). A LHB could have the exact same ratings as a RHB against RHP, and would be in that .010 hole relative to a RHB based on the data from the engine.

If the desire was to reduce that double-penalty, what I've been able to gather suggests you've gone too far in the other direction. I posted the data from the first test directly to the github for anyone to look at and do with as they please. But if the desired result is being achieved, it is what it is.
