Old 03-16-2021, 12:08 AM   #5
scipper
Fielding
TL;DR Range is king. We’ll dive into more catcher effects in the next article (which will make catcher the most valuable). For now, SS and CF data matter a ton.

Defense is one of the hardest stats to figure out and to find data for. Ideally, we'd like to calculate how many runs a player saved. There's a problem: any hitter can be compared to any other hitter, but a defender should only be compared to other defenders at the same position. With regular league pulls, this is hard to figure out. We can cheat by using tournament data.

In tournaments, if a player starts with experience at only 1 position and has innings played in the field, they played those innings at that position. Rarely, they'll play at a position where they have no experience, but we'll have to ignore that. If we download tourney data, we can analyze it like we do for batters, looking only at single-position players. Once we've found them, we can try to calculate defensive stats for each player. Important caveat: this requires a lot of tournament data, all from the same level. Mixing levels could throw off our averages. I recommend pulling 30-40 different individual open tourneys with no modifications (historical/live/…). Pull them from around the same time (so the meta doesn’t change much). When you enter one of these tourneys, try to make sure each player you start is a 1-position player, especially at positions like LF/RF/3B where people often start multi-position players. Figuring out a better way to categorize positions or to pull stats would give a major edge.
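The single-position filter described above could be sketched roughly like this. It's only an illustration: the record layout (`position_ratings`, `innings`, `name`) is a made-up stand-in for whatever your tournament export actually contains, and a rating of 0 is assumed to mean "no experience at that position".

```python
def single_position(player):
    """Return the player's only rated position, or None if they have zero or several.

    Assumes (hypothetically) that a rating of 0 means no experience there.
    """
    rated = [pos for pos, rating in player["position_ratings"].items() if rating > 0]
    return rated[0] if len(rated) == 1 else None


def group_single_position_players(players):
    """Group players who fielded at least some innings by their one rated position."""
    groups = {}
    for player in players:
        pos = single_position(player)
        # Skip multi-position players and anyone who never took the field.
        if pos is not None and player["innings"] > 0:
            groups.setdefault(pos, []).append(player)
    return groups


# Toy records for illustration only, not real tournament data.
sample = [
    {"name": "A", "innings": 50.0, "position_ratings": {"SS": 80, "2B": 0}},
    {"name": "B", "innings": 40.0, "position_ratings": {"SS": 70, "3B": 60}},
]
print(sorted(group_single_position_players(sample)))
```

Player B is dropped because there's no way to tell from the totals how the innings were split between SS and 3B.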

We use the individual ratings because they’re worth more than the main defensive rating. Generally, 1 point of range is worth more than 1 point of error rating, and we want to figure out how each maps to run outcomes. While a player is training at a position, any formula we generate won’t map to the expected outcome. There have been anecdotes of players slumping while training a new position, but I don’t have any data to back that up.
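One way to compare what a point of range is worth against a point of error rating is an ordinary least-squares fit with both ratings as predictors. The sketch below is a plain two-predictor OLS written out by hand. It is not necessarily what the linked code does, and the outcome column (something like outs above average per 162 games) is an assumption.

```python
def fit_two_ratings(rows):
    """Least-squares fit of outcome = intercept + a * range + b * error.

    rows: iterable of (range_rating, error_rating, outcome) tuples.
    Returns (intercept, range_coef, error_coef).
    """
    rows = list(rows)
    n = len(rows)
    mean_r = sum(r for r, _, _ in rows) / n
    mean_e = sum(e for _, e, _ in rows) / n
    mean_y = sum(y for _, _, y in rows) / n

    # Centered sums of squares and cross-products.
    s_rr = sum((r - mean_r) ** 2 for r, _, _ in rows)
    s_ee = sum((e - mean_e) ** 2 for _, e, _ in rows)
    s_re = sum((r - mean_r) * (e - mean_e) for r, e, _ in rows)
    s_ry = sum((r - mean_r) * (y - mean_y) for r, _, y in rows)
    s_ey = sum((e - mean_e) * (y - mean_y) for _, e, y in rows)

    det = s_rr * s_ee - s_re ** 2
    if det == 0:
        raise ValueError("ratings are collinear; cannot separate their effects")

    # Solve the 2x2 normal equations with Cramer's rule.
    range_coef = (s_ry * s_ee - s_ey * s_re) / det
    error_coef = (s_ey * s_rr - s_ry * s_re) / det
    intercept = mean_y - range_coef * mean_r - error_coef * mean_e
    return intercept, range_coef, error_coef
```

Run it separately per position. If the range coefficient comes out larger than the error coefficient, that matches the "range is worth more" rule of thumb for that position.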

Given this data, we can look at 2 things. The first is ZR per inning. ZR is good because it is already translated to runs saved and adjusted per position. If we know the number of runs per win, we can translate it into how many wins a player added. The downside is that the way OOTP calculates ZR is unknown. They may not actually be using the best way to calculate runs saved.

The other way is to look at the actual ball in zone data. OOTP tells us:
1. How many total balls passed through a defender's zone.
2. How hard it was to get to the ball (routine play, likely, even, unlikely, very unlikely, and impossible).
3. Whether they made it to the ball.

Using these combined stats, we can calculate a play%. We can also figure out roughly how many outs a ball entering the zone generates. This is all straightforward so far. Unfortunately, there have been reports that range affects total balls in the zone. This means we can’t expect the same number of balls in the zone per player - it depends on their range. That throws off play%, because a higher-range player could have more unlikely balls in their zone: they could be converting “impossible” zone balls into “unlikely” ones.

To account for this, I first calculated a normalized play%, then total balls in zone, and finally outs above average based on both. With outs above average (per 162 games), we can translate to WAR if we know how outs convert to runs (and runs to wins).
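Here is a rough sketch of the outs-above-average idea. It's one simple way to normalize (compare each player to the position-wide conversion rate in each difficulty bucket), not necessarily what calculate_defensive_stats.py does. The record layout and the `runs_per_out` and `runs_per_win` defaults are placeholder assumptions you'd want to fit for your own league.

```python
BUCKETS = ("routine", "likely", "even", "unlikely", "very_unlikely", "impossible")


def league_rates(players):
    """Conversion rate per difficulty bucket across all players at one position."""
    rates = {}
    for bucket in BUCKETS:
        chances = sum(p["chances"][bucket] for p in players)
        made = sum(p["made"][bucket] for p in players)
        rates[bucket] = made / chances if chances else 0.0
    return rates


def outs_above_average(player, rates, games=162):
    """Outs made beyond what an average defender would convert on the same
    chances, scaled to `games` nine-inning games."""
    expected = sum(player["chances"][b] * rates[b] for b in BUCKETS)
    actual = sum(player["made"][b] for b in BUCKETS)
    return (actual - expected) * (games * 9) / player["innings"]


def outs_to_war(oaa, runs_per_out=0.75, runs_per_win=10.0):
    """Convert outs above average to wins. Both defaults are assumed placeholders."""
    return oaa * runs_per_out / runs_per_win
```

Because expected outs are computed from each player's own chances, this version doesn't credit a rangy player for turning “impossible” balls into “unlikely” ones. Comparing total balls in zone per inning across players is a separate step.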

There's no positional adjustment for WAR yet, so in the raw data RF/LF look like they're worth more than CF. This is because the standards are higher for CF, so the replacement cost is higher. Generally, we’d expect CF/SS to be the most important WAR-wise, worth about 2-3 wins over the competition. At high levels, everyone is starting some of the best defensive players.

For now, we’ll ignore the effect of catchers on pitchers.

See updated code here, especially the calculate_defensive_stats.py file. There are a lot of intricacies in the logic that you may or may not decide to remove.

Catcher Defense
TL;DR Catchers have a massive effect on CERA. Catcher is the most important defensive position in the game.

In the previous article, we looked at how players affect balls in play. 7 out of the 9 positions have their only defensive effect here. Catchers do more than that, though: they affect how well a pitcher pitches. We need to look at how well catchers turn hits into outs and walks into strikes, and we do this by focusing on CERA. If we do a regression on CERA in OOTP 21 (this changes per version), 1 point of C ability is worth about 0.01 CERA. Over a full season, that means 100 points of Catcher ability are worth about 10 WAR! That’s an insane number. The estimate can’t be perfect, because the best teams pair the best pitchers with the best catchers and then play teams who have neither, so some of the pitchers' quality ends up credited to the catcher. Still, in general we can see how valuable this is.
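The CERA-to-WAR arithmetic can be sanity-checked with a few lines. The 0.01 CERA per point comes from the regression above. The 900 innings caught and 10 runs per win are assumptions of mine, picked because they reproduce the roughly 10 WAR figure. Plug in your own league's numbers.

```python
def catcher_runs_saved(ability_points, cera_per_point=0.01, innings_caught=900.0):
    """Runs saved over a season from a CERA improvement.

    CERA is runs per 9 innings, so the ERA drop is multiplied by innings / 9.
    innings_caught is an assumed workload, not a figure from the post.
    """
    era_drop = ability_points * cera_per_point
    return era_drop * innings_caught / 9.0


def runs_to_wins(runs, runs_per_win=10.0):
    """Convert runs to wins. 10 runs per win is an assumed rule of thumb."""
    return runs / runs_per_win


runs = catcher_runs_saved(100)  # 100 points of catcher ability
print(runs, runs_to_wins(runs))
```

Under those assumptions, 100 points is a 1.00 drop in CERA, which is about 100 runs over 900 innings, or about 10 wins.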

The other main stat that catchers can affect is stolen bases, not only by throwing runners out but also by deterring runners from attempting a steal in the first place. A higher catcher arm ability leads to a small increase in runners thrown out, but a large decrease in attempts. We can calculate how much a catcher “deters” stolen bases too.
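A minimal sketch of the deterrence calculation: compare a catcher's stolen bases and caught stealings to what a league-average catcher would allow over the same innings. The run values (+0.2 per steal, -0.4 per caught stealing) are common rule-of-thumb weights and the record layout is hypothetical. Neither comes from the linked code.

```python
def steal_runs(sb, cs, sb_run_value=0.2, cs_run_value=-0.4):
    """Runs the offense gains from its running game (assumed run values)."""
    return sb * sb_run_value + cs * cs_run_value


def running_game_summary(catcher, league):
    """Compare a catcher to league-average rates over the same innings.

    Both arguments are dicts with "innings", "sb" and "cs" totals.
    Returns (attempts_deterred, runs_saved).
    """
    scale = catcher["innings"] / league["innings"]
    expected_sb = league["sb"] * scale
    expected_cs = league["cs"] * scale

    # Fewer attempts than an average catcher would face means deterrence.
    attempts_deterred = (expected_sb + expected_cs) - (catcher["sb"] + catcher["cs"])
    runs_saved = steal_runs(expected_sb, expected_cs) - steal_runs(catcher["sb"], catcher["cs"])
    return attempts_deterred, runs_saved
```

Innings caught is a crude stand-in for steal opportunities. If your data has runners on first with second open, scale by that instead.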

A catcher's value is defensive, so remember to grab a strong defensive catcher first. See the updated code calculating this here.