Yeah, that's the problem I'm running into. But I'm sure you can get close estimates based on the statistics. I'll give you one example I'm working on: the 2002 Yankees. Although the lineup was stacked 1-9, my biggest interest is Giambi and Soriano. Their respective AVG and HR totals (Giambi: .314/41, Soriano: .300/39) are for all intents and purposes interchangeable. But Torre batted Soriano leadoff in over 140 games and Giambi mostly cleanup. The only reasonable explanation for Soriano batting leadoff was his speed; he led the AL in SBs with 41. Other than that he had no business batting leadoff. He walked 32 times. Giambi's OBP was literally over 100 points higher than Soriano's. That's a huge gap of 10 percentage points: Giambi reached base about 10 more times per 100 PAs.
Added to that, Soriano had 60 more PAs than Giambi by virtue of leading off. If Giambi led off, he'd have those 60 extra PAs, plus he would have made 60 fewer outs, probably giving him an additional 40 PAs. It would also give the 4 and 5 hitters 2 or 3 dozen additional PAs. Now, since we already know every player's HRs per PA and other important stats, it would be easy to prorate those numbers to fit the higher PA totals, plus the added production of having Giambi on base almost 100 more times than Soriano.
Batting Soriano 5th would still give you 40 HRs, so no lost production there.
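To make the prorating idea concrete, here is a minimal Python sketch. It assumes a player's per-PA rates stay the same no matter where he bats, which is a simplification. The function names are made up for illustration, and the PA totals of 700 and 760 are round placeholders, not real 2002 figures; the actual numbers are on the Baseball-Reference team page.

```python
def prorate(count: float, pa: int, new_pa: int) -> float:
    """Scale a counting stat (HR, BB, ...) from `pa` plate appearances
    to `new_pa`, assuming the per-PA rate does not change."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    return count / pa * new_pa


def extra_times_on_base(obp_gap: float, pa: int) -> float:
    """Extra times on base (equivalently, outs avoided) over `pa` plate
    appearances for a hitter whose OBP is higher by `obp_gap`."""
    return obp_gap * pa


# Placeholder PA totals (assumptions, NOT real 2002 data).
current_pa = 700
leadoff_pa = 760  # the roughly 60 extra PAs that come with leading off

# 41 HR is Giambi's 2002 total from the post; the rest is illustrative.
print("Prorated HR:", round(prorate(41, current_pa, leadoff_pa), 1))

# A 100-point OBP gap is 0.100 in decimal form.
print("Extra times on base:", round(extra_times_on_base(0.100, current_pa), 1))
```

This only rescales one player's line. It says nothing about how the extra baserunners turn into runs, which is where a simulation comes in.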
There obviously would be no way to know how this would affect opposing pitchers' pitch counts, whether Giambi was capable of hitting leadoff, and a million other variables... but I'm sure there is a way to simulate a season by plugging in the numbers. I remember Bill James ran simulations to see if intentionally walking Ruth the entire '21 season would have produced fewer or more runs. So I know it's possible. I just don't know where to start.
I appreciate your input, and if you could pass the idea along to anyone interested in and capable of such an experiment, I'd love to give it a go.
Here's a link to the 2002 Yankees' team page:
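One possible place to start is a bare-bones Monte Carlo simulation, sketched below in Python. This is not Bill James's method, just a simple stand-in under heavy assumptions: every PA is an independent draw from the batter's per-PA rates, runners advance exactly as many bases as the hit is worth, and there are no steals, double plays, errors, or pitcher effects. The `Batter` class, the function names, and every rate in the example lineups are invented placeholders, not real 2002 stats.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Batter:
    name: str
    # Per-PA probabilities; whatever is left over is an out.
    bb: float
    single: float
    double: float
    triple: float
    hr: float


BASES_GAINED = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}


def plate_appearance(b: Batter, rng: random.Random) -> str:
    """Draw one outcome from the batter's per-PA probabilities."""
    r = rng.random()
    for outcome, p in (("BB", b.bb), ("1B", b.single), ("2B", b.double),
                       ("3B", b.triple), ("HR", b.hr)):
        if r < p:
            return outcome
        r -= p
    return "OUT"


def advance(bases, outcome):
    """bases = [first, second, third] as bools. Returns (new_bases, runs)."""
    if outcome == "BB":
        first, second, third = bases
        runs = 1 if (first and second and third) else 0
        # Only forced runners move up on a walk.
        return [True, second or first, third or (first and second)], runs
    n = BASES_GAINED[outcome]
    runs = 0
    new = [False, False, False]
    for i, occupied in enumerate(bases):
        if occupied:
            if i + n >= 3:
                runs += 1          # runner crosses the plate
            else:
                new[i + n] = True  # runner moves up n bases
    if n == 4:
        runs += 1                  # batter scores on a home run
    else:
        new[n - 1] = True          # batter takes his base
    return new, runs


def simulate_season(lineup, games=162, seed=0):
    """Return (runs per game, PAs by lineup slot) for nine-inning games."""
    rng = random.Random(seed)
    total_runs = 0
    pa = [0] * len(lineup)
    for _ in range(games):
        slot = 0  # the lineup restarts each game but carries over innings
        for _inning in range(9):
            outs, bases = 0, [False, False, False]
            while outs < 3:
                pa[slot] += 1
                outcome = plate_appearance(lineup[slot], rng)
                if outcome == "OUT":
                    outs += 1
                else:
                    bases, runs = advance(bases, outcome)
                    total_runs += runs
                slot = (slot + 1) % len(lineup)
    return total_runs / games, pa


def mean_runs(lineup, seasons=20):
    """Average runs per game over several simulated seasons."""
    return sum(simulate_season(lineup, seed=s)[0] for s in range(seasons)) / seasons


# Placeholder hitters with invented rates (NOT real 2002 numbers).
patient = Batter("high-OBP slugger", bb=0.16, single=0.14, double=0.05, triple=0.0, hr=0.06)
swinger = Batter("low-walk slugger", bb=0.03, single=0.17, double=0.07, triple=0.003, hr=0.055)
average = Batter("average hitter", bb=0.08, single=0.16, double=0.05, triple=0.005, hr=0.03)

swinger_leads_off = [swinger, average, average, patient] + [average] * 5
patient_leads_off = [patient, average, average, average, swinger] + [average] * 4

print("Low-walk slugger leading off:", round(mean_runs(swinger_leads_off), 3))
print("High-OBP slugger leading off:", round(mean_runs(patient_leads_off), 3))
```

A single simulated season is noisy, so the comparison averages 20 seeds; more seasons would tighten it further. To try the actual 2002 question, swap the placeholder rates for each player's real per-PA rates from the team page.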
https://www.baseball-reference.com/teams/NYY/2002.shtml