Home | Webstore
Latest News: OOTP 25 Available - FHM 11 Available - OOTP Go! Available

Out of the Park Baseball 25 Buy Now!

  

Go Back   OOTP Developments Forums > Out of the Park Baseball 25 > OOTP Mods > OOTP Mods - Database Tools
Register Blogs FAQ Calendar Today's Posts Search

OOTP Mods - Database Tools Do you need to take a dump? SQL gurus welcome

Reply
 
Thread Tools
Old 12-13-2021, 03:53 PM   #1
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
A couple of quick MySQL tricks I thought I'd post

I'm realizing now that since I work with SQL a lot in my daily life, I kind of know my way about the syntax. So, I'd figure I'd use this space to come up with some reports that I plan on using and which you also might find come in handy.

Note that I'm using a MySQL DB called 'modern-ish-baseball'; you'd need to replace that with whatever you're calling your own DB to use this (and of course you'll need to have OOTP dump the relevant tables).

I'll start here with a report of a team's batters. The stats I included are your standard "baseball card" stats, no more, no less, and players are ordered by their primary position (i.e. the place they played the most at). As I'm running a temp table in the WHERE clause, this is not the most efficient query in the world but I'm running it against a 24 team league with around 25 years of history and it still only took about 2s to run.

The results:

Code:
abbr    first_name	last_name	g	ab	r	h	2B	3B	hr	rbi	bb	ibb	k	hpb	sh	sf	sb	cs	gdp	ba	    obp	    slg	    pos	g
NYY	    Jason	    Mooneyhan	116	435	54	114	18	1	10	61	29	2	34	2	0	11	0	0	17	0.262	0.929	0.377	2	114
NYY	    Khalil	    Tabb	    64	179	19	62	9	0	4	29	11	5	32	2	0	1	0	0	11	0.346	0.987	0.464	2	52
NYY	    Josh	    Paige	    3	3	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0.000	--  	0.000	2	2
NYY	    Justin	    Richens	    63	231	34	71	17	2	9	26	41	3	14	6	0	5	0	0	6	0.307	0.959	0.515	3	63
NYY	    Jose	    Ayala	    51	193	24	55	9	0	5	23	13	2	34	1	0	4	2	0	10	0.285	0.945	0.409	3	48
NYY	    John	    Chapman	    60	138	13	31	6	0	3	14	26	1	16	1	0	2	0	0	0	0.225	0.967	0.333	3	33
NYY	    Bobby	    Berg	    12	23	7	8	2	0	2	8	6	0	0	0	0	0	0	0	2	0.348	1.000	0.696	3	6
NYY	    Wing-fung	Yi	        105	383	44	117	17	6	6	45	67	5	29	2	0	3	12	15	4	0.305	0.984	0.428	4	103
NYY	    Elijah	    Patton	    38	92	10	19	4	0	0	12	9	3	13	0	0	2	0	1	5	0.207	0.933	0.250	4	17
NYY	    Chris	    Eckert	    16	33	4	7	1	0	0	3	4	2	7	0	0	0	0	0	2	0.212	1.000	0.242	4	7
NYY	    Nate	    Lieb	    3	9	1	2	0	0	1	2	2	1	3	0	0	0	0	0	0	0.222	1.000	0.556	4	3
NYY	    Tom	        Weiss	    128	423	62	115	12	2	17	69	60	3	38	7	1	3	0	0	8	0.272	0.978	0.430	5	108
NYY	    David	    Salinas	    49	169	14	36	5	0	1	18	15	0	12	1	0	2	2	2	8	0.213	0.963	0.260	5	45
NYY	    Ty	        Stover	    146	530	85	127	24	6	25	92	96	1	105	3	0	7	1	1	19	0.240	0.970	0.449	6	116
NYY	    Jason	    Wichert	    28	72	4	14	2	0	0	6	3	0	18	0	0	2	1	0	2	0.194	0.895	0.222	6	26
NYY	    Chris	    Grube	    10	28	0	1	0	0	0	1	2	1	1	0	0	0	0	0	0	0.036	1.000	0.036	6	9
NYY	    Ross	    Poynor	    118	389	51	115	13	4	8	39	24	4	74	0	3	2	9	12	4	0.296	0.965	0.411	7	62
NYY	    Dan	        Field	    69	287	32	85	17	0	5	30	23	4	24	3	0	0	0	0	5	0.296	1.000	0.408	7	68
NYY	    Ramiro	    Palencia	24	64	7	11	3	0	0	2	17	0	13	1	0	0	1	1	0	0.172	1.000	0.219	7	17
NYY	    Matt	    Brettell	54	52	5	12	2	0	2	6	2	0	15	0	1	0	0	0	0	0.231	0.933	0.385	7	6
NYY	    Willie	    Morales	    20	29	3	4	2	0	1	3	2	0	7	0	0	0	0	0	0	0.138	1.000	0.310	7	4
NYY	    Micah	    MacMillan	109	445	59	110	25	10	5	36	40	2	46	7	1	1	19	10	1	0.247	0.987	0.382	8	108
NYY	    Marc	    Ash	        62	178	21	44	8	1	1	14	19	0	45	0	0	1	4	5	2	0.247	0.984	0.320	8	33
NYY	    Frank	    Meneses	    117	428	63	95	9	8	12	45	81	2	95	4	0	1	11	7	6	0.222	0.994	0.364	9	111
NYY	    Eric	    Green	    33	74	7	14	2	0	1	8	2	0	14	2	0	0	1	0	1	0.189	1.000	0.257	9	16
NYY	    Arturo	    Ganzalez	8	27	2	6	0	0	0	1	0	0	2	0	0	0	2	0	1	0.222	1.000	0.222	9	5
And the SQL:

Code:
SET @year = 1969, @team_abbr='NYY';

SELECT tm.abbr,
	plr.first_name,
    plr.last_name,
    bat.g,
    bat.ab,
    bat.r,
    bat.h,
    bat.d as 2B,
    bat.t as 3B,
    bat.hr,
    bat.rbi,
    bat.bb,
    bat.ibb,
    bat.k,
    bat.hp as hpb,
    bat.sh,
    bat.sf,
    bat.sb,
    bat.cs,
    bat.gdp,
    round(bat.h / bat.ab, 3) as ba,
    round((bat.h + bat.bb + bat.hp) / (bat.h + bat.bb + bat.hp + bat.sh + bat.sf), 3) as obp,
    round((bat.h + bat.d + 2 * bat.t + 3 * bat.hr) / ab, 3) as slg,
    fld.position,
    fld.g
FROM `modern-ish-baseball`.teams tm 
	LEFT JOIN players_career_batting_stats bat
		ON bat.team_id = tm.team_id
	LEFT JOIN players plr
		ON plr.player_id = bat.player_id
    LEFT JOIN players_career_fielding_stats fld
		ON fld.player_id = plr.player_id and fld.year = @year and bat.team_id = fld.team_id
where tm.abbr = @team_abbr
	AND bat.year = @year
    AND bat.split_id = 1
    AND fld.ip = (select max(ip) from `modern-ish-baseball`.players_career_fielding_stats f
	WHERE f.year = @year and f.team_id = tm.team_id and f.player_id = plr.player_id)
    AND fld.position > 1
ORDER BY position, bat.pa desc
You could pretty easily drop this into a stored procedure and then just input the year and team abbreviation above to get back the relevant data. Also, when I run this myself I'll probably get rid of the team abbreviation from the list and maybe add handedness info.... the point is, MySQL gives you the wherewithal to do this.
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 12-13-2021, 03:55 PM   #2
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
Just a quick note on the formatting: I'm using the MySQL workbench to churn this out. I believe if you use the MySQL CLI (command line interpreter) it will push out a little text grid that you should be able to drop straight into a code block without formatting (or, you know, you do whatever if you want to use this to export to an API or a CSV file, etc.).
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 12-13-2021, 04:23 PM   #3
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
Here's the career totals of Henry Riggs, a guy who is basically Hank Aaron in my league, with the SQL I used to bring him back:

Code:
year	team	g	ab	r	h	2B	3B	hr	rbi	bb	ibb	k	hpb	sh	sf	sb	cs	gdp	ba	obp	slg
1954	ATL	70	243	23	66	10	1	4	29	21	2	28	1	0	4	1	0	6	0.272	0.327	0.370
1955	ATL	84	288	31	83	14	1	13	48	23	1	36	2	0	6	0	0	4	0.288	0.339	0.479
1956	ATL	125	454	70	142	21	4	20	64	58	2	32	0	0	4	1	0	13	0.313	0.388	0.509
1957	ATL	152	552	99	182	38	1	23	79	65	4	30	0	0	7	0	0	14	0.330	0.396	0.527
1958	ATL	151	570	98	192	29	2	27	94	77	3	42	0	0	5	0	0	10	0.337	0.413	0.537
1959	ATL	153	577	96	188	28	1	37	96	72	6	47	2	0	4	0	0	11	0.326	0.399	0.570
1960	ATL	140	499	79	173	25	2	24	72	74	12	20	0	0	3	1	1	9	0.347	0.429	0.549
1961	ATL	142	527	94	175	33	1	28	88	76	7	23	1	0	6	2	2	20	0.332	0.413	0.558
1962	ATL	148	540	103	188	26	3	33	92	76	7	21	1	0	4	1	0	12	0.348	0.427	0.591
1963	ATL	113	423	87	144	20	0	37	104	57	2	15	1	0	6	0	2	15	0.340	0.415	0.650
1964	ATL	85	314	63	108	17	1	16	41	39	2	20	1	0	1	0	1	12	0.344	0.417	0.557
1965	ATL	159	558	108	177	17	1	45	116	95	9	29	1	0	5	1	1	16	0.317	0.414	0.593
1966	ATL	157	573	111	191	21	0	44	119	88	9	38	0	0	4	1	4	14	0.333	0.420	0.600
1967	ATL	127	465	79	136	17	0	29	72	69	12	40	1	0	5	0	0	12	0.292	0.381	0.516
1968	ATL	157	563	86	149	22	4	25	77	74	6	56	3	0	6	1	0	15	0.265	0.350	0.451
1969	ATL	112	397	78	131	15	0	32	79	79	7	26	1	0	5	0	0	8	0.330	0.438	0.610
Totals		2075	7543	1305	2425	353	22	437	1270	1043	91	503	15	0	75	9	11	191	0.321	0.401	0.548
Code:
set @player_id = 7620;

SELECT 	
	bat.year,
    tm.abbr as team,
	bat.g,
    bat.ab,
    bat.r,
    bat.h,
    bat.d as 2B,
    bat.t as 3B,
    bat.hr,
    bat.rbi,
    bat.bb,
    bat.ibb,
    bat.k,
    bat.hp as hpb,
    bat.sh,
    bat.sf,
    bat.sb,
    bat.cs,
    bat.gdp,
    round(bat.h / bat.ab, 3) as ba,
    round((bat.h + bat.bb + bat.hp) / (bat.pa), 3) as obp,
    round((bat.h + bat.d + 2 * bat.t + 3 * bat.hr) / ab, 3) as slg
	FROM `modern-ish-baseball`.players_career_batting_stats bat
    LEFT JOIN teams tm
		on tm.team_id = bat.team_id
	where player_id = @player_id
		and tm.league_id = 100
		and split_id = 1
		
UNION
SELECT 
	"Totals",
	'',
	sum(bat.g),
    sum(bat.ab),
    sum(bat.r),
    sum(bat.h),
    sum(bat.d),
    sum(bat.t),
    sum(bat.hr),
    sum(bat.rbi),
    sum(bat.bb),
    sum(bat.ibb),
    sum(bat.k),
    sum(bat.hp),
    sum(bat.sh),
    sum(bat.sf),
    sum(bat.sb),
    sum(bat.cs),
    sum(bat.gdp),
    round((sum(bat.h)) / (sum(bat.ab)), 3),
    round((sum(bat.h) + sum(bat.bb) + sum(bat.hp)) / (sum(bat.pa)), 3) as obp,
    round((sum(bat.h) + sum(bat.d) + 2 * sum(bat.t) + 3 * sum(bat.hr)) / sum(bat.ab), 3) as slg
FROM `modern-ish-baseball`.players_career_batting_stats bat
    LEFT JOIN teams tm
		on tm.team_id = bat.team_id
	where player_id = @player_id
		and tm.league_id = 100
		and split_id = 1
GROUP BY player_id
The player_id value might be a little tricky to get; you'd need to look the player in question up in the players table and find their ID. This should list teams in the chronological order they played in since it's just going by the order they were added to the DB.
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 12-13-2021, 04:59 PM   #4
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
...and the team pitching breakdown, sorted by IP. As before I used the layout, more or less, of the Great American Baseball Stat Book from the 80s, although here I stuck decisions and saves in the front rather than the back. This team by the way is the Atlanta Braves, who just clinched the playoffs in my league for the first time in franchise history.

Code:
			w	l	Pct	Sv	era	g	gs	cg	sho	gf	ip	bfp	h	r	er	hr	sh	sf	hp	bb	ibb	so	wp	bk	Hper9	HRper9	BBer9	Kper9
George	House		21	8	0.724	0	2.70	34	34	11	3	0	256.2	1038	251	87	77	17	15	8	15	49	5	175	3	0	8.8	0.6	1.7	6.1
Trevon	Dean		21	5	0.808	0	3.58	31	31	9	5	0	233.2	959	207	101	93	15	8	7	10	81	2	112	7	0	8.0	0.6	3.1	4.3
Felix	Carranza	9	15	0.375	0	4.38	32	31	6	3	1	213.2	909	227	125	104	21	20	3	11	57	4	116	9	0	9.6	0.9	2.4	4.9
Kevin	Pennock		18	7	0.720	0	2.98	30	25	3	0	3	190.1	798	181	71	63	10	9	3	8	61	2	84	10	0	8.6	0.5	2.9	4.0
John	Winn		7	4	0.636	25	1.96	62	0	0	0	55	87.1	342	67	21	19	5	5	3	1	20	4	95	2	0	6.9	0.5	2.1	9.8
Jake	Cari		4	3	0.571	0	3.61	29	9	2	1	8	77.1	322	65	33	31	5	5	2	1	37	5	43	1	0	7.6	0.6	4.3	5.0
Seth	Cokely		3	2	0.600	0	3.83	15	5	2	0	5	51.2	220	52	24	22	4	4	1	0	25	1	12	1	0	9.1	0.7	4.4	2.1
Andres	Rivera		1	1	0.500	3	5.55	44	0	0	0	23	48.2	206	53	31	30	9	1	4	2	11	3	32	1	0	9.8	1.7	2.0	5.9
Chris	Wilson		2	4	0.333	0	5.63	12	4	0	0	2	38.1	163	42	24	24	7	2	3	0	18	0	20	0	0	9.9	1.6	4.2	4.7
Nate	Dolezal		3	2	0.600	0	6.10	11	3	0	0	5	31.0	137	30	28	21	5	2	3	1	16	2	22	1	0	8.7	1.5	4.6	6.4
Scott	Richey		0	1	0.000	0	5.60	20	0	0	0	8	27.1	123	34	19	17	2	1	3	0	12	0	14	1	1	11.2	0.7	4.0	4.6
Ethan	Fair		0	3	0.000	0	6.23	6	4	0	0	0	26.0	121	30	23	18	5	2	2	2	16	0	11	1	0	10.4	1.7	5.5	3.8
Tyler	Painter		1	0	1.000	1	3.97	14	0	0	0	6	22.2	111	23	15	10	1	0	2	1	23	0	17	0	0	9.1	0.4	9.1	6.8
Mickael	Diot		0	3	0.000	0	4.29	3	3	0	0	0	21.0	88	25	11	10	4	1	2	1	5	3	13	2	0	10.7	1.7	2.1	5.6
Roger	Evans		1	2	0.333	0	4.71	15	1	0	0	2	21.0	99	21	13	11	0	1	0	2	13	1	13	2	0	9.0	0.0	5.6	5.6
Tony	Morales		0	1	0.000	0	5.50	5	2	0	0	1	18.0	76	13	11	11	5	0	0	0	8	0	11	0	0	6.5	2.5	4.0	5.5
Paul	Hibbert		0	1	0.000	0	5.71	7	1	0	0	1	17.1	70	15	12	11	4	0	0	0	3	0	15	1	0	7.8	2.1	1.6	7.8
Mike	Lambert		0	1	0.000	0	7.50	1	1	0	0	0	6.0	30	9	5	5	0	0	1	0	4	0	1	1	0	13.5	0.0	6.0	1.5
Victor	Reyes		1	0	1.000	0	9.95	5	0	0	0	2	6.1	27	7	7	7	1	0	1	0	2	0	2	0	0	9.9	1.4	2.8	2.8
Damian	Seja		0	0		0	3.00	1	1	0	0	0	3.0	11	3	1	1	0	0	0	0	0	0	3	0	0	9.0	0.0	0.0	9.0
Carlos	Ortiz		0	0		0	0.00	1	0	0	0	0	2.0	7	0	0	0	0	0	0	0	0	0	0	0	0	0.0	0.0	0.0	0.0
Rick	Shattuck	0	0		0	0.00	1	0	0	0	0	0.2	2	1	0	0	0	0	0	0	0	0	0	0	0	13.5	0.0	0.0	0.0
Code:
SET @year = 1969, @team_abbr='ATL';

SELECT 
	plr.first_name,
    plr.last_name,
        pit.w,
    pit.l,
    round(pit.w / (pit.w + pit.l), 3) as Pct,
    pit.s as Sv,
    round((pit.er * 9) / (pit.ip + (pit.ipf / 3)), 2) as era,
	pit.g,
    pit.gs,
    pit.cg,
    pit.sho,
    pit.gf,
    concat(pit.ip, ".", pit.ipf) as ip,
    (pit.ab + pit.bb + pit.hp) as bfp,
    pit.ha as h,
    pit.r,
    pit.er,
    pit.hra as hr,
    pit.sh,
    pit.sf,
    pit.hp,
    pit.bb,
    pit.iw as ibb,
    pit.k as so,
    pit.wp,
    pit.bk,
    round(ha / (ip + ipf / 3) * 9,1) as Hper9,
    round(hra / (ip + ipf / 3) * 9,1) as HRper9,
    round(bb / (ip + ipf / 3) * 9,1) as BBer9,
    round(k / (ip + ipf / 3) * 9,1) as Kper9
    
FROM `modern-ish-baseball`.players_career_pitching_stats pit
LEFT JOIN players plr
	on plr.player_id = pit.player_id
LEFT JOIN teams tm
	on pit.team_id = tm.team_id
	WHERE year = @year
		and tm.abbr = @team_abbr
		and split_id = 1
order by pit.ip desc
I didn't have to get especially tricky with ordering or what have you so I did not. I guess the only slightly weird bit here is that there are two fields in the DB that comprise innings pitched: IP and "IPF", which stands for "innings pitched fraction". As baseball nomenclature goes, it's added onto the IP totals as {ip}.{ipf} and then it's treated as the third of an inning that it is in the calculations.

I know WHIP was super popular at around this time too so I considered adding it but... I never used it so I did not. The rates per 9 were not a part of the GABSB but I like them as a quick way to suss out peripherals so I added them, too.
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 12-13-2021, 05:13 PM   #5
TC Dale
Global Moderator
 
TC Dale's Avatar
 
Join Date: May 2002
Posts: 4,837
Thanks for the SQL lesson, I figure there will be some folks that will enjoy it.
TC Dale is offline   Reply With Quote
Old 12-13-2021, 05:23 PM   #6
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
Oh yeah, one thing I did consider adding in just to show how to do it was ERA-; however, the league is still in progress (although it's late September) and it doesn't look like the game passes down league totals until the season's over.
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 12-13-2021, 05:29 PM   #7
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
Here's a quick and dirty one for standings. Actually I'd love to put some more expanded data, like team runs scored and allowed, but as before this data won't get pushed into the DB until the season is over...

This is the American League West, home of more or less the only really close pennant race in my league (both NL teams have clinched and the Cleveland Indians lead the AL East by a relatively comfortable 5 games, although they still haven't quite clinched yet).

Code:
name			w	l	gb	pct
Minnesota Twins		88	66	0	0.5714
Chicago White Sox	86	68	2	0.5584
California Angels	78	76	10	0.5065
Kansas City Royals	67	87	21	0.4351
Seattle Pilots		64	90	24	0.4156
Oakland Athletics	47	107	41	0.3052
Code:
set @subleague = 0, @division=1;

SELECT concat(tm.name, ' ', tm.nickname) as name,
	tr.w,
    tr.l,
    tr.gb,
    tr.pct
	FROM `modern-ish-baseball`.teams tm
    LEFT JOIN team_record tr
		on tr.team_id = tm.team_id
	where
		tm.league_id = 100
		and tm.sub_league_id = @subleague
        and tm.division_id = @division
	order by gb
Here I didn't even need to do any calculations. There are ways to create a temp table with your list of subleagues and divisions and iterate over it to produce the standings but I don't really feel like doing that (haha) and TBH I think if you're designing an API or whatever the formatting is not as important as the data itself (for the latter, you could easily modify this to export the teams' subleague and division ids along with the names of those).

Oh, also, your league_id 100 team is going to be the first league you created. Unless you did something screwy with your league setup, you can probably get away with hardcoding that...
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 12-13-2021, 05:30 PM   #8
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
Quote:
Originally Posted by TC Dale View Post
Thanks for the SQL lesson, I figure there will be some folks that will enjoy it.
No problem! Honestly it's just SQL; people have been using this language for literally decades now and with a little bit of knowledge / desire to search Stack Overflow you can make it do a lot of things.
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 12-14-2021, 06:47 PM   #9
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
Okay, so next up are individual pitching career marks. I thought this would be harder than it wound up being, and had an idea that I was going to break things down a bit and try and demonstrate how you can develop around the limitations of SQL (the big one being, no real way to debug). However, I mathed the field I needed to math out (innings pitched) without having to break stuff down. I have an idea for breaking stuff apart next, however.

Anyway, this is Braves pitcher George House, who will at least be up for the NL Cy Young this season, along with the SQL:

Code:
ATL	1967	5	5	0.500	1	3.67	25	5	1	0	12	73.2	296	67	30	30	6	2	4	2	25	3	63	3	0	8.2	0.7	3.1	7.7
ATL	1968	16	7	0.696	0	1.99	31	31	5	0	0	231.0	899	178	62	51	18	11	8	8	49	5	184	9	0	6.9	0.7	1.9	7.2
ATL	1969	21	8	0.724	0	2.70	34	34	11	3	0	256.2	1038	251	87	77	17	15	8	15	49	5	175	3	0	8.8	0.6	1.7	6.1
Totals		42	20	0.677	1	2.53	90	70	17	3	12	561.1	2233	496	179	158	41	28	20	25	123	13	422	15	0	8.0	0.7	2.0	6.8
Code:
SET @player_id = 14787;

SELECT 
	tm.abbr,
    pit.year,
	pit.w,
    pit.l,
    round(pit.w / (pit.w + pit.l), 3) as Pct,
    pit.s as Sv,
    round((pit.er * 9) / (pit.ip + (pit.ipf / 3)), 2) as era,
	pit.g,
    pit.gs,
    pit.cg,
    pit.sho,
    pit.gf,
    concat(pit.ip, ".", pit.ipf) as ip,
    (pit.ab + pit.bb + pit.hp) as bfp,
    pit.ha as h,
    pit.r,
    pit.er,
    pit.hra as hr,
    pit.sh,
    pit.sf,
    pit.hp,
    pit.bb,
    pit.iw as ibb,
    pit.k as so,
    pit.wp,
    pit.bk,
    round(ha / (ip + ipf / 3) * 9,1) as Hper9,
    round(hra / (ip + ipf / 3) * 9,1) as HRper9,
    round(bb / (ip + ipf / 3) * 9,1) as BBer9,
    round(k / (ip + ipf / 3) * 9,1) as Kper9
    
FROM `modern-ish-baseball`.players_career_pitching_stats pit
LEFT JOIN players plr
	on plr.player_id = pit.player_id
LEFT JOIN teams tm
	on pit.team_id = tm.team_id
	WHERE pit.player_id = @player_id
		and split_id = 1
        and pit.league_id = 100
UNION
SELECT 
	'Totals',
    '',
	sum(pit.w),
    sum(pit.l),
    round(sum(pit.w) / (sum(pit.w) + sum(pit.l)), 3),
    sum(pit.s),
    round(sum(pit.er) / (sum(pit.ip) + sum(pit.ipf) / 3) * 9, 2), 
	sum(pit.g),
    sum(pit.gs),
    sum(pit.cg),
    sum(pit.sho),
    sum(pit.gf),
	concat(sum(ip) + floor(sum(ipf) / 3),'.', sum(ipf) % 3),
    sum(pit.ab) + sum(pit.bb) + sum(pit.hp),
    sum(pit.ha),
    sum(pit.r),
    sum(pit.er),
    sum(pit.hra),
    sum(pit.sh),
    sum(pit.sf),
    sum(pit.hp),
    sum(pit.bb),
    sum(pit.iw),
    sum(pit.k),
    sum(pit.wp),
    sum(pit.bk),
    round(sum(pit.ha) / (sum(pit.ip) + sum(pit.ipf) / 3) * 9,1),
    round(sum(pit.hra) / (sum(pit.ip) + sum(pit.ipf) / 3) * 9,1),
    round(sum(pit.bb) / (sum(pit.ip) + sum(pit.ipf) / 3) * 9,1),
    round(sum(pit.k) / (sum(pit.ip) + sum(pit.ipf) / 3) * 9,1)
    
FROM `modern-ish-baseball`.players_career_pitching_stats pit
LEFT JOIN players plr
	on plr.player_id = pit.player_id
LEFT JOIN teams tm
	on pit.team_id = tm.team_id
	WHERE pit.player_id = @player_id
		and split_id = 1
        and pit.league_id = 100
group by pit.player_id
The totals lines are basically figured the same as for hitters. By the way, if you wanted to add a "career highs" line, you can basically copy and paste the second half - everything from "union" on down - and replace sum() with max().

For innings pitched, if you'll remember from the team report we have two numbers: IP and IPF (fractions of an inning), and the way we lay out baseball IP is {inning}.{fraction}. This is easy enough to break down for one season but what about when you have a guy who's pitched 450 innings and 5 inning-fractions?

The math above show how but just to break it down....

Innings = Innings (duh) + fractions divided by 3, rounded down. In mySQL the floor() function takes a decimal number and just lops off everything after the decimal point (i.e. rounding down).

Fractions = the remainder of fractions divided by 3. In coding and I don't know, maybe in math too, we call the remainder the modulus and use the % key to figure it out. Also very useful for figuring average-and-over for you 19th century players!
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 12-14-2021, 07:41 PM   #10
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
All right, so with all that out of the way I wanted to try to get a little more complicated. First up, wouldn't it be cool if we could add a little asterisk next to all of the seasons where Henry Riggs made the All-Star Game? Okay though, how do we do this? It turns out, there is a league_history_all_star table that looks like it is pretty much exactly what you'd expect it to be. This time I'm going to list the SQL first before the results since I think reading through the SQL will make them make more sense:

Code:
SELECT * FROM `modern-ish-baseball`.league_history_all_star asg
	where asg.all_star = 7620;
7620, as you'll remember from before, is Henry Riggs' player_id. In this particular table, "all_star" is actually the player_id. Yell at Markus, not at me!

Code:
league	sub	year	aspos	all_star
100	1	1956	33	7620
100	1	1957	34	7620
100	1	1958	30	7620
100	1	1959	27	7620
100	1	1960	26	7620
100	1	1961	25	7620
100	1	1962	25	7620
100	1	1963	18	7620
100	1	1964	10	7620
100	1	1965	13	7620
100	1	1966	9	7620
100	1	1967	6	7620
100	1	1968	3	7620
These aren't the actual column names; I truncated a couple to make the layout make sense. Anyway, it looks like we do need to add a "league_id = 100" to our where clauses and I don't know what to do with that "all_star_position" field, but anyway, I do know that all we really need is the year.

Code:
SELECT 
	asg.year 
FROM `modern-ish-baseball`.league_history_all_star asg
where 
	asg.all_star = 7620
	and asg.league_id = 100;
Code:
year
1956
1957
1958
1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
As you can see, this is a pretty decent set. Riggs, who will certainly go into the Hall of Fame when his career is over, made the ASG every season from 1956 through 1968. He missed out this year because he was injured, though, and he also missed his first two seasons in the majors. So this works to test out because there will be several years where he'll need the asterisk but a couple where he won't. If he had nothing but ASGs in his career, you could accidentally write code that just always added an asterisk no matter what and... well, that would be pretty useless, wouldn't it?

In any case, this is the test subject I'm going to use. In real life programming you'd have several of these, you'd probably automate them (writing code that would evaluate the data you got back to confirm you're seeing what you're supposed to see), and then you'd have a QA department doing its own testing. But hey, for personal use, this should be fine...

All right, so this won't be the end result but just to test this out, let's tell our career batting report to only bring back years in which Riggs made the All Star Game... actually, no, first I'm going to get even simpler.

I feel like it's best when you're doing relatively complicated things like this to break things off into the smallest pieces you can, test the tiny increment you did, go ahead the next little increment test it to make sure that *that* works, and so on and so forth. This may sound kind of slow but trust me, this is way, way better than writing a whole bunch of code you think works only to find out that you made some very basic error in step 3 out of 50.

With SQL it's even worse because you can't really step through your code to figure this out, but I do the equivalent of this in "normal" code all the time. If anything it winds up speeding things up, in part because if you really get to thinking about breaking things down into the tiniest components you can, you free your brain to only having to have one small thing to think about instead of a much larger picture.

All of that is to say, let's bring back only... the year 1968. Let's just do one year for now; multiple years are a little tricky to hardcode, and we'll be able to bootstrap the final deal off of this anyway. I won't copy and paste the whole thing because y'all don't need to read all that but here's the relevant portion (there "where"):

Code:
	where player_id = @player_id
		and tm.league_id = 100
		and split_id = 1
		and bat.year in ( select  1968)
Code:
year	team	g	ab	r	h	2B	3B	hr	rbi	bb	ibb	k	hpb	sh	sf	sb	cs	gdp	ba	obp	slg
1968	ATL	157	563	86	149	22	4	25	77	74	6	56	3	0	6	1	0	15	0.265	0.350	0.451
Totals		2075	7543	1305	2425	353	22	437	1270	1043	91	503	15	0	75	9	11	191	0.321	0.401	0.548
The important part there is that bit where I'm saying "bat.year in (more code)". That's basically saying, "if you can find the year in this set, include it in the report". In this case we only have one hardcoded year. It's not very useful! But... what if we took that code we were just working on - that block inside of the parentheses just wants a select statement - and asked the report to use that instead?

Code:
	where player_id = @player_id
		and tm.league_id = 100
		and split_id = 1
		and bat.year in ( SELECT 
							asg.year 
							FROM `modern-ish-baseball`.league_history_all_star asg
							where 
								asg.all_star = @player_id
								and asg.league_id = 100
						)
Code:
year	team	g	ab	r	h	2B	3B	hr	rbi	bb	ibb	k	hpb	sh	sf	sb	cs	gdp	ba	obp	slg
1956	ATL	125	454	70	142	21	4	20	64	58	2	32	0	0	4	1	0	13	0.313	0.388	0.509
1957	ATL	152	552	99	182	38	1	23	79	65	4	30	0	0	7	0	0	14	0.330	0.396	0.527
1958	ATL	151	570	98	192	29	2	27	94	77	3	42	0	0	5	0	0	10	0.337	0.413	0.537
1959	ATL	153	577	96	188	28	1	37	96	72	6	47	2	0	4	0	0	11	0.326	0.399	0.570
1960	ATL	140	499	79	173	25	2	24	72	74	12	20	0	0	3	1	1	9	0.347	0.429	0.549
1961	ATL	142	527	94	175	33	1	28	88	76	7	23	1	0	6	2	2	20	0.332	0.413	0.558
1962	ATL	148	540	103	188	26	3	33	92	76	7	21	1	0	4	1	0	12	0.348	0.427	0.591
1963	ATL	113	423	87	144	20	0	37	104	57	2	15	1	0	6	0	2	15	0.340	0.415	0.650
1964	ATL	85	314	63	108	17	1	16	41	39	2	20	1	0	1	0	1	12	0.344	0.417	0.557
1965	ATL	159	558	108	177	17	1	45	116	95	9	29	1	0	5	1	1	16	0.317	0.414	0.593
1966	ATL	157	573	111	191	21	0	44	119	88	9	38	0	0	4	1	4	14	0.333	0.420	0.600
1967	ATL	127	465	79	136	17	0	29	72	69	12	40	1	0	5	0	0	12	0.292	0.381	0.516
1968	ATL	157	563	86	149	22	4	25	77	74	6	56	3	0	6	1	0	15	0.265	0.350	0.451
Totals		2075	7543	1305	2425	353	22	437	1270	1043	91	503	15	0	75	9	11	191	0.321	0.401	0.548
I hope you can see what I mean by bite-sized chunks now! Just that where clause is now saying, "where the player is Henry Riggs, and where it's the Major Leagues, and for total splits only (no lefty/righty stuff... and... from years where another table says Henry Riggs was in the All Star Game". That's kind of complicated just to spit out, and the way you have to do it with SQL is with that nested SELECT statement, but as you can see, if you just do it one little step at a time, it's really not complicated at all.

Anyway though, we don't want to bring back *only* years where Riggs made the ASG, we want to add a star to the end of the year where that's the case. So, let's go ahead and yoink that nested SELECT from the WHERE clause and go up to at-bats. Adding an asterisk is pretty simple and we've basically done something similar with innings pitched:

Code:
	concat(bat.year, '*') as year,
I'm not going to return show the entire data set; suffice it to say, every year has an asterisk at the end now. Also not useful but it's a start.

So... in order to make the asterisk appear on some items and not others, we've got to use an IF() statement. I looked this up on w3schools - I work with MS-SQL, not mySQL, and sometimes the syntax is a little different - and this is the format you have to use:

SELECT IF(500<1000, "YES", "NO");

So... anything inside of parentheses, we can nest. Again, do NOT attempt to wrap your brain around the entire thing! Anyway, let's add an if statement that's always true into that concat() statement above to make sure everything works right...

Code:
	concat(bat.year, 
		if(true, "*", "")) as year,
Yep, as expected, it does the same thing as the last bit of code does.

NOW FOR THE BIG MONSTER REVEAL... let's take that statement we added to the where clause - all by itself, it essentially returns a "true" or "false" - and see what happens now!

Code:
	concat(bat.year, 
			if(bat.year in (SELECT 
							asg.year 
							FROM `modern-ish-baseball`.league_history_all_star asg
							where 
								asg.all_star = @player_id
								and asg.league_id = 100
							),
				"*", "")
            ) as year,
    tm.abbr as team,
LA REVEAL MAGNIFICO

Code:
year	team	g	ab	r	h	2B	3B	hr	rbi	bb	ibb	k	hpb	sh	sf	sb	cs	gdp	ba	obp	slg
1954	ATL	70	243	23	66	10	1	4	29	21	2	28	1	0	4	1	0	6	0.272	0.327	0.370
1955	ATL	84	288	31	83	14	1	13	48	23	1	36	2	0	6	0	0	4	0.288	0.339	0.479
1956*	ATL	125	454	70	142	21	4	20	64	58	2	32	0	0	4	1	0	13	0.313	0.388	0.509
1957*	ATL	152	552	99	182	38	1	23	79	65	4	30	0	0	7	0	0	14	0.330	0.396	0.527
1958*	ATL	151	570	98	192	29	2	27	94	77	3	42	0	0	5	0	0	10	0.337	0.413	0.537
1959*	ATL	153	577	96	188	28	1	37	96	72	6	47	2	0	4	0	0	11	0.326	0.399	0.570
1960*	ATL	140	499	79	173	25	2	24	72	74	12	20	0	0	3	1	1	9	0.347	0.429	0.549
1961*	ATL	142	527	94	175	33	1	28	88	76	7	23	1	0	6	2	2	20	0.332	0.413	0.558
1962*	ATL	148	540	103	188	26	3	33	92	76	7	21	1	0	4	1	0	12	0.348	0.427	0.591
1963*	ATL	113	423	87	144	20	0	37	104	57	2	15	1	0	6	0	2	15	0.340	0.415	0.650
1964*	ATL	85	314	63	108	17	1	16	41	39	2	20	1	0	1	0	1	12	0.344	0.417	0.557
1965*	ATL	159	558	108	177	17	1	45	116	95	9	29	1	0	5	1	1	16	0.317	0.414	0.593
1966*	ATL	157	573	111	191	21	0	44	119	88	9	38	0	0	4	1	4	14	0.333	0.420	0.600
1967*	ATL	127	465	79	136	17	0	29	72	69	12	40	1	0	5	0	0	12	0.292	0.381	0.516
1968*	ATL	157	563	86	149	22	4	25	77	74	6	56	3	0	6	1	0	15	0.265	0.350	0.451
1969	ATL	112	397	78	131	15	0	32	79	79	7	26	1	0	5	0	0	8	0.330	0.438	0.610
Totals		2075	7543	1305	2425	353	22	437	1270	1043	91	503	15	0	75	9	11	191	0.321	0.401	0.548
Okay, that's a lot of work for one measly character, I will admit. But imagine if you had to do 500 of those. I have to say that doing something like manually adding asterisks to players if they played in the All Star Game is a thing that's going to feel tedious to me by, like, the 2nd or 3rd player. And yes, I am 100% a person who will spend an hour designing a way to get out of 5 minutes of busy work. But seriously, even though this might seem like a lot, it's really, really not. Or maybe it kind of is but it's just a long list of easy things to do, done one at a time.

Next up... how about an awards column like they have on BBRef, where it might say "AS, MVP". UGH, this will already be a pain in the butt...
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 12-14-2021, 09:25 PM   #11
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
That awards column

All right, so for starters, we're going to need to get a lay of the land so to speak with the data that we have. There is a table called players_awards that outputs some good data but, well, some of it is in strange formats... here's our good buddy Henry Riggs:

Code:
SELECT * FROM `modern-ish-baseball`.players_awards
	where
		league_id = 100
		and player_id = 7620;
Code:
7620	100	7	1	0	1956	1956	0	14	5	1
7620	100	7	1	9	1956	1956	0	8	7	1
7620	100	7	1	11	1956	1956	9	5	11	1
7620	100	7	1	0	1957	1957	0	13	5	1
7620	100	7	1	9	1957	1957	0	7	7	1
7620	100	7	1	11	1957	1957	9	4	11	1
7620	100	7	1	9	1958	1958	0	6	7	1
7620	100	7	1	0	1958	1958	0	8	9	1
7620	100	7	1	11	1958	1958	9	4	11	1
7620	100	7	1	5	1958	1958	0	8	11	1
7620	100	7	1	0	1959	1959	0	27	4	1
7620	100	7	1	2	1959	1959	0	1	5	1
7620	100	7	1	9	1959	1959	0	5	7	1
7620	100	7	1	2	1959	1959	0	1	9	1
7620	100	7	1	11	1959	1959	9	5	11	1
7620	100	7	1	5	1959	1959	0	9	11	2
7620	100	7	1	9	1960	1960	0	9	7	1
7620	100	7	1	0	1960	1960	0	22	8	1
7620	100	7	1	2	1960	1960	0	1	9	1
7620	100	7	1	0	1960	1960	0	12	9	1
7620	100	7	1	11	1960	1960	9	2	11	1
7620	100	7	1	5	1960	1960	0	6	11	1
7620	100	7	1	9	1961	1961	0	9	7	1
7620	100	7	1	0	1961	1961	0	21	8	1
7620	100	7	1	2	1961	1961	0	1	9	1
7620	100	7	1	2	1961	1961	0	1	10	1
7620	100	7	1	11	1961	1961	9	30	10	1
7620	100	7	1	9	1962	1962	0	8	7	1
7620	100	7	1	0	1962	1962	0	9	7	1
7620	100	7	1	2	1962	1962	0	1	8	1
7620	100	7	1	11	1962	1962	9	8	11	1
7620	100	7	1	0	1963	1963	0	15	4	1
7620	100	7	1	0	1963	1963	0	10	6	1
7620	100	7	1	9	1963	1963	0	7	7	1
7620	100	7	1	11	1963	1963	9	13	11	1
7620	100	7	1	9	1964	1964	0	5	7	1
7620	100	7	1	0	1965	1965	0	28	6	1
7620	100	7	1	2	1965	1965	0	1	7	1
7620	100	7	1	9	1965	1965	0	11	7	1
7620	100	7	1	0	1965	1965	0	19	7	1
7620	100	7	1	0	1965	1965	0	6	9	1
7620	100	7	1	2	1965	1965	0	1	10	1
7620	100	7	1	11	1965	1965	9	25	10	1
7620	100	7	1	5	1965	1965	0	29	10	3
7620	100	7	1	2	1966	1966	0	1	5	1
7620	100	7	1	0	1966	1966	0	30	5	1
7620	100	7	1	2	1966	1966	0	1	6	1
7620	100	7	1	9	1966	1966	0	10	7	1
7620	100	7	1	2	1966	1966	0	1	10	1
7620	100	7	1	11	1966	1966	9	21	10	1
7620	100	7	1	5	1966	1966	0	25	10	1
7620	100	7	1	9	1967	1967	0	9	7	1
7620	100	7	1	0	1967	1967	0	10	7	1
7620	100	7	1	9	1968	1968	0	7	7	1
7620	100	7	1	0	1968	1968	0	29	7	1
7620	100	7	1	0	1969	1969	0	16	6	1
7620	100	7	1	9	1969	1969	0	21	7	1
7620	100	7	1	0	1969	1969	0	1	9	1
So... ferretting this out from looking at Riggs' "Accomplishments" page on his player card in the game, I can see that the award_id is the important field and it corresponds to the following awards (although I also took a look at some veteran pitchers and super fielders to get the ones that aren't listed for Riggs:

0 - Player of the Week (no thanks)
1 - Pitcher of the Month (nah)
2 - Batter of the Month (also no thanks)
4 - Cy Young ("finish" has to = 1)
5 - MVP (also, "finish" has to = 1)
6 - Rookie of the Year ("finish" = 1)
7 - Gold Glove (position is listed)
9 - All-Star (no position listed)
11- Silver Slugger (position is listed)
13 - Reliever of the Year (also needs finalist = 1)
14 - World Series champion (?)
15 - World Series MVP (? but probably not)

That seems to run the gamut, or at least encapsulates all the awards I want to use.

Okay, so let's start simple. Let's create a data set for Henry Riggs that shows the abbreviation "AS" for every season he went to the ASG...

Code:
SELECT year,
		'AS'
    FROM `modern-ish-baseball`.players_awards awd
	where
		league_id = 100
        and player_id = 7620
        and award_id = 9;
Doing similar things for MVP and Silver Slugger awards:

Code:
SELECT year,
		'SS'
    FROM `modern-ish-baseball`.players_awards awd
	where
		league_id = 100
        and player_id = 7620
        and award_id = 11;

SELECT year,
		'MVP'
    FROM `modern-ish-baseball`.players_awards awd
	where
		league_id = 100
        and player_id = 7620
        and award_id = 5
        and finish = 1;
I really wish there was something in the database I could use outside of hardcoding those text values but there does not seem to be. Anyway, we can put these all together into one row per season as well, doing more or less the things we did with the asterisk above... first, though, let's get a general lay of the land in terms of how we want our "intermediate" table to look...

[code]
SELECT year,
'MVP' as mvp,
'AS' as allstar,
'GG' as goldGlove,
'SS' as silverSlugger
FROM `modern-ish-baseball`.players_awards awd
where
league_id = 100
and player_id = 7620
and award_id = 9
and finish = 1;
[code]

That returns a thing that looks like this:

Code:
# year	mvp	allstar	goldGlove	silverSlugger
1956	MVP	AS	GG	SS
1957	MVP	AS	GG	SS
1958	MVP	AS	GG	SS
1959	MVP	AS	GG	SS
1960	MVP	AS	GG	SS
1961	MVP	AS	GG	SS
1962	MVP	AS	GG	SS
1963	MVP	AS	GG	SS
1964	MVP	AS	GG	SS
1965	MVP	AS	GG	SS
1966	MVP	AS	GG	SS
1967	MVP	AS	GG	SS
1968	MVP	AS	GG	SS
1969	MVP	AS	GG	SS
I'll add more awards later. ANYWAY, this is obviously not what we want just yet, but we can use IF() statements (also see above) for years that apply...

(also I see that Riggs was awarded with a trip to the game but did not play in it in 1969. BACK TO THE DRAWING BOARD WITH THE PREVIOUS REPORT maybe)

Just to make sure I got the syntax right:

Code:
SELECT year,
		if(false, 'MVP', '') as mvp,
		'AS' as allstar,
        'GG' as goldGlove,
        'SS' as silverSlugger
    FROM `modern-ish-baseball`.players_awards awd
	where
		league_id = 100
        and player_id = 7620
        and award_id = 9
        and finish = 1;
Code:
SET @player_id = 7620;

SELECT awd.year,
		if(awd.year in (
				select year
				FROM `modern-ish-baseball`.players_awards awd
				where
					league_id = 100
					and player_id = 7620
					and award_id = 5
					and finish = 1
			)
		, 
        'MVP', '') as mvp,
		'AS' as allstar,
        'GG' as goldGlove,
        'SS' as silverSlugger
    FROM `modern-ish-baseball`.players_awards awd
	where
		awd.league_id = 100
        and awd.player_id = player_id;
Oops! Already there's a problem. Removing the "and award_id = 9" clause was necessary because you could have a season where you win the MVP or any of these other awards but don't make it to the All-Star Game. However, if you look up to see how the table is created, it's got one award per line (this is the proper way a table like this should be formed, by the way; players-to-awards is what database nerds call a "many to many" relationship - many players, several different awards - and so you should always use what's called a "mapping table" that maps players - actually players + time periods - to awards). But we don't need one line! We need one line per distinct year. Fortunately, SQL has a way of bringing that data back...

Code:
SET @player_id = 7620;

SELECT awdYears.year as awardYear,
		if(awdYears.year in (
				select year
				FROM `modern-ish-baseball`.players_awards awd
				where
					league_id = 100
					and player_id = 7620
					and award_id = 5
					and finish = 1
			)
		, 
        'MVP', '') as mvp,
		'AS' as allstar,
        'GG' as goldGlove,
        'SS' as silverSlugger
	FROM (SELECT DISTINCT year FROM league_history a) as awdYears
	order by awardYear
Code:
# awardYear	mvp	allstar	goldGlove	silverSlugger
1946		AS	GG	SS
1947		AS	GG	SS
1948		AS	GG	SS
1949		AS	GG	SS
1950		AS	GG	SS
1951		AS	GG	SS
1952		AS	GG	SS
1953		AS	GG	SS
1954		AS	GG	SS
1955		AS	GG	SS
1956		AS	GG	SS
1957		AS	GG	SS
1958	MVP	AS	GG	SS
1959		AS	GG	SS
1960	MVP	AS	GG	SS
1961		AS	GG	SS
1962		AS	GG	SS
1963		AS	GG	SS
1964		AS	GG	SS
1965		AS	GG	SS
1966	MVP	AS	GG	SS
1967		AS	GG	SS
1968		AS	GG	SS
1969		AS	GG	SS
OK, let's talk through that. I created that little subtable called "awdYears" that is nothing but a list of all the different years that are in the players_awards table. Since my league started in 1946, that's basically just 1946 to now. Now that we're asking the players_awards table if the player in question won an award in separate IF statements, we no longer need to join* those tables so we can lop that off as well. As you can see though we now have a list of all years with the word "MVP" in the "MVP" column if the player won that award. Baby steps!

*I should talk about "join" statements... which I'd totally do except that basically everyone just uses LEFT JOIN and if you use anything else and it's not the year 1982 or earlier you are a bad person and should feel bad.

All right, though! So let's add in our code for the other 3 awards (and please note that I went through this one award at a time. I found an issue in there but I think it was related to my cutting the wrong part of code so I won't go through that except to re-re-re-re-reiterate that small chunks = the best. The other nice thing about small chunks is that if you mess something up and can't work your way out of it, you can just undo everything you did up to the point of your last successful run.)

Code:
SET @player_id = 7620;

SELECT awdYears.year as awardYear,
		if(awdYears.year in (
				select year
				FROM `modern-ish-baseball`.players_awards ma
				where
					ma.league_id = 100
					and ma.player_id = @player_id
					and ma.award_id = 5
					and ma.finish = 1
			), 
        'MVP', '') as mvp,
		if(awdYears.year in (
				select year
				FROM `modern-ish-baseball`.players_awards aa
				where
					aa.league_id = 100
					and aa.player_id = @player_id
					and aa.award_id = 9
				), 
        'AS', '') as allstar,
		if(awdYears.year in (
				select year
				FROM `modern-ish-baseball`.players_awards aa
				where
					aa.league_id = 100
					and aa.player_id = @player_id
					and aa.award_id = 7
				), 
        'GG', '') as goldglove, 
		if(awdYears.year in (
				select year
				FROM `modern-ish-baseball`.players_awards aa
				where
					aa.league_id = 100
					and aa.player_id = @player_id
					and aa.award_id = 11
				), 
        'SS', '') as silverslugger
	FROM (SELECT DISTINCT a.year FROM league_history a) as awdYears
	order by awardYear
Code:
year	mvp	as	gs	ss
1946				
1947				
1948				
1949				
1950				
1951				
1952				
1953				
1954				
1955				
1956		AS		SS
1957		AS		SS
1958	MVP	AS		SS
1959		AS		SS
1960	MVP	AS		SS
1961		AS		SS
1962		AS		SS
1963		AS		SS
1964		AS		
1965		AS		SS
1966	MVP	AS		SS
1967		AS		
1968		AS		
1969		AS
In my league, Hammerin' Hank is not known for his fielding.

Anyway, after doing a bit of further research into what I'm trying to do, I realized that I need to put null values into unused rows rather than empty strings. There's a particular function you can use in MySQL called concat_ws() that will allow us to link up all of these columns but ignore it when there are no values present. The syntax looks like this:

SELECT CONCAT_WS("-", "SQL", "Tutorial", "is", "fun!") AS ConcatenatedString;

OK! So... let's quickly create a table that looks like the end result should look, kind of...

Code:
SELECT 
	awdYears.year,
	(select
		CONCAT_WS(",", awards.mvp, awards.allstar, awards.gold, awards.silver)
        from ( select 'MVP' as mvp, 'AS' as allstar, null as gold, 'SS' as silver) as awards) as awards
FROM (SELECT DISTINCT a.year FROM league_history a) as awdYears
That worked! And now, the brand new MOMENT OF TRUTH:

Code:
SET @player_id = 7620;

SELECT 
	awdYears.year,
	(select
		CONCAT_WS(",", awards.mvp, awards.allstar, awards.goldglove, awards.silverslugger)
        from 
			(SELECT awdYears.year as awardYear,
				if(awdYears.year in (
						select year
						FROM `modern-ish-baseball`.players_awards ma
						where
							ma.league_id = 100
							and ma.player_id = @player_id
							and ma.award_id = 5
							and ma.finish = 1
					), 
				'MVP', null) as mvp,
				if(awdYears.year in (
						select year
						FROM `modern-ish-baseball`.players_awards aa
						where
							aa.league_id = 100
							and aa.player_id = @player_id
							and aa.award_id = 9
						), 
				'AS', null) as allstar,
				if(awdYears.year in (
						select year
						FROM `modern-ish-baseball`.players_awards aa
						where
							aa.league_id = 100
							and aa.player_id = @player_id
							and aa.award_id = 7
						), 
				'GG', null) as goldglove, 
				if(awdYears.year in (
						select year
						FROM `modern-ish-baseball`.players_awards aa
						where
							aa.league_id = 100
							and aa.player_id = @player_id
							and aa.award_id = 11
						), 
				'SS', null) as silverslugger
		FROM (SELECT DISTINCT a.year FROM league_history a) as awdYears
        ) as awards where awards.awardyear = awdYears.year
	) as awards
FROM (SELECT DISTINCT a.year FROM league_history a) as awdYears;
Code:
1946	
1947	
1948	
1949	
1950	
1951	
1952	
1953	
1954	
1955	
1956	AS,SS
1957	AS,SS
1958	MVP,AS,SS
1959	AS,SS
1960	MVP,AS,SS
1961	AS,SS
1962	AS,SS
1963	AS,SS
1964	AS
1965	AS,SS
1966	MVP,AS,SS
1967	AS
1968	AS
1969	AS
Admittedly I made that a lot harder on myself than it needed to be; I copied all of the code into the place where it'd be formatted in the concat_w() function like I was supposed to but I got really badly mixed up by the parentheticals. It happens to everyone! SQL is a massive PITA! Anyway, though, once I get this all set up for the other awards, I'll try and run through how to add it to the batter history screen... which shouldn't be that hard, it's just, a lot more copying and pasting and making the parentheses come out right. Because SQL.
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not

Last edited by Syd Thrift; 12-14-2021 at 09:33 PM.
Syd Thrift is offline   Reply With Quote
Old 12-14-2021, 09:36 PM   #12
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
I'm about SQLed out for the night but here's the final draft of the awards string... it includes MVP, Cy, ROY, All-Star, Gold Glove, Silver Slugger, and Reliever of the Year (which I call the Rolaids Relief award and which is nicknamed "RR" here):

Code:
SET @player_id = 7616;

SELECT 
	awdYears.year,
	(select
		CONCAT_WS(",", awards.mvp, awards.cy_young, awards.roy, awards.allstar, awards.goldglove, awards.silverslugger, awards.rolaids)
        from 
			(SELECT awdYears.year as awardYear,
				if(awdYears.year in (
						select year
						FROM `modern-ish-baseball`.players_awards ma
						where
							ma.league_id = 100
							and ma.player_id = @player_id
							and ma.award_id = 5
							and ma.finish = 1
					), 
				'MVP', null) as mvp,
				if(awdYears.year in (
						select year
						FROM `modern-ish-baseball`.players_awards ma
						where
							ma.league_id = 100
							and ma.player_id = @player_id
							and ma.award_id = 4
							and ma.finish = 1
					), 
				'CY', null) as cy_young,
				if(awdYears.year in (
						select year
						FROM `modern-ish-baseball`.players_awards ma
						where
							ma.league_id = 100
							and ma.player_id = @player_id
							and ma.award_id = 6
							and ma.finish = 1
					), 
				'ROY', null) as roy,
				if(awdYears.year in (
						select year
						FROM `modern-ish-baseball`.players_awards aa
						where
							aa.league_id = 100
							and aa.player_id = @player_id
							and aa.award_id = 9
						), 
				'AS', null) as allstar,
				if(awdYears.year in (
						select year
						FROM `modern-ish-baseball`.players_awards aa
						where
							aa.league_id = 100
							and aa.player_id = @player_id
							and aa.award_id = 7
						), 
				'GG', null) as goldglove, 
				if(awdYears.year in (
						select year
						FROM `modern-ish-baseball`.players_awards aa
						where
							aa.league_id = 100
							and aa.player_id = @player_id
							and aa.award_id = 11
						), 
				'SS', null) as silverslugger,
                if(awdYears.year in (
						select year
						FROM `modern-ish-baseball`.players_awards ma
						where
							ma.league_id = 100
							and ma.player_id = @player_id
							and ma.award_id = 13
							and ma.finish = 1
					), 
				'RR', null) as rolaids
		FROM (SELECT DISTINCT a.year FROM league_history a) as awdYears
        ) as awards where awards.awardyear = awdYears.year
	) as awardString
FROM (SELECT DISTINCT a.year FROM league_history a) as awdYears;
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 12-15-2021, 10:47 AM   #13
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
Just a couple of quick bits:

For the team batting report I used a switch/case statement (I guess in SQL it's just a CASE) to change the position IDs to position names, and I also added a filter for positions so if you really want to output pitcher hitting you can. The whole code below:

Code:
SET @year = 1969, @team_abbr='NYY', @positions='3,4,5,6';

SELECT
	CASE
		WHEN fld.position = 2 THEN "C"
		WHEN fld.position = 3 THEN "1B"
		WHEN fld.position = 4 THEN "2B"
		WHEN fld.position = 5 THEN "3B"
		WHEN fld.position = 6 THEN "SS"
		WHEN fld.position = 7 THEN "LF"
		WHEN fld.position = 8 THEN "CF"
		WHEN fld.position = 9 THEN "RF"        
        ELSE "DH"
    END as pos,
	plr.first_name,
    plr.last_name,
    bat.g,
    bat.ab,
    bat.r,
    bat.h,
    bat.d as 2B,
    bat.t as 3B,
    bat.hr,
    bat.rbi,
    bat.bb,
    bat.ibb,
    bat.k,
    bat.hp as hpb,
    bat.sh,
    bat.sf,
    bat.sb,
    bat.cs,
    bat.gdp,
    round(bat.h / bat.ab, 3) as ba,
    round((bat.h + bat.bb + bat.hp) / (bat.h + bat.bb + bat.hp + bat.sh + bat.sf), 3) as obp,
    round((bat.h + bat.d + 2 * bat.t + 3 * bat.hr) / ab, 3) as slg
FROM `modern-ish-baseball`.teams tm 
	LEFT JOIN players_career_batting_stats bat
		ON bat.team_id = tm.team_id
	LEFT JOIN players plr
		ON plr.player_id = bat.player_id
    LEFT JOIN players_career_fielding_stats fld
		ON fld.player_id = plr.player_id and fld.year = @year and bat.team_id = fld.team_id
where tm.abbr = @team_abbr
	AND bat.year = @year
    AND bat.split_id = 1
    AND fld.ip = (select max(ip) from `modern-ish-baseball`.players_career_fielding_stats f
	WHERE f.year = @year and f.team_id = tm.team_id and f.player_id = plr.player_id)
    AND find_in_set(fld.position, @positions)
ORDER BY fld.position, bat.pa desc
And, I even discovered that the Workbench has an "output to text" function that means that there's no need to play with formatting if you just want to copy and paste into a code tag:

Code:
+ -------- + --------------- + -------------- + ------ + ------- + ------ + ------ + ------- + ------- + ------- + -------- + ------- + -------- + ------ + -------- + ------- + ------- + ------- + ------- + -------- + ------- + -------- + -------- +
| pos      | first_name      | last_name      | g      | ab      | r      | h      | 2B      | 3B      | hr      | rbi      | bb      | ibb      | k      | hpb      | sh      | sf      | sb      | cs      | gdp      | ba      | obp      | slg      |
+ -------- + --------------- + -------------- + ------ + ------- + ------ + ------ + ------- + ------- + ------- + -------- + ------- + -------- + ------ + -------- + ------- + ------- + ------- + ------- + -------- + ------- + -------- + -------- +
| 1B       | Justin          | Richens        | 63     | 231     | 34     | 71     | 17      | 2       | 9       | 26       | 41      | 3        | 14     | 6        | 0       | 5       | 0       | 0       | 6        | 0.307   | 0.959    | 0.515    |
| 1B       | Jose            | Ayala          | 51     | 193     | 24     | 55     | 9       | 0       | 5       | 23       | 13      | 2        | 34     | 1        | 0       | 4       | 2       | 0       | 10       | 0.285   | 0.945    | 0.409    |
| 1B       | John            | Chapman        | 60     | 138     | 13     | 31     | 6       | 0       | 3       | 14       | 26      | 1        | 16     | 1        | 0       | 2       | 0       | 0       | 0        | 0.225   | 0.967    | 0.333    |
| 1B       | Bobby           | Berg           | 12     | 23      | 7      | 8      | 2       | 0       | 2       | 8        | 6       | 0        | 0      | 0        | 0       | 0       | 0       | 0       | 2        | 0.348   | 1.000    | 0.696    |
| 2B       | Wing-fung       | Yi             | 105    | 383     | 44     | 117    | 17      | 6       | 6       | 45       | 67      | 5        | 29     | 2        | 0       | 3       | 12      | 15      | 4        | 0.305   | 0.984    | 0.428    |
| 2B       | Elijah          | Patton         | 38     | 92      | 10     | 19     | 4       | 0       | 0       | 12       | 9       | 3        | 13     | 0        | 0       | 2       | 0       | 1       | 5        | 0.207   | 0.933    | 0.250    |
| 2B       | Chris           | Eckert         | 16     | 33      | 4      | 7      | 1       | 0       | 0       | 3        | 4       | 2        | 7      | 0        | 0       | 0       | 0       | 0       | 2        | 0.212   | 1.000    | 0.242    |
| 2B       | Nate            | Lieb           | 3      | 9       | 1      | 2      | 0       | 0       | 1       | 2        | 2       | 1        | 3      | 0        | 0       | 0       | 0       | 0       | 0        | 0.222   | 1.000    | 0.556    |
| 3B       | Tom             | Weiss          | 128    | 423     | 62     | 115    | 12      | 2       | 17      | 69       | 60      | 3        | 38     | 7        | 1       | 3       | 0       | 0       | 8        | 0.272   | 0.978    | 0.430    |
| 3B       | David           | Salinas        | 49     | 169     | 14     | 36     | 5       | 0       | 1       | 18       | 15      | 0        | 12     | 1        | 0       | 2       | 2       | 2       | 8        | 0.213   | 0.963    | 0.260    |
| SS       | Ty              | Stover         | 146    | 530     | 85     | 127    | 24      | 6       | 25      | 92       | 96      | 1        | 105    | 3        | 0       | 7       | 1       | 1       | 19       | 0.240   | 0.970    | 0.449    |
| SS       | Jason           | Wichert        | 28     | 72      | 4      | 14     | 2       | 0       | 0       | 6        | 3       | 0        | 18     | 0        | 0       | 2       | 1       | 0       | 2        | 0.194   | 0.895    | 0.222    |
| SS       | Chris           | Grube          | 10     | 28      | 0      | 1      | 0       | 0       | 0       | 1        | 2       | 1        | 1      | 0        | 0       | 0       | 0       | 0       | 0        | 0.036   | 1.000    | 0.036    |
+ -------- + --------------- + -------------- + ------ + ------- + ------ + ------ + ------- + ------- + ------- + -------- + ------- + -------- + ------ + -------- + ------- + ------- + ------- + ------- + -------- + ------- + -------- + -------- +
For a little further cleanup, replacing the first_name and last_name lines with another concat:

concat(plr.first_name, ' ', plr.last_name) as name,

Gives you this:

Code:
+ -------- + --------- + ------ + ------- + ------ + ------ + ------- + ------- + ------- + -------- + ------- + -------- + ------ + -------- + ------- + ------- + ------- + ------- + -------- + ------- + -------- + -------- +
| pos      | name      | g      | ab      | r      | h      | 2B      | 3B      | hr      | rbi      | bb      | ibb      | k      | hpb      | sh      | sf      | sb      | cs      | gdp      | ba      | obp      | slg      |
+ -------- + --------- + ------ + ------- + ------ + ------ + ------- + ------- + ------- + -------- + ------- + -------- + ------ + -------- + ------- + ------- + ------- + ------- + -------- + ------- + -------- + -------- +
| 1B       | Justin Richens | 63     | 231     | 34     | 71     | 17      | 2       | 9       | 26       | 41      | 3        | 14     | 6        | 0       | 5       | 0       | 0       | 6        | 0.307   | 0.959    | 0.515    |
| 1B       | Jose Ayala | 51     | 193     | 24     | 55     | 9       | 0       | 5       | 23       | 13      | 2        | 34     | 1        | 0       | 4       | 2       | 0       | 10       | 0.285   | 0.945    | 0.409    |
| 1B       | John Chapman | 60     | 138     | 13     | 31     | 6       | 0       | 3       | 14       | 26      | 1        | 16     | 1        | 0       | 2       | 0       | 0       | 0        | 0.225   | 0.967    | 0.333    |
| 1B       | Bobby Berg | 12     | 23      | 7      | 8      | 2       | 0       | 2       | 8        | 6       | 0        | 0      | 0        | 0       | 0       | 0       | 0       | 2        | 0.348   | 1.000    | 0.696    |
| 2B       | Wing-fung Yi | 105    | 383     | 44     | 117    | 17      | 6       | 6       | 45       | 67      | 5        | 29     | 2        | 0       | 3       | 12      | 15      | 4        | 0.305   | 0.984    | 0.428    |
| 2B       | Elijah Patton | 38     | 92      | 10     | 19     | 4       | 0       | 0       | 12       | 9       | 3        | 13     | 0        | 0       | 2       | 0       | 1       | 5        | 0.207   | 0.933    | 0.250    |
| 2B       | Chris Eckert | 16     | 33      | 4      | 7      | 1       | 0       | 0       | 3        | 4       | 2        | 7      | 0        | 0       | 0       | 0       | 0       | 2        | 0.212   | 1.000    | 0.242    |
| 2B       | Nate Lieb | 3      | 9       | 1      | 2      | 0       | 0       | 1       | 2        | 2       | 1        | 3      | 0        | 0       | 0       | 0       | 0       | 0        | 0.222   | 1.000    | 0.556    |
| 3B       | Tom Weiss | 128    | 423     | 62     | 115    | 12      | 2       | 17      | 69       | 60      | 3        | 38     | 7        | 1       | 3       | 0       | 0       | 8        | 0.272   | 0.978    | 0.430    |
| 3B       | David Salinas | 49     | 169     | 14     | 36     | 5       | 0       | 1       | 18       | 15      | 0        | 12     | 1        | 0       | 2       | 2       | 2       | 8        | 0.213   | 0.963    | 0.260    |
| SS       | Ty Stover | 146    | 530     | 85     | 127    | 24      | 6       | 25      | 92       | 96      | 1        | 105    | 3        | 0       | 7       | 1       | 1       | 19       | 0.240   | 0.970    | 0.449    |
| SS       | Jason Wichert | 28     | 72      | 4      | 14     | 2       | 0       | 0       | 6        | 3       | 0        | 18     | 0        | 0       | 2       | 1       | 0       | 2        | 0.194   | 0.895    | 0.222    |
| SS       | Chris Grube | 10     | 28      | 0      | 1      | 0       | 0       | 0       | 1        | 2       | 1        | 1      | 0        | 0       | 0       | 0       | 0       | 0        | 0.036   | 1.000    | 0.036    |
+ -------- + --------- + ------ + ------- + ------ + ------ + ------- + ------- + ------- + -------- + ------- + -------- + ------ + -------- + ------- + ------- + ------- + ------- + -------- + ------- + -------- + -------- +
Welp, that didn't work! I fiddled around a little with trying to cast the "name" field to, for instance, a char(35) type but it just does not want to not truncate the values for this report. As I suspect this is regarded as gravy by the developers and not really a thing you're supposed to use unto itself, I'll let it slide, I guess!
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not

Last edited by Syd Thrift; 12-15-2021 at 11:00 AM.
Syd Thrift is offline   Reply With Quote
Old 08-09-2023, 09:21 PM   #14
mjkuhlen
Minors (Rookie Ball)
 
Join Date: Jul 2017
Posts: 21
SQL Fields

Hey, this thread is great! Do you have a reference to what fields are in each of the SQL tables? I'm starting to poke around now but it would be easier to know what each table is holding.

Thanks!
mjkuhlen is offline   Reply With Quote
Old 01-18-2024, 06:48 PM   #15
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
Quote:
Originally Posted by mjkuhlen View Post
Hey, this thread is great! Do you have a reference to what fields are in each of the SQL tables? I'm starting to poke around now but it would be easier to know what each table is holding.

Thanks!
Sorry for the delayed turnaround. I don't really have a reference per se but in most cases I think they're obvious, especially if you pull up MySQL alongside your copy of OOTP and do a query or two.
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 01-18-2024, 07:20 PM   #16
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
I've changed up the way I do the "reports" part of my dynasty this year and I wanted to share the DB tools I've been using to get that data.

First up, I'm getting the list of the teams in Major League Baseball along with some extra data using this:

Code:
select team_id
		,name
        ,nickname
        ,background_color_id
        ,text_color_id
	from teams
	where level = 1
    order by name
This brings in the team_id, which I'm using elsewhere, plus the name/nickname, and then, because I use this to color the names of stuff in the reports, the background color ID and, theoretically, the text color ID. Pretty straightforward!

The next thing I have, which I re-run for every team, is this roster "getter":

Code:
set @year = 1972, @team_id = 13;

select distinct tm.team_id
        ,tr.player_id
        ,concat(plr.first_name, ' ', plr.last_name) as name
        ,plr.position
        ,bat.pa
        ,pit.ip
from team_roster tr
	left join teams tm on tr.team_id = tm.team_id
    left join players plr on plr.player_id = tr.player_id
    left join players_career_batting_stats bat on bat.player_id = plr.player_id 
		and bat.year = @year 
        and bat.split_id = 1 
        and bat.team_id = tm.team_id
    left join players_career_pitching_stats pit on pit.player_id = plr.player_id 
		and pit.year = @year 
        and pit.split_id = 1
        and pit.team_id = tm.team_id
where tm.team_id = @team_id
order by position, ip desc, pa desc
limit 250
I realize as I'm reading through this that the left join on teams is entirely superfluous but I've got it there just in case. The team_roster table is what is called a mapping table that describes a many-to-many relationship in the data. In this case, there are many team rosters, and each team roster has many players. This is pretty standard; there are different ways of handling one-to-one and one-to-many data relationships as well. The big thing they teach you about in database school is that you want to try to duplicate data as little as possible.

Anyway, because I also wanted a quick and dirty way of seeing IPs and PAs to help determine who I should show data for, I have left joins to the pitcher and batter data as well. Note there that the left join starts with team_roster: there is only one player per team in that table so I can be OK with understanding that there is at most one line in the corresponding pitcher/batter record so long as I specify the player, the team, the year, and the split_id (1 is "regular season").

From there I started with this to show the bio data:

Code:
SET @playerID = 13749, @teamID = 5;

select concat('[color="', tm.background_color_id, '"]', plr.first_name, ' ', plr.last_name, '[/color]') as line1
		,concat('POS No. ', plr.uniform_number) as line1pt5
		,concat((
					case
						when plr.bats = 1 then 'R'
                        when plr.bats = 2 then 'L'
                        when plr.bats = 3 then 'S'
                    end), (
                    case
						when plr.throws = 1 then 'R'
                        when plr.throws = 2 then 'L'
                    end), ', '
						,truncate((plr.height / (2.54 * 12)), 0)
                        ,''''
                        ,round(((plr.height / 2.54) % 12), 0)
                        ,'" ',
                        plr.weight, ' lbs.') as line2
		,concat('Born ', plr.date_of_birth) as line3
	from players plr
		left join teams tm on tm.team_id = @teamID
    where plr.player_id = @playerID
I've since migrated this over to the pitcher/hitter scripts I'm using to save how many times I need to copy/paste the player ID but this is the gist.

This is doing a fair bit more work so I'll try and explain it... You'll note that each entry is just a concatenated string. I set this up so that I can most easily just hit enter a couple times in the right spots. The format here is cribbed from the 80s era Scouting Report books that were produced by Harmon Killebrew and Brooks Robinson among others. The one line item I'm missing is the birthplace and that is because, if I'm being honest, adding that was going to be too much of a PITA . Looking at it, I'll probably add it for the future though.

The top line also inserts a chunk of BBCode that takes the background color for the teams - usually that's a dark color - and sets that as the text color.

Setting batting and throwing should be pretty straightforward.

The next bit is height, which is a pain because the game lists it in centimeters, Markus being German and all. First I get the feet, which is basically the height in inches (so height / 2.54), divided by 12, and then we only need the integer so lop off everything else. Then I get inches, which starts with the height in inches, and then gets the *modulus* of 12 - that means, the remainder... so a guy who's listed at 5'7" in this would be 67 inches (some other number of cm but whatever), which in turn translates into 5 feet and 7 "modulus feet". This calculation occasionally churns out players who are 5'12" because it doesn't know how to round them off. I would write an if statement to handle that but to be perfectly honest I've run into it twice so far and so I just change it manually.

POS is just a placeholder for the position I'm adding. I could automate that but I don't really have a cut-and-dried algorithm for it so I just take the extra second to type in, like, "CF" or "1B/OF".

Now onto the fun stuff:

Code:
set @playerID = 5132, @teamID = 13;

select concat('[color="', tm.background_color_id, '"]', plr.first_name, ' ', plr.last_name, '[/color]') as line1
		,concat('POS No. ', plr.uniform_number) as line1pt5
		,concat((
					case
						when plr.bats = 1 then 'R'
                        when plr.bats = 2 then 'L'
                        when plr.bats = 3 then 'S'
                    end), (
                    case
						when plr.throws = 1 then 'R'
                        when plr.throws = 2 then 'L'
                    end), ', '
						,truncate((plr.height / (2.54 * 12)), 0)
                        ,''''
                        ,round(((plr.height / 2.54) % 12), 0)
                        ,'" ',
                        plr.weight, ' lbs.') as line2
		,concat('Born ', plr.date_of_birth) as line3
	from players plr
		left join teams tm on tm.team_id = @teamID
    where plr.player_id = @playerID;

select concat(pit.year, ' ', tm.abbr, ' ',(CASE
			when tm.level = 2 THEN 'AAA'
			when tm.level = 3 THEN 'AA'
            when tm.level = 4 THEN 'A'
            when tm.level = 5 THEN 'S A'
            when tm.level = 6 THEN 'R'
            else 'MLB'
		END)) as yrtmlvl
		,pit.w
        ,pit.l
	     ,pit.s as sv
        ,truncate((pit.er / (pit.outs / 3) * 9), 2) as era
        ,pit.g
        ,pit.gs
        ,pit.cg
        ,concat(pit.ip, '.', pit.ipf) as ip
        ,pit.ha as h
        ,pit.r
        ,pit.er
        ,pit.bb
        ,pit.k
FROM players_career_pitching_stats pit
	LEFT JOIN teams tm on tm.team_id = pit.team_id
WHERE pit.player_id = @playerID
	and pit.split_id = 1
    and pit.year in (1970, 1971, 1972)
ORDER BY pit.year asc
This is basically asking the game to return a short little set of statistics for the seasons 1970-72 for the player in question, split ID 1, and sorted by the year from 1970 to 1972. So far it looks like when a player plays at the same level for two teams in the same year (like when a guy gets traded) it shows them in reverse order somehow. Those have been rare enough that I just copy/paste the lines. There's no separate pitcher_career_id - the table is differentiated by several keys instead of just one primary - so it's not so easy as to just order by that. Anyway, it's relatively rare and it takes all of 2 seconds to fix.

I end up with something like this:

Code:
Yr   Tm  Lvl   W   L  Sv   ERA   G  GS  CG     IP    H   R  ER   BB   SO
1970 PHI MLB  13  13   0  3.44  34  34   7  230.0  225 106  88   73  182
1971 PHI MLB  17  14   0  3.38  39  39  11  287.0  271 121 108   85  223
1972 STL MLB  13  14   0  2.69  34  34  10  271.0  221  88  81   84  231
What I'm doing here is copying the results into VSCode, doing a CTRL-F on the tab character and changing it into 2 spaces, and then going in and manually editing down the individual rows so they all line up. This is a little bit of busy work but not a huge amount.

I should note here too that while running two SELECT statements in MS-SQL via SSMS outputs two separate windows that you can select from, MySQL just flashes the first and then shows the second, so what I need to do here is select the first part of the query, run it, and then deselect and run the whole thing.

Hitting is similar:

Code:
SET @playerID = 16201, @teamID = 13;

select concat('[color="', tm.background_color_id, '"]', plr.first_name, ' ', plr.last_name, '[/color]') as line1
		,concat('POS No. ', plr.uniform_number) as line1pt5
		,concat((
					case
						when plr.bats = 1 then 'R'
                        when plr.bats = 2 then 'L'
                        when plr.bats = 3 then 'S'
                    end), (
                    case
						when plr.throws = 1 then 'R'
                        when plr.throws = 2 then 'L'
                    end), ', '
						,truncate((plr.height / (2.54 * 12)), 0)
                        ,''''
                        ,round(((plr.height / 2.54) % 12), 0)
                        ,'" ',
                        plr.weight, ' lbs.') as line2
		,concat('Born ', plr.date_of_birth) as line3
	from players plr
		left join teams tm on tm.team_id = @teamID
    where plr.player_id = @playerID;

select concat(bat.year, ' ', tm.abbr, ' ',(CASE
			when tm.level = 2 THEN 'AAA'
			when tm.level = 3 THEN 'AA'
            when tm.level = 4 THEN 'A'
            when tm.level = 5 THEN 'S A'
            when tm.level = 6 THEN 'R'
            else 'MLB'
		END)) as yrtmlvl
		,round(bat.h / bat.ab, 3) as avg
        ,bat.g
        ,bat.ab
        ,bat.r
        ,bat.h
        ,bat.d as 2b
        ,bat.t as 3b
        ,bat.hr
        ,bat.rbi
        ,bat.bb
        ,bat.k as so
        ,bat.sb
	from  `modern-ish-baseball`.players_career_batting_stats bat
    left join  `modern-ish-baseball`.players plr on plr.player_id = bat.player_id
    left join  `modern-ish-baseball`.teams tm on bat.team_id = tm.team_id
		where split_id = 1
            and plr.player_id = @playerID
            and bat.year in (1970, 1971, 1972)
	order by year asc, level desc
...and it leaves you with something like this:

Code:
Yr   Tm  Lvl    Avg    G   AB    R    H  2B  3B  HR  RBI   BB   SO  SB
1970 LEW S A  0.000    4   15    0    0   0   0   0    0    0    3   0
1970 MOD A    0.280    9   25    4    7   1   0   3    6    8    7   0
1970 CR  A    0.223   26   94   11   21   5   0   1    9   12   21   0
1970 STP A    0.242   37  128    7   31   5   0   1   15   12   31   1
1971 CR  A    0.204   94  314   30   64   8   1   4   30   34   71   0
1971 ARK AA   0.241   17   58    3   14   4   0   0    8    5   13   0
1972 TUL AAA  0.297   47  155   19   46   6   1   4   16   19   27   0
1972 STL MLB  0.213   41  141    7   30   6   0   1   12   12   25   0
If I really wanted to take the time to drop the leading 0 off of batting average, I could convert it into a varchar and then lop the first character off of it. Occasionally that would mean that guys who hit 1.000 would show up as .000 but how often does that happen?
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 01-18-2024, 07:25 PM   #17
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
Here's the new version of the average that lops the leading 0 off:

SUBSTR(CAST(round(bat.h / bat.ab, 3) AS CHAR(5)), 2) as avg

One thing I haaaaaaaaaate about SQL scripting is that it's next to impossible to debug. And to be fair what's going on in this line would be a pain to debug in any language. I recommend writing these kinds of things out step by step, insuring you have the result you're looking for inside-out each step of the way. Going from inside out then...

1. Divide hits by at-bats. This gives you a number that could very well be .235737532987593120578375239 or something huge like that.

2. Round off to the 3rd digit. it sure looks to me like SQL exports all 3 decimals even when it doesn't have to; a .300 doesn't export as .3 for example.

3. Cast this number as a 5-unit character. Including the decimal, BA will always be 5 characters long.

4. Get the substring of this, beginning with character number... 2. As a developer I tend to think about strings in terms of them being an array of characters and, since arrays are 0-based, you'd start with position 1. But nope, MySQL counts from 1, unlike, like, all other programming languages on the planet.
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not

Last edited by Syd Thrift; 01-18-2024 at 07:31 PM.
Syd Thrift is offline   Reply With Quote
Old 02-20-2024, 08:29 PM   #18
fhomess
Hall Of Famer
 
fhomess's Avatar
 
Join Date: Nov 2002
Posts: 3,588
Be aware that innnings in both the pitching and fielding tables is divided into IP and IPF, where IP is whole innings and IPF is fractional innings. To get an accurate IP for ERA calculation, you need to do (3*IP+IPF)/3.

Also, don't forget to consider denominators that could be 0. For example, in batting average, h/ab can result in an error if ab=0. Not going to be an issue for end of year team stats, but individual players with very little playing time might run into that.
__________________
StatsLab- PHP/MySQL based utilities for Online Leagues
Baseball Cards - Full list of known templates and documentation on card development.
fhomess is offline   Reply With Quote
Old 02-20-2024, 10:12 PM   #19
Syd Thrift
Hall Of Famer
 
Syd Thrift's Avatar
 
Join Date: May 2004
Posts: 10,430
True, although to the first point, if memory serves ERA for the purpose of winning the title didn’t figure partial innings until some point in the 80s when I think Steve McCatty lost an ERA title because of this. So I’m RIGHT ON PACE with the 70s.

In the rare instance where those percentages tried to run something with no ABs or IPs, that field just returns null, which can be an issue if I was still doing all this via a script but when I’m inputting individual player info and only selecting guys who reach a certain threshold of at bats, those results are very rare. It was much more common that I had an extra line with nulls in it for the year, team, and league because I didn’t exclude college stats.
__________________
Quote:
Originally Posted by Markus Heinsohn
You bastard....
The Great American Baseball Thrift Book - Like reading the Sporting News from back in the day, only with fake players. REAL LIFE DRAMA THOUGH maybe not
Syd Thrift is offline   Reply With Quote
Old 02-21-2024, 01:13 PM   #20
fhomess
Hall Of Famer
 
fhomess's Avatar
 
Join Date: Nov 2002
Posts: 3,588
Quote:
Originally Posted by Syd Thrift View Post
True, although to the first point, if memory serves ERA for the purpose of winning the title didn’t figure partial innings until some point in the 80s when I think Steve McCatty lost an ERA title because of this. So I’m RIGHT ON PACE with the 70s.
I had no idea about this. Fascinating!
__________________
StatsLab- PHP/MySQL based utilities for Online Leagues
Baseball Cards - Full list of known templates and documentation on card development.
fhomess is offline   Reply With Quote
Reply

Bookmarks


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT -4. The time now is 08:56 AM.

 

Major League and Minor League Baseball trademarks and copyrights are used with permission of Major League Baseball. Visit MLB.com and MiLB.com.

Officially Licensed Product – MLB Players, Inc.

Out of the Park Baseball is a registered trademark of Out of the Park Developments GmbH & Co. KG

Google Play is a trademark of Google Inc.

Apple, iPhone, iPod touch and iPad are trademarks of Apple Inc., registered in the U.S. and other countries.

COPYRIGHT © 2023 OUT OF THE PARK DEVELOPMENTS. ALL RIGHTS RESERVED.

 

Powered by vBulletin® Version 3.8.10
Copyright ©2000 - 2024, vBulletin Solutions, Inc.
Copyright © 2024 Out of the Park Developments