07-31-2004, 12:34 AM   #4
Jayzone
More detailed info on the ghost player problem

dougaiton and DaveHorn,

Thanks. Well, to be honest, I didn't expect any replies like this, ones with praise in them! I have to confess that I was writing in a ranting mood, so I thought this thread might be poorly received or ignored outright. You have truly made me feel much better now. Thanks again.

****************

Now, I have collected more info on this ghost player problem and will share it here. The info comes from a 2-league, 12-team setup with 30 years of simmed history. This is basically the same setup as the league in my previous league-wide rating level study for v6.01. ( http://www.ootpdevelopments.com/boar...ad.php?t=64539 ) The 'headcount' of ghost players was not done by physically searching for them; instead, I looked for players that showed up in the CSV roster files but not in the ALMANAC files. The relationship between inaccurate CSV info and ghost players was established and explained in the previous post. The following is a table for ghost player composition. The explanations of the entries are at the bottom.

Code:
(1) Table for ghost player composition

# of players (headcount)

                 Generated Ghost Percentage(%)
Position players      2043    90    4.405 %
Pitchers              1693   110    6.497 % 
All                   3736   200    5.353 %


# of PYA (player yearly appearance)

                 Generated Ghost Percentage(%)
Position players     20285   989    4.876 %
Pitchers             16946  1332    7.860 %
All                  37231  2321    6.234 %


Note: (1) 'Generated' represents league-generated players from initial league
      creation plus all draftees through year 29. (The draftees of the
      current/last year do not count, as they haven't had the chance to
      become ghost players: they are new to the league and have zero pro years.)

      (2) 'Ghost' represents ghost players, i.e. players who are supposed
      to have exited the league but were not properly removed from the database.

      (3) 'Percentage' is Ghost divided by Generated, expressed in percent.

      (4) 'PYA' stands for player yearly appearance. This is defined as how
      many times a particular player has shown up (as in year count) in
      the whole group data. So, for example, a player with 10 years in the
      league (from draft to retirement) will have 10 player yearly appearances
      in the data.
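The headcount and PYA figures in the table come from comparing CSV roster files against ALMANAC files. Here is a minimal Python sketch of that comparison, not the actual procedure used for the table. It assumes the player IDs from each source have already been loaded into one list per sim year; the function names, the per-year dictionaries and the sample IDs are all made up for illustration, and counting a ghost's PYA as the number of years it shows up as a ghost is also an assumption.

```python
# Hypothetical sketch: player IDs per sim year, already parsed from the
# CSV roster exports and the ALMANAC files (the parsing is not shown).

def ghosts_by_year(csv_by_year, almanac_by_year):
    """For each year, the IDs found in the CSV roster but not in the almanac."""
    return {
        year: set(ids) - set(almanac_by_year.get(year, ()))
        for year, ids in csv_by_year.items()
    }

def headcount(ids_by_year):
    """Number of distinct players that appear in any year."""
    return len(set().union(*ids_by_year.values()))

def pya(ids_by_year):
    """Player yearly appearances: one count per (player, year) pair."""
    return sum(len(set(ids)) for ids in ids_by_year.values())

# Made-up sample data: player 103 stays in the CSV roster after year 1
# but no longer appears in the almanac, so it counts as a ghost.
csv_by_year = {1: [101, 102, 103], 2: [101, 102, 103, 104], 3: [102, 103, 104]}
almanac_by_year = {1: [101, 102, 103], 2: [101, 102, 104], 3: [102, 104]}

ghosts = ghosts_by_year(csv_by_year, almanac_by_year)
print(headcount(csv_by_year), headcount(ghosts))  # generated vs. ghost headcount
print(pya(csv_by_year), pya(ghosts))              # generated vs. ghost PYA
```

Using sets means a player ID listed twice in the same year's file is still counted only once for that year.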
The next part is the comparison between permanent and temporary ghosts. As discussed in the previous post, all ghost players are temporary. However, if their player IDs are never recycled, they will appear to be 'permanent'. Below is the comparison between still-existing ghost players (after 30 years, considered 'permanent' ghost players here) and all ghost players that ever existed. The explanations of the entries are at the bottom.

Code:
(2) Table for Permanent/Temporary ghost player composition

Ghost player number count after 30 years

                 Ghost Total Percentage(%)
Position players    31    90   34.444 %
Pitchers            47   110   42.727 %
All                 78   200   39.000 %

Note: (1) 'Ghost' is the number of ghost players still present after the
      30-year sim. ('permanent' ghost players)

      (2) 'Total' is the total number of ghost players that existed at any
      time during the 30-year sim period. ('all' ghost players)
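The percentage columns in both tables are plain ratios, so they can be rechecked from the counts. The short Python sketch below does that using the numbers exactly as posted; the helper name `pct` and the dictionary layout are just for illustration.

```python
def pct(part, whole):
    """Ratio expressed in percent, rounded to three decimals as in the tables."""
    return round(100.0 * part / whole, 3)

# (ghost, generated) counts from table (1)
headcount_counts = {"Position players": (90, 2043), "Pitchers": (110, 1693), "All": (200, 3736)}
pya_counts = {"Position players": (989, 20285), "Pitchers": (1332, 16946), "All": (2321, 37231)}

# (still a ghost after 30 years, all ghosts ever) counts from table (2)
permanent_counts = {"Position players": (31, 90), "Pitchers": (47, 110), "All": (78, 200)}

for title, table in [("Headcount", headcount_counts),
                     ("PYA", pya_counts),
                     ("Permanent share", permanent_counts)]:
    print(title)
    for group, (part, whole) in table.items():
        print(f"  {group:<18}{pct(part, whole):>8.3f} %")
```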
As can be clearly seen, the proportion of ghost players cannot be considered insignificant: about 5.4% of all generated players became ghosts at some point, and 39% of those were still around after 30 years. I think this is a serious problem that needs to be addressed.

While fixing the ghost player problem may be an important issue, CSV info accuracy should be of even higher priority in my opinion. I seem to be in the minority who view CSV as an essential and indispensable tool. However, it is a really good tool when you plan to do studies on player ratings, or when you just use CSV for global rating editing/tweaking in general.

Well, that's it for the ghost player issue. Thanks for reading.

Last edited by Jayzone; 08-02-2004 at 12:54 AM. Reason: The headcount for players was incorrect and it's been fixed (percentage recalculated)