Quote:
|
Originally Posted by dougaiton
If it's any help, ICC2005 (a cricket sim) has name files with some 70-120 surnames for Indian and Pakistani batsmen, as well as Zimbabwean and South African surnames (which are very different from American surnames). They are pretty sparse but would make a nice starting point.
One thing I was thinking about was that FM (and thus SI) must have a huge databank of 'names'? Has Marc Duffy said anything about getting access to these?
|
Quote:
|
Originally Posted by battists
I'm already in touch with Marc about potentially getting access to the FM names data, but no results yet. Good suggestion though.
Is ICC2005 an SI product? Thanks for that lead. I am trying to get as many nationalities as possible represented, so I may try to track those down.
Just FYI, if I can't get a decent set of data for a given nationality (like, say, Pakistani), I'll probably exclude them, since no one wants a league with a hundred identical player names.
|
http://content-uk.cricinfo.com/ci/co...yer/index.html
Every international cricketer (and cricket official) ever, from every cricket-playing nation; 70+ years for India, 50+ years for Pakistan. Also South Africa (Boer names mainly), West Indies, etc. as well as smaller quantities for Bangladesh, Sri Lanka and a few African nations, etc.
It will take a little work to pull the names together. It obviously includes every player, so there are lots of repeats of the more common names, but those repeats can be used to give frequencies. Also, about 20% of the players appear to have only a first initial, but their surnames can still be used.
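Something like this rough Python sketch is what I have in mind for the tally. It assumes the names have already been pulled out into plain "First Surname" strings (the site's actual layout isn't shown here, so that part is a guess), and the `tally_names` function and the sample list are made up purely for illustration. Repeats become frequency counts, and entries with only initials still contribute their surname.

```python
from collections import Counter


def tally_names(players):
    """Count surname and first-name frequencies from 'First Surname' strings.

    Hypothetical helper: assumes the last whitespace-separated token is the
    surname and the first token is either a first name or a set of initials.
    """
    surnames = Counter()
    first_names = Counter()
    for full_name in players:
        parts = full_name.split()
        if len(parts) < 2:
            continue  # nothing to split into first name and surname
        first, surname = parts[0], parts[-1]
        surnames[surname] += 1
        # Treat short all-capital tokens ("S", "SR", "M.S.") as initials:
        # keep the surname but do not count them as a first name.
        letters = first.replace(".", "")
        is_initials = letters.isupper() and len(letters) <= 3
        if not is_initials:
            first_names[first] += 1
    return surnames, first_names


# Made-up sample in the assumed format, just to show the counting.
sample = [
    "Sachin Tendulkar",
    "S Tendulkar",
    "Imran Khan",
    "Younis Khan",
    "M.S. Dhoni",
]
surnames, first_names = tally_names(sample)
print(surnames.most_common())
print(first_names.most_common())
```

The counts can then be used as weights when generating random names, so common surnames turn up more often than rare ones. Multi-word surnames would need extra handling, since this sketch only keeps the last word.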
I'll tinker with it over the next couple of days if there's interest.