01-01-2013, 08:45 AM | #21 |
Minors (Triple A)
Join Date: Dec 2012
Location: Plzeň, Czech Republic
Posts: 296
|
Here is what I did; you can check it if you want.
|
01-01-2013, 09:05 AM | #22 |
Bat Boy
Join Date: Dec 2012
Posts: 5
|
no way: I'll PM you some surnames really quick, hope it helps
edit: message sent, hope it arrived - I don't have it in my mailbox for some reason. Last edited by Kebabking; 01-01-2013 at 09:39 AM. |
01-01-2013, 09:36 AM | #23 | |
Hall Of Famer
Join Date: Nov 2012
Location: Czech Republic
Posts: 2,077
|
Quote:
|
|
01-01-2013, 10:12 AM | #24 | |
Hall Of Famer
Join Date: Nov 2012
Location: Czech Republic
Posts: 2,077
|
Quote:
1) What encoding should the file with the Czech special characters have? I suppose it's UTF-8, but I'd rather ask...
2) Where did you get those names? There are some (e.g. Jeremy, Georg, Nikolas) that I don't consider Czech. I admit there could be a Czech man with such a name, but it would be _very_ rare. Is it possible to delete names from the file to make it more accurate?
3) Is it possible to add new names to the files? I missed some names after a brief scan of the files (like Miroslav or Radovan).
4) I suppose the first number after a name expresses how often the name should occur. Is that right? Did you get those numbers from some statistics? Some of them don't feel right to me (like Jaromír or Josef), although I don't have any real numbers at hand right now. I could probably find some statistics and use them to correct the numbers. Is such an effort desired?
5) What does the second number express?

I'd like to improve at least the first names file, since I think first names are more annoying when they are done wrong. There are enough surnames that they won't repeat much, but if there is a problem with first name frequency and you see a lot of, let's say, Rostislavs (which is not a very common Czech name), you will be disappointed.

My plan is to add more names to the file, take some name frequency statistics and correct the frequency numbers accordingly. I must admit, though, that I don't think I'll get it all done today. Would you like to see this work done, or are you not interested in it? Thanks.
Last edited by geckon; 01-01-2013 at 10:13 AM. |
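To illustrate why the frequency numbers in question 4 matter, here is a minimal Python sketch of weighted name selection. It assumes the generator draws names with probability proportional to the first number, which is only the supposition made in the post and is not confirmed by the developers. The function `pick_names` and the weights are hypothetical and are not taken from the actual names file.

```python
import random
from collections import Counter


def pick_names(weights, count, seed=None):
    """Draw `count` names, each with probability proportional to its weight."""
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=count)


# Made-up weights for illustration only (assumption, not real file contents).
sample_weights = {"Jan": 50, "Petr": 40, "Josef": 25, "Rostislav": 2}

# Count how often each name comes up in 10,000 generated players.
draws = Counter(pick_names(sample_weights, 10_000, seed=1))
```

If a rare name such as Rostislav were given a weight close to Jan's by mistake, it would show up about as often as Jan in the generated players, which is the disappointment described above.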
|
01-01-2013, 12:58 PM | #25 |
Hall Of Famer
Join Date: Nov 2012
Location: Czech Republic
Posts: 2,077
|
Time for a little update and a few more questions.
I searched for Czech name databases and found some. I have three possible plans. Please tell me which one you find best for the game.

1) Use the database provided by the Czech Ministry of the Interior. It contains all names used in the Czech Republic and can be sorted by frequency among newborns in any year since 1890. So I could pick one year and take, let's say, the 200 most used names for children born that year.
Advantages:
- Old names used, let's say, fifty years ago won't be included (or will appear less often), although people with such names are still alive.
Disadvantages:
- Statistical error. Let's say that in the picked year uncommonly few Jans were born, although normally this name is very common. The resulting file would be affected by that anomaly.
- I would have to sort out the female names by hand, since the database contains male and female names together.

2) Use another database, which is probably based on the same or similar data but shows only the overall numbers of living persons with given names.
Advantages:
- It doesn't have the statistical problem described above.
- I can filter for male names automatically, so less work for me.
Disadvantages:
- It includes older names that are not used much nowadays.
- It includes some Slovak names, which I think mostly reflects Slovaks who stayed in the Czech Republic after the split of Czechoslovakia (so not people born after 1993).

3) Just take the file given by JeffR and edited by no way and edit it further: make some corrections, add missing names, delete a few names etc.
Advantages:
- Probably the fastest option. (Maybe the 2nd one could be faster, hard to say.)
Disadvantages:
- Certainly not as accurate as the others.

Questions:
- (Valid only for the first plan) What year should I pick? Since I suppose the first newgens will be generated for the 2013/2014 season, it should probably be something like 1990-2000. Do you agree? Which one would you pick and why?
- (For developers only) What range is used for the name frequency in the name generator? I need to know that to be able to normalize the real frequency numbers.
- How many names would you include in that file?

I personally prefer the second option, since it should be both fast and precise enough.
Last edited by geckon; 01-01-2013 at 01:26 PM. |
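The normalization step mentioned in the questions could look roughly like the Python sketch below. It keeps the most common names and rescales their raw counts so that the most frequent name gets the top value. The range used by the name generator is not known at this point in the thread, so `scale_max=100` is only a placeholder, and `normalize_frequencies`, `top_n` and the counts are hypothetical.

```python
def normalize_frequencies(counts, top_n=200, scale_max=100):
    """Keep the top_n most common names and rescale counts to 1..scale_max."""
    # Sort by count (highest first), breaking ties alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:top_n]
    if not ranked:
        return {}
    highest = ranked[0][1]
    # Clamp to 1 so that a rare name that made the cut can still be generated.
    return {
        name: max(1, round(count * scale_max / highest))
        for name, count in ranked
    }


# Made-up counts for illustration only, not real statistics.
raw_counts = {"Jan": 1000, "Petr": 800, "Rostislav": 50, "Radovan": 3}
scaled = normalize_frequencies(raw_counts, top_n=3, scale_max=100)
```

With these made-up counts and `top_n=3`, Radovan is dropped and the remaining names keep their relative proportions. Once the developers confirm the real range, only `scale_max` would need to change.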
01-01-2013, 09:12 PM | #26 | ||||||||||
FHM Producer
Join Date: Apr 2002
Location: Kelowna, BC
Posts: 16,623
|
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Thanks for the help. Again, take your time, it doesn't take long for us to incorporate them into the names file once we have them. |
||||||||||
01-01-2013, 09:18 PM | #27 |
FHM Producer
Join Date: Apr 2002
Location: Kelowna, BC
Posts: 16,623
|
Sure, I don't have them done for Ukraine/Belarus yet (I was planning on separate files for all 3), but here's Russia. Unfortunately, it's a mess - I made the mistake of adding in the OOTP Russian names, which are terrible - feminine names, lots of Central Asian names, and so on. I've tried to clean it up, but from what Alessandro tells me, I missed a lot. So if you can get rid of some of the obvious problems, that would be great.
|
01-01-2013, 09:28 PM | #28 |
FHM Producer
Join Date: Apr 2002
Location: Kelowna, BC
Posts: 16,623
|
And as long as I'm asking for help, I might as well post this: here's the (provisional) ethnicities.txt file for the game. The ones marked with asterisks are already done (and a few others will probably be lifted entirely from OOTP, when their names are good enough for our purposes, e.g. Latin American and Japanese names), but the others need work. If anyone wants to help with any of these, speak up. Some of them will need to be built from the ground up, but for most I can provide you with beginning files built from the player database.
|
01-02-2013, 04:52 AM | #29 | |
Minors (Rookie Ball)
Join Date: Feb 2012
Posts: 36
|
Quote:
Now the last names - wow. It is going to take me some time to go over 5000 names; I'd guess maybe 1000 are Central Asian. And while yes, here and there you have an Uzbek who is a Russian and plays hockey, his name isn't going to be "Ivan." So, given the way the engine selects names, I think it's best to leave them out of the Russian name file. |
|
01-02-2013, 06:45 AM | #30 | |
Minors (Single A)
Join Date: Jun 2012
Posts: 87
|
Quote:
I suppose the obvious thing would be to do the English ones (given I am from England) but if any really random sets need building from the ground up and you can't find anyone with any expertise I don't mind having a go. |
|
01-02-2013, 07:45 AM | #31 | ||||
Hall Of Famer
Join Date: Nov 2012
Location: Czech Republic
Posts: 2,077
|
Quote:
Quote:
The possibility of a Czech player getting a Slovak name seems cool to me. Would it be possible (and useful) to have a very, very small chance of a newgen of any nationality getting a name from any language? The name files could be purely national and you could still get a Jeremy Tichý this way.
Quote:
If there is a Harry Potter Novák (or similar) as a Czech name in the statistics, I promise I will remove it ;-)) (Btw, any possibility of having women's hockey? Not that I would need it, it just crossed my mind :-))
Quote:
OK, I'll try to do it as properly as I can when there is enough time. If you need it sooner for some reason, just tell me and I will try to do it faster. |
||||
01-02-2013, 08:44 AM | #32 |
Minors (Triple A)
Join Date: Dec 2012
Location: Plzeň, Czech Republic
Posts: 296
|
To geckon:
if you want some help, just contact me and I'll do whatever I can or whatever you need. |
01-02-2013, 08:54 AM | #33 |
Hall Of Famer
Join Date: Nov 2012
Location: Czech Republic
Posts: 2,077
|
|
01-02-2013, 09:03 AM | #34 | |
All Star Starter
Join Date: Oct 2011
Location: UK
Posts: 1,209
|
Quote:
In case it is of any help, I have attached a spreadsheet of British players broken down by nationality (i.e. England, Scotland, etc). This covers players in the EIHL, EPL and NIHL-1. Maybe this might be useful to anybody who looks at the British names (please note that I don't have any names from the Rep of Ire - hence that worksheet is blank).
Some other name resources:
SCOTLAND
- Common male Scottish forenames: Scottish First Names: Scots names -Women
- Common Scottish surnames: Appendix:Scottish surnames - Wiktionary
- Top 100 Scottish surnames (2001 Scottish census): http://www.gro-scotland.gov.uk/files...es_tablea1.pdf
- Top 300 Scottish surnames (1901 Scottish census): http://www.gro-scotland.gov.uk/files...es_tablea5.pdf
ENGLAND & WALES
- Top 100 English male baby names & top 100 Welsh baby names (2011 census - see Table 2 for England and Table 3 for Wales - Table 6 has a list of the top 6083 boys names for England and Wales and lists their frequency): http://www.ons.gov.uk/ons/rel/vsob1/...boys--2011.xls
- Common English & Welsh surnames: http://en.wiktionary.org/wiki/Append...land_and_Wales)
OTHER
- Common surnames for various other countries: Category:Surname appendices - Wiktionary
__________________
Webmaster of The Blue Line Eastside Hockey Manager & Franchise Hockey Manager community and resource
Last edited by archibalduk; 01-02-2013 at 09:04 AM. |
|
01-02-2013, 11:41 AM | #35 |
Minors (Single A)
Join Date: Jun 2012
Posts: 87
|
OK. I will try some of the British or Irish name files if the OOTP American names file is available.
|
01-02-2013, 01:48 PM | #36 | |
Minors (Single A)
Join Date: Dec 2012
Location: Gatineau (Ottawa Region)
Posts: 70
|
Quote:
Mother tongue (percentage distribution), Canada, provinces and territories, 2011 Census
Richard, Lafleur, Lemieux, Roy, Beliveau, Plante, Bourque, Trottier, Bossy, Potvin, Dionne, Perreault, Brodeur, Fleury, Marleau, Brière, Giroux, Bergeron etc. |
|
01-02-2013, 02:18 PM | #37 | |
All Star Starter
Join Date: Jan 2003
Posts: 1,957
|
Quote:
|
|
01-02-2013, 03:54 PM | #38 | |
Minors (Double A)
Join Date: Apr 2009
Location: Chase BC Canada
Posts: 105
|
Quote:
|
|
01-02-2013, 04:28 PM | #39 | |
Minors (Single A)
Join Date: Dec 2012
Location: Gatineau (Ottawa Region)
Posts: 70
|
Quote:
Denis and Jean Potvin, Ottawa, Ont
Patrick Marleau, Aneroid, SK (don't know if he speaks French?)
Claude Giroux, Hearst, Ont
Alex Plante, Brandon, MB
Well, five out of 18 is about 28% from outside Quebec! I think it's fair to say that there are enough French names in North America, even in the USA, but it doesn't mean they speak French. |
|
01-02-2013, 04:53 PM | #40 |
Major Leagues
Join Date: Oct 2012
Posts: 320
|
The post you replied to initially referred to players "born in Ontario or western Canada".
Your response indicating 21.7% French didn't refer to western Canada and Ontario, but rather to all of Canada.
The data you linked to indicates 1.5%-4.1% French as mother tongue in western Canada and Ontario.
Last edited by Nino33; 01-02-2013 at 04:54 PM. |