Didn't see it mentioned in this thread or the older one...
I noticed that some news stories were missing words (player/personnel names, in particular). Looking at the XML, the problem in several of these cases was that the tokens were missing the % sign, so I've hunted down as many broken tokens as I could find. Here we go...
[35-2085] [subleagueabbr]
[40-1338] [batting h word]
[40-1864] [batting rbi]
[42-4582] [personname L]
[46-1916] [batting avg]
[46-5121] [batting bb]
[48-4705] [Personname L]
[51-2063] [teamname nick]
[52-4622] [batting d word]
[53-1705] [rookieaward]
[93-1120] [leagueyear]
[118-3679] [playerposition abbr]
[118-3680] [playerposition abbr]
[118-3682] [playerposition abbr]
[118-3684] [playerposition abbr]
[118-3685] [playerposition abbr]
[118-3686] [playerposition abbr]
[118-3697] [playerposition abbr]
[118-3705] [playerposition abbr]
[118-3716] [playerposition abbr]
[118-3721] [playerposition abbr]
[124-2262] [batting t word]
[124-2274] [batting r word]
[124-2333] [batting h word]
[124-2563] [batting rbi]
[125-2354] [pitching ra word]
[125-5263] [pitching ra word]
[132-3415] [personname F L]
[132-3438] [personname L]
[132-3442] [personname L]
[134-3381] [personname L]
[134-3398] [personname F L]
[172-8397] [game outs+2 order]
[209-9895] [game outs+1]
[209-9901] [game outs+1]
[360-14084] [game base1]
[380-14837] [personname l]
[380-14838] [personname l]
[380-14839] [personname l]
[380-14840] [personname l]
[380-14841] [personname l]
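For anyone who wants to repeat the search, here is a minimal sketch of how bracketed tokens without the % sign could be found. It assumes that every valid token has the form [%name ...]; the function name and the sample sentence are made up for illustration, and it will also flag things like [fifth] or [n] that are broken in other ways.

```python
import re

# Assumption: a valid token always starts with "[%". Anything else in
# square brackets is reported as a candidate for a missing % sign.
BROKEN_TOKEN = re.compile(r"\[(?!\s*%)([^\[\]]+)\]")

def find_tokens_missing_percent(text):
    """Return the contents of bracketed spans that lack a leading % sign."""
    return [m.group(1) for m in BROKEN_TOKEN.finditer(text)]

# Hypothetical story text, not taken from the actual XML.
sample = "[personname L] singled off [%personname F L] for his [batting h word] hit."
print(find_tokens_missing_percent(sample))
# ['personname L', 'batting h word']
```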
---
Not sure if the next ones break anything, but I guess they're not supposed to have those spaces after the opening bracket and before the closing bracket, respectively:
[305-10429] [ %game fielder2]
[167-11115] [%game secondbase ]
---
An oddity:
[165-5837] "routine grounder...(nl)right side of the diamond...(nl)[%game firstbase] moves in quickly...(nl)makes the easy pickup...(nl)waves off the pitcher...(nl)and tags the base to retire [%game batter]...(nl)[%game outs+1 long] away in the [%game teamab] [fifth]."
I guess a token for the current inning is supposed to go where [fifth] is now?
---
Not a token problem, but these showed up in my search for square brackets, so here are various buggy (nl) tags:
[nl):
[245-14301]
[245-14316]
(nl]:
[106-2800]
[n]:
[60-4747]
(n]:
[260-12594]
(n):
[129-4354]
[166-6580]
[177-7986]
[313-10435]
[344-10708]
[344-10758]
[348-14488]
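A rough sketch of how the (nl) variants above could be caught in one pass. It assumes the only correct form is exactly (nl) and treats any mix of ( or [ with n or nl and ) or ] as suspicious; the function name and sample text are hypothetical, and a legitimate "(n)" in running text would be a false positive.

```python
import re

# Matches "(nl)" look-alikes: an opening ( or [, then "n" or "nl",
# then a closing ) or ]. The correct "(nl)" is filtered out afterwards.
NL_LIKE = re.compile(r"[\[(]nl?[\])]")

def find_bad_nl_tags(text):
    """Return every nl-like tag that is not exactly "(nl)"."""
    return [m.group(0) for m in NL_LIKE.finditer(text) if m.group(0) != "(nl)"]

# Hypothetical story text, not taken from the actual XML.
sample = "routine grounder...(nl)right side...[nl)moves in...(n)makes the pickup...(nl]"
print(find_bad_nl_tags(sample))
# ['[nl)', '(n)', '(nl]']
```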
---
wrong opening bracket:
[111-4778] ]%personlink]
[173-10113] ]%game outs+1 long word]
[173-10129] ]%game outs+1 long word]
---
doubled closing bracket:
[169-11180] [%game outs+1 long]]
---
And finally, missing spaces around tokens:
[39-215] Reliever[%personname F L]
[113-4501] [%teamlink nickonly]are
[164-5858] [%game catcher]reaches
[167-11055] [%game outs+1 long]down
[167-11077] [%game batter]is
[172-8409] [%game secondbase]fires
[173-10085] [%game leftfield]ranges
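The missing-space cases can be searched for in a similar way. This is only a sketch, assuming tokens look like [%name ...] and that a letter or digit glued directly to either side of a token is always a mistake; the function name and sample sentence are made up.

```python
import re

# A well-formed token: "[%" followed by anything except brackets, then "]".
TOKEN = r"\[%[^\[\]]+\]"

# A word glued to the front of a token, or glued to the back of one.
GLUED = re.compile(rf"\w+{TOKEN}|{TOKEN}\w+")

def find_tokens_missing_spaces(text):
    """Return tokens together with the word stuck to them."""
    return [m.group(0) for m in GLUED.finditer(text)]

# Hypothetical story text, not taken from the actual XML.
sample = "Reliever[%personname F L] enters as [%game catcher]reaches for [%game batter]."
print(find_tokens_missing_spaces(sample))
# ['Reliever[%personname F L]', '[%game catcher]reaches']
```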
Last edited by Zeyes; 06-25-2006 at 12:20 AM.