This repository has been archived by the owner on May 28, 2024. It is now read-only.

Output doesn't handle non-ascii gracefully #14

Open
leondz opened this issue Jul 29, 2011 · 3 comments

@leondz

leondz commented Jul 29, 2011

From TAC_2010_KBP_Source_Data/data/2010/wb/eng-WL-11-174596-12957493.sgm (http://pastebin.com/Wz2QKEAZ):

Traceback (most recent call last):
  File "/usr/local/bin/annotate_timex", line 154, in <module>
    print str(doc)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 662: ordinal not in range(128)

@cnorthwood
Owner

I think this may be down to the terminal's encoding being ASCII-only? Not entirely sure... A fix may be to (re)open stdout in UTF-8 mode where possible. I'll investigate a bit more when I have time.
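
A rough sketch of what reopening stdout in UTF-8 mode could look like, assuming the problem really is an ASCII default encoding on stdout. The helper names `utf8_writer` and `binary_stdout` are made up for illustration and are not part of annotate_timex; `codecs.getwriter` is the standard-library call doing the work. The demo writes to an in-memory buffer rather than the real terminal.

```python
import codecs
import io
import sys


def utf8_writer(byte_stream):
    """Wrap a byte stream so that unicode text written to it is encoded as UTF-8."""
    return codecs.getwriter("utf-8")(byte_stream)


def binary_stdout():
    """Return the byte-level stdout (hypothetical helper).

    Python 3 exposes it as sys.stdout.buffer; Python 2's sys.stdout
    already accepts bytes, so it is returned unchanged there.
    """
    return getattr(sys.stdout, "buffer", sys.stdout)


# The failure from the traceback: the ASCII codec cannot encode U+201C.
try:
    u"\u201c".encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# The same character written through a UTF-8 writer (in-memory demo).
buf = io.BytesIO()
out = utf8_writer(buf)
out.write(u"\u201cquoted\u201d\n")
encoded = buf.getvalue()
```

In the script itself this would amount to something like `sys.stdout = utf8_writer(binary_stdout())` before the first print, with the unicode text written directly rather than passed through `str()`. Setting the `PYTHONIOENCODING=utf-8` environment variable may be a workaround that needs no code change, though I have not checked it against this particular traceback.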

@leondz
Author

leondz commented Aug 1, 2011

Well, if you feel like it! I'm putting these up partly as a note to myself to fix them - it just seems like the best place to keep bug reports

@cnorthwood
Owner

largely just throwing my own thoughts out there too tbh :)
