The metadata output appears to be ASCII, with all non-ASCII Unicode characters encoded as numeric character references. That is legal under the default UTF-8 encoding, since ASCII is a subset of it, but it is not what I want or expect, and I think the same is true for most people.
(This may be the same issue reported as #32. It was also reported to me by Columbia.edu, and I was able to reproduce it on both my OAI feed and theirs.)
I initially tried adding encoding="UTF-8" to the etree.tostring call in metadata.py. That worked under python3.x but failed under python2.x.
Adding encoding="unicode" instead appears to be the correct fix: it seems to work under both python2.x and python3.x.
Under python2.x, encoding="UTF-8" returns a <type 'str'> holding UTF-8-encoded bytes, which may then raise an error when it is coerced to <type 'unicode'>. encoding="unicode" returns <type 'unicode'> directly.
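A minimal sketch of the difference between the three calls, for anyone who wants to reproduce it. Assumptions: it uses the standard library's xml.etree.ElementTree on python3.x as a stand-in for the etree used in metadata.py, and the helper name serialize_variants and the sample title element are made up for illustration. It does not reproduce the python2.x failure itself.

```python
import xml.etree.ElementTree as etree  # stand-in for the etree used in metadata.py (assumption)


def serialize_variants(element):
    """Serialize one element three ways so the results can be compared.

    Hypothetical helper for illustration; not part of oai-harvest.
    """
    return {
        # No encoding argument: ASCII bytes, non-ASCII becomes character references.
        "default": etree.tostring(element),
        # Explicit UTF-8: still bytes, but the characters are kept as UTF-8.
        "utf8": etree.tostring(element, encoding="UTF-8"),
        # "unicode": a text string rather than bytes.
        "unicode": etree.tostring(element, encoding="unicode"),
    }


# Sample record with one non-ASCII character (e with acute accent).
record = etree.Element("title")
record.text = u"Caf\u00e9"

variants = serialize_variants(record)
for name in ("default", "utf8", "unicode"):
    print(name, type(variants[name]).__name__, repr(variants[name]))
```

The "default" entry is the behaviour described above (character references in ASCII output), while the "unicode" entry is the text-string result the proposed fix relies on.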
See: https://github.com/sdm7g/oai-harvest/blob/fix-pyoai/oaiharvest/metadata.py#L51-L53