Address unicode issues in database #50

Open
dr-rodriguez opened this issue Aug 12, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@dr-rodriguez
Collaborator

When working with the database, some values contain explicit escape sequences for Unicode characters (e.g., \u2212) instead of the characters they should be (-). We need to investigate where this is happening. It is possible the input data is wrong (and thus a matter for the SIMPLE scripts), but we can probably force an encoding on the output JSON so that it doesn't happen in the file representation. The problem with a mix of Unicode representations is that searching for exact values becomes much harder.
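To help narrow down where the problem lives, here is a minimal sketch using only the standard library. The record and its value are made up for illustration. It separates two things that are easy to conflate: the \u2212 escape is just how the file stores the character, while U+2212 (MINUS SIGN) itself is a different character from the ASCII hyphen-minus, which would be an input-data matter.

```python
import json

# Hypothetical file content: the value starts with U+2212 written as a JSON escape.
raw_json = '{"value": "\\u22121.5"}'

record = json.loads(raw_json)
value = record["value"]

# After loading, the escape has become a single character (U+2212 MINUS SIGN).
is_minus_sign = value[0] == "\u2212"

# U+2212 is not the ASCII hyphen-minus (U+002D), so an exact match
# against "-1.5" fails even on the loaded value.
matches_ascii = value == "-1.5"

# A plain-text search of the file for the literal character also fails,
# because the file holds the six-character escape sequence instead.
found_in_file_text = "\u2212" in raw_json

print(is_minus_sign, matches_ascii, found_in_file_text)
```

If the loaded value is U+2212 where a plain hyphen was expected, changing the output encoding alone will not make exact searches for "-" succeed; the input would need to be normalized as well.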

dr-rodriguez added the bug label Aug 12, 2022
@kelle
Collaborator

kelle commented Mar 12, 2024

Relevant SO discussion: https://stackoverflow.com/questions/18337407/saving-utf-8-texts-with-json-dumps-as-utf-8-not-as-a-u-escape-sequence

Basically, JSON is not meant to be human readable, and json.dumps escapes non-ASCII characters by default. So this behavior is expected!

If we want the JSON files to render the unicode, I think we can try this:

json.dumps(<text>, ensure_ascii=False)
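A small sketch of the difference, with a made-up record (the key and value are only for illustration):

```python
import json

# Hypothetical record whose value starts with U+2212 (MINUS SIGN).
data = {"value": "\u22121.5"}

# Default behavior (ensure_ascii=True): non-ASCII characters are escaped.
escaped = json.dumps(data)

# With ensure_ascii=False the character is written as-is.
readable = json.dumps(data, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode to the same Python object.
same_after_load = json.loads(escaped) == json.loads(readable)
```

Both forms are valid JSON and load to identical data, so the choice only affects how the file looks on disk and how easy it is to search as text.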

@kelle
Collaborator

kelle commented Mar 12, 2024

Maybe we just add ensure_ascii=False to the call here. Not sure what will happen when it's read back in!
https://github.com/astrodbtoolkit/AstrodbKit2/blob/d3fd3a338948e57696bfd6358e3c0830cce38e78/astrodbkit2/astrodb.py#L703-L704
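On the read-back question, a quick round-trip sketch suggests it should be fine, since the JSON parser accepts both escaped and raw characters. The helper names save_json and load_json and the sample record are hypothetical, not AstrodbKit2 functions, and the indent value is an assumption. One thing to watch: with ensure_ascii=False the file should be opened with an explicit UTF-8 encoding, otherwise the platform default may be used.

```python
import json
import tempfile
from pathlib import Path


def save_json(data, path):
    """Write data as JSON, keeping non-ASCII characters unescaped."""
    # Explicit UTF-8 so the raw characters are encoded the same on every platform.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=4, ensure_ascii=False)


def load_json(path):
    """Read JSON back; json.load handles escaped and raw characters alike."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Hypothetical record containing U+2212 (MINUS SIGN).
original = {"name": "J0000\u22120000"}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "record.json"
    save_json(original, path)
    file_text = path.read_text(encoding="utf-8")
    restored = load_json(path)

print(restored == original)
```

Files written earlier with the escaped form would still load to the same values, so old and new files can coexist until they are regenerated.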

