Unicode Issue #77

Closed
seanzamora opened this issue Oct 30, 2019 · 8 comments
Labels
duplicate An issue already exists covering the same or substantially overlapping problem

Comments

@seanzamora

It appears luaparse has trouble parsing non-ASCII characters such as í. I implemented a temporary fix, but I thought I should let you know. If I have time to prepare a permanent solution, I'll open a pull request.

Error:

SyntaxError: [1:240] unexpected symbol 'í' near 'Palad'

@fstirlitz
Owner

fstirlitz commented Oct 31, 2019

What is the input?

Do you happen to be using the command line tool?

@seanzamora
Author

Input: ["Class"] = Paladín

No, library in NodeJS application.

@fstirlitz
Owner

fstirlitz commented Oct 31, 2019

Would have appreciated a fuller sample, not one line stripped of any context.

But anyway, this is expected; identifiers in PUC-Rio Lua can only consist of ASCII characters. Latest git master adds the extendedIdentifiers option that also allows Unicode characters outside the Basic Latin block, to cover LuaJIT. Exact semantics are yet to be decided upon and stabilised.
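To make the distinction above concrete, here is a minimal sketch. It is not luaparse's implementation: the `isIdentifier` helper and both regular expressions are hypothetical, and the extended pattern assumes a LuaJIT-like rule of accepting anything outside the ASCII range, which may differ from whatever semantics `extendedIdentifiers` finally settles on.

```javascript
// ASCII-only rule: a letter or underscore, then letters, digits or underscores.
const asciiIdentifier = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Assumed "extended" rule: additionally accept any UTF-16 code unit at or
// above U+0080 (astral characters are covered as surrogate pairs).
const extendedIdentifier = /^[A-Za-z_\u0080-\uffff][A-Za-z0-9_\u0080-\uffff]*$/;

// Hypothetical helper: checks a name against one of the two rules.
function isIdentifier(name, { extended = false } = {}) {
  const pattern = extended ? extendedIdentifier : asciiIdentifier;
  return pattern.test(name);
}

console.log(isIdentifier("Class"));                       // true
console.log(isIdentifier("Paladín"));                     // false
console.log(isIdentifier("Paladín", { extended: true })); // true
```

Under the ASCII-only rule the scan of `Paladín` stops at `í`, which is consistent with the reported error "unexpected symbol 'í' near 'Palad'".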

Closing this as a duplicate of #53. Please follow that issue for more information.

fstirlitz added the duplicate label Oct 31, 2019
@seanzamora
Author

Hey, thanks for the info. The sample I provided was complete. :D

@fstirlitz
Owner

In that case it still shouldn't parse, because it's not a valid Lua statement. I assumed it was part of a table constructor expression; there it would at least have made some sense.

@seanzamora
Author

Oh yeah, I apologize for the confusion. My script automatically wraps the line in the appropriate table structure:

DB = {
    ["Class"] = "Paladín",
}

Thanks for the solution above; I appreciate it.

@seanzamora
Author

Hey fstirlitz,

Thanks for suggesting turning on extendedIdentifiers; it did help resolve my issue. I did notice a possible problem, though I'm not sure whether it really is one.

The fixupHighCharacters() method seems to be obscuring Cyrillic (Russian) characters. I read the section "Note on character encodings"; it was very informative and related to the problem I was running into. I ended up overriding the method so that it simply returns the original string.

Could you elaborate on this? I would appreciate it.

Thanks in advance.

@fstirlitz
Owner

I'm not sure what you mean, but follow #68 for future directions. Semantics are liable to change before the release.

fstirlitz closed this as not planned May 20, 2022