Unicode bugs #15
Comments
I'm not entirely sure, because it does work with certain characters (for example été, π, λ). I'm not sure what the difference is with your string. All the indices should be in characters, provided Ruby knows the proper encoding of the string, since string manipulation functions have been character-based as of 1.9. In your case, the problematic characters seem to be ︵ヽノ︵.
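
To make the "provided Ruby knows the proper encoding" part concrete, here is a small sketch in plain Ruby. It does not touch coolline, and the variable names are only for illustration. It measures the same bytes once tagged as UTF-8 and once tagged as ASCII-8BIT:

```ruby
# Sketch only: the same bytes are counted differently depending on
# which encoding Ruby associates with the string.
utf8   = "\u00E9t\u00E9 \uFE35"                        # "été ︵", UTF-8
binary = utf8.dup.force_encoding(Encoding::ASCII_8BIT) # same bytes, binary tag

char_count   = utf8.length     # characters
byte_count   = utf8.bytesize   # bytes
binary_count = binary.length   # ASCII-8BIT treats every byte as one character

puts "chars=#{char_count} bytes=#{byte_count} binary_length=#{binary_count}"
```

If the line ever ends up tagged as ASCII-8BIT, every index computed from it is effectively a byte index, which would match the symptoms reported in this thread.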
Hmm! Okay, thanks for testing. I've narrowed the problem down -- for some reason, when I run a script as an executable using This is a bit weird -- I'm not really sure whose fault this is. :)
Wait.. OMG, this is so weird. Okay, so, it has nothing to do with the script being executed like a binary. When I first run readline, the encoding is UTF-8. When I paste " ︵ヽノ︵", the encoding becomes ASCII-8BIT. Something stinks here. :)

Here's my test script (Alt-E prints length/encoding):

#!/usr/bin/env ruby
require 'coolline'

cool = Coolline.new do |c|
  # Alt-E (ESC followed by "e"): print the line's length and encoding
  c.bind "\ee" do |c2|
    p [c2.line.size, c2.line.encoding]
  end
end

cool.readline
I suspect the problems happen at insertion time. For example, maybe the character doesn't get inserted in one go, and when we insert part of it, the string becomes invalid as UTF-8 and the encoding gets changed. Oddly enough, here, after pasting the same string, I get the right position and UTF-8 as the encoding, and editing works, but the cursor is definitely not rendered at the right position (it appears one line below, one character to the left).
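
A sketch of how that could happen, in plain Ruby. This is an assumption about the failure mode, not coolline's actual insertion code, and both method names are hypothetical. The first method appends each incoming byte as its own one-byte string; the second buffers bytes until they form valid UTF-8:

```ruby
# Sketch only; hypothetical helpers, not coolline's code.

# Naive insertion: append each incoming byte as a one-byte string.
def insert_bytewise(line, bytes)
  result = line.dup
  # Integer#chr returns an ASCII-8BIT string for bytes >= 0x80, and
  # appending that to an ASCII-only UTF-8 string retags the result.
  bytes.each { |byte| result << byte.chr }
  result
end

# Alternative: hold bytes back until they decode as complete UTF-8.
def insert_buffered(line, bytes)
  result  = line.dup
  pending = String.new # String.new without arguments is ASCII-8BIT
  bytes.each do |byte|
    pending << byte.chr
    candidate = pending.dup.force_encoding(Encoding::UTF_8)
    next unless candidate.valid_encoding?
    result << candidate
    pending.clear
  end
  result
end

line  = "abc".dup.force_encoding(Encoding::UTF_8)
bytes = "\uFE35".bytes # the three bytes of "︵"

p insert_bytewise(line, bytes).encoding
p insert_buffered(line, bytes).encoding
```

With the bytewise version the line comes back tagged ASCII-8BIT, which would be consistent with the Alt-E output described above. The buffered version keeps the line in UTF-8.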
Oh man, that's weird. Now I'm getting your behaviour. Everything stays UTF8, but I get new lines.
It only happens with double-wide characters, it seems. ノ is fine, ︵ prints a new line.
After poking around with ANSI cursor positioning and double-wide UTF8 characters, it appears that they actually take up 2 columns on the display. For example, if you print "ab︵c", then position the cursor on the screen using ANSI codes, the column of each character is as follows: a = 1
I'm still stumped as to why it's adding a linefeed.
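
If wide characters really do occupy 2 columns, the cursor column has to be computed from display widths rather than character counts. Here is a rough sketch in plain Ruby. The helper names are hypothetical, and the range list is a hand-picked approximation of the Unicode East Asian Wide/Fullwidth sets, not the complete table. Terminals may also disagree about individual characters.

```ruby
# Sketch only: approximate check for characters drawn two columns wide.
# The ranges below are an incomplete, hand-picked approximation.
def wide?(codepoint)
  case codepoint
  when 0x1100..0x115F, # Hangul Jamo
       0x2E80..0xA4CF, # CJK radicals through Yi (includes kana)
       0xAC00..0xD7A3, # Hangul syllables
       0xF900..0xFAFF, # CJK compatibility ideographs
       0xFE30..0xFE4F, # CJK compatibility forms (includes U+FE35 "︵")
       0xFF00..0xFF60, # fullwidth forms
       0xFFE0..0xFFE6  # fullwidth signs
    true
  else
    false
  end
end

# 1-based screen column at which each character of text starts.
def start_columns(text)
  column = 1
  text.each_char.map do |char|
    start = column
    column += wide?(char.ord) ? 2 : 1
    start
  end
end

p start_columns("ab\uFE35c") # "ab︵c"
```

Under that 2-column assumption, the character after ︵ starts two columns later instead of one, so a cursor position derived from the character index alone lands one column short for each wide character before it.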
I just tested out editing UTF-8 in coolline, and it doesn't seem to work properly. @pos appears to be counting bytes, not chars, which puts the cursor way off in space. Predictably, backspace removes a byte at a time. I assume everything will exhibit this behaviour. :)
Here's a good piece of Unicode for testing: ┻━┻ ︵ヽ(`Д´)ノ︵ ┻━┻
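
For comparison, here is a minimal sketch of a backspace that works on characters, next to what removing a single byte does. The `backspace` helper is hypothetical and is not coolline's code. It assumes the line is tagged as UTF-8 and that `pos` is a character index between 0 and the line length.

```ruby
# Sketch only; hypothetical helper, not coolline's code.
# Deletes the character before pos and returns [new_line, new_pos].
def backspace(line, pos)
  return [line, pos] if pos.zero?
  [line[0...pos - 1] + line[pos..-1], pos - 1]
end

text = "\u30CE\uFE35" # "ノ︵": 2 characters, 6 bytes

new_line, new_pos = backspace(text, 2)
p [new_line.length, new_pos, new_line.valid_encoding?]

# Removing just the last byte leaves a truncated, invalid sequence.
chopped = text.byteslice(0, text.bytesize - 1)
p [chopped.bytesize, chopped.valid_encoding?]
```

The character-based version removes all three bytes of ︵ in one step, while the byte-based slice leaves a string that is no longer valid UTF-8.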