-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SString::AppendUTF8 converts 'this' to UTF16 if it is non-ascii #76581
Comments
Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis Issue DetailsI was trying to use some of the UTF8 string handling over in #76505 and noticed that
|
This is by-design. The preferred encoding for Can you provide an example on precisely what you want and how the |
The preferred UTF-16 encoding for SString should come into play only when there is encoding mismatch. For example, when one is trying to append UTF-16 string to UTF-8 string. When I have UTF-8 string and trying to append another UTF-8 string to it, the result should stay as UTF-8. There is no reason for the result to get converted to UTF-16 (what is happening here). |
I was trying to switch These are the versions I tried: Version 1 just used For version 2 I tried to avoid For version 3 I avoided Calling these three versions of It is quite unexpected to me that the first two variants are so much more expensive. The contracts add some overhead, but it also looks like the |
I agree with this. If I recall the issue is that
On this we completely agree. I very much dislike the |
SString starts as |
Well, this is what I was reporting -- it doesn't seem to. It calls SString str(SString::Utf8, "Planlægnings");
str.AppendUTF8("periode"); // 'str' is converted to UTF16 inside this call, before starting to append "periode" |
@jakobbotsch I see the issue now. This is clearly unintuitive and I agree with your expectations here. Based on how this was implemented I get the why, but it shouldn't be this way. I'll add it to my list of things to fix up. |
@AaronRobinsonMSFT can this be closed based on #76571, or is there more work required here? |
I need to get back to this. I don't believe it is fixed yet. |
I was trying to use some of the UTF8 string handling over in #76505 and noticed that
AppendUTF8
in turn calls theAppend(SString&)
overload. This overload callsEnd()
that converts non-ascii UTF8 back to Unicode. This behavior seems quite odd, I would expect appending an UTF8 string to an UTF8 string to stay in UTF8.The text was updated successfully, but these errors were encountered: