ß should uppercase to ẞ #550

Open
Jules-Bertholet opened this issue Dec 8, 2024 · 7 comments · May be fixed by #551

@Jules-Bertholet

The German letter ß (XKB_KEY_ssharp) never appears at the start of a word; therefore, it is only capitalized in all-caps contexts. Traditionally, there was no widely accepted capital form of the letter, so SS was used instead. However, such a form was recently introduced (ẞ, XKB_KEY_Ssharp), and, as of this year, it is the primary recommended form (with SS still permitted as an alternative). Unicode’s stability guarantees mean that it cannot adopt the new mapping in its default casing rules. xkbcommon is not bound by those guarantees, though, and should designate ß and ẞ as a case pair.

(Related discussion thread on Unicode’s mailing list: https://corp.unicode.org/pipermail/unicode/2024-November/011162.html)

@wismill
Member

wismill commented Dec 8, 2024

> However, such a form was recently introduced (ẞ, XKB_KEY_Ssharp)

There is no such keysym yet; one needs to use the corresponding Unicode keysym.
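To make "the corresponding Unicode keysym" concrete, here is a minimal sketch. It assumes the X11 convention that a Unicode keysym is 0x01000000 plus the code point; `unicode_keysym` is a hypothetical helper for illustration, not part of xkbcommon.

```python
# Assumption: Unicode keysyms follow the X11 convention 0x01000000 + code point.
UNICODE_KEYSYM_OFFSET = 0x01000000


def unicode_keysym(char: str) -> int:
    """Return the Unicode keysym for one character (hypothetical helper)."""
    code_point = ord(char)
    if code_point < 0x100:
        # Printable Latin-1 characters have legacy keysyms equal to their code point.
        raise ValueError("Latin-1 characters use their legacy keysym instead")
    return UNICODE_KEYSYM_OFFSET + code_point


CAPITAL_SHARP_S = "\u1e9e"  # LATIN CAPITAL LETTER SHARP S
print(hex(unicode_keysym(CAPITAL_SHARP_S)))  # 0x1001e9e
```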

Could you add the relevant direct links? The discussion thread seems quite long.

If there is no consensus, I would stick to the official Unicode position. But if Unicode is leaning toward this mapping and is only holding back for now because of its policies, we can update the mapping.

Note that we do not ship locale-specific mappings, so this should be a case mapping accepted by all German-speaking countries.

@wismill added the keysyms label Dec 8, 2024
@Jules-Bertholet
Author

Jules-Bertholet commented Dec 8, 2024

AFAICT there is no consensus on exactly how to address the issue, though the most likely resolution is a change to CLDR (tracked at https://unicode-org.atlassian.net/browse/CLDR-17624).

However Unicode ends up resolving the issue, xkbcommon should make the change. ẞ and SS are both allowable uppercase forms of ß; there is no language or dialect in which either form is considered incorrect. But the current behavior of xkbcommon is to uppercase ß to itself, and that is not correct in any language nor any part of Unicode. So no matter what, making ß uppercase to ẞ is a strict improvement.

@wismill
Member

wismill commented Dec 8, 2024

Thanks for the link to the issue.

> But the current behavior of xkbcommon is to uppercase ß to itself, and that is not correct in any language nor any part of Unicode.

From our doc:

> Convert a keysym to its uppercase form.
>
> If there is no such form, the keysym is returned unchanged.

The case mappings come from Unicode 16.0, which currently defines no simple (i.e. one-to-one character) uppercase mapping for ß. So we return it unchanged as well.
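The difference between a full and a one-to-one mapping can be seen with Python's built-in casing, which applies Unicode's full mappings. This is only a sketch: `one_to_one_upper` is a hypothetical helper that rejects multi-character results, a rough stand-in for a one-to-one mapping rather than Unicode's actual simple case mapping table (the two differ for a few characters).

```python
SHARP_S = "\u00df"          # LATIN SMALL LETTER SHARP S
CAPITAL_SHARP_S = "\u1e9e"  # LATIN CAPITAL LETTER SHARP S


def one_to_one_upper(char: str) -> str:
    """Uppercase a single character only if the result is a single character.

    Python's str.upper() applies Unicode's full mappings, so results that
    expand to several characters are rejected and the input is returned.
    """
    upper = char.upper()
    return upper if len(upper) == 1 else char


print(SHARP_S.upper())                       # SS (full mapping)
print(one_to_one_upper(SHARP_S) == SHARP_S)  # True (returned unchanged)
print(CAPITAL_SHARP_S.lower() == SHARP_S)    # True (lowercasing is one-to-one)
```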

Let's see what Unicode concludes in the linked issue. There is a lot to process.

@Jules-Bertholet
Author

To clarify: there are several case mappings defined by Unicode.

  • The base case mappings that are part of the standard itself. This is what libxkbcommon currently uses. These mappings are subject to strict stability guarantees, to support e.g. case-insensitive filesystems. They will never adopt ß → ẞ, because doing so would break those promises. But, of course, libxkbcommon doesn’t need or want that level of stability. It’s expected that text-processing applications like libxkbcommon will want to tailor these mappings (from mailing list discussion: https://corp.unicode.org/pipermail/unicode/2024-November/011175.html).
  • A set of language-specific mappings in CLDR. These mappings are not part of the Unicode standard, are versioned separately, and don’t have the same strict stability guarantees. Currently, these are only defined for a few languages, notably Turkish and Azeri (dotted İ/dotless ı), because the Unicode standard mappings work well for most languages. Or rather, they worked well until ẞ was introduced. The discussion in the linked issue is about how to adapt CLDR to account for this new development.
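The two layers described above can be sketched as a tailoring table consulted before the default mapping. The tables and `upper_with_tailoring` below are hypothetical illustrations, not CLDR data or any library's API; Python's `str.upper()` stands in for the base Unicode mappings.

```python
from typing import Mapping

# Hypothetical tailoring tables layered over Python's built-in Unicode casing.
GERMAN_UPPER = {"\u00df": "\u1e9e"}  # sharp s -> capital sharp s, not the default SS
TURKISH_UPPER = {"i": "\u0130"}      # i -> dotted capital I


def upper_with_tailoring(text: str, tailoring: Mapping[str, str]) -> str:
    """Uppercase text, preferring the tailoring table over the default mapping."""
    return "".join(tailoring.get(ch, ch.upper()) for ch in text)


strasse = "stra\u00dfe"
print(strasse.upper())  # STRASSE (default full mapping)
print(upper_with_tailoring(strasse, GERMAN_UPPER) == "STRA\u1e9eE")       # True
print(upper_with_tailoring("istanbul", TURKISH_UPPER) == "\u0130STANBUL")  # True
```

The default mapping stays untouched, so the stability-sensitive use cases mentioned above are unaffected by whatever the tailoring table contains.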

In terms of what behavior is useful for users: xkb_keysym_to_upper is primarily intended for implementing caps-lock. When a user enables caps-lock, it’s probably because they want to write in all caps (duh). Therefore, they very likely do not want a lowercase ß, but instead either ẞ or SS. So returning ẞ as the uppercase form is far more likely to be useful than the current behavior, especially as a user who really wants SS rather than ẞ will probably type it with the S key.
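The caps-lock behavior argued for here can be sketched at the keysym level. This is an illustration under stated assumptions, not the xkbcommon implementation: `caps_lock_upper` and its helpers are hypothetical, only Latin-1 and Unicode keysyms are decoded, and the many legacy keysyms for other scripts are ignored.

```python
UNICODE_KEYSYM_OFFSET = 0x01000000     # X11 convention for Unicode keysyms
KEYSYM_SSHARP = 0x00DF                 # value of XKB_KEY_ssharp
TAILORED_UPPER = {"\u00df": "\u1e9e"}  # proposed pair: sharp s -> capital sharp s


def keysym_to_char(keysym: int):
    """Decode Latin-1 and Unicode keysyms; return None for anything else."""
    if 0x20 <= keysym <= 0x7E or 0xA0 <= keysym <= 0xFF:
        return chr(keysym)
    low = UNICODE_KEYSYM_OFFSET + 0x100
    high = UNICODE_KEYSYM_OFFSET + 0x10FFFF
    if low <= keysym <= high:
        return chr(keysym - UNICODE_KEYSYM_OFFSET)
    return None


def char_to_keysym(char: str) -> int:
    """Encode a character as a Latin-1 keysym or a Unicode keysym."""
    code_point = ord(char)
    return code_point if code_point < 0x100 else UNICODE_KEYSYM_OFFSET + code_point


def caps_lock_upper(keysym: int) -> int:
    """Return the uppercase keysym, or the keysym unchanged if there is none."""
    char = keysym_to_char(keysym)
    if char is None:
        return keysym
    upper = TAILORED_UPPER.get(char, char.upper())
    # Keep the mapping one-to-one: a keysym cannot expand to two characters.
    return char_to_keysym(upper) if len(upper) == 1 else keysym


print(hex(caps_lock_upper(KEYSYM_SSHARP)))  # 0x1001e9e
print(hex(caps_lock_upper(0x61)))           # 0x41
```

Without the `TAILORED_UPPER` entry, the sketch returns 0xdf unchanged for ß, which matches the current behavior described in this thread.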

@wismill
Member

wismill commented Dec 9, 2024

> It’s expected that text-processing applications like libxkbcommon

libxkbcommon is not a text-processing library. But I understand your point.

> When a user enables caps-lock, it’s probably because they want to write in all caps (duh).

I am not going to tolerate a disdainful or patronizing tone. And I have not forgotten your trolling in xkeyboard-config. Please behave.

@wismill linked a pull request Dec 9, 2024 that will close this issue
@wismill
Member

wismill commented Dec 9, 2024

So I read the CLDR issue to better understand the context. The ß ↔ ẞ case mapping has been endorsed by the Council for German Orthography (Rat für deutsche Rechtschreibung) since 2017. As I understand it, this case mapping is valid for all German dialects; since only those use ß, it seems that we can introduce the new case mapping without making it locale-dependent.

@wismill
Member

wismill commented Dec 9, 2024

FYI: libX11 seems broken too. I filed a fix for libX11.

2 participants