Username suggestion for non-English names #35984

Open
CodeWithEmad opened this issue Dec 7, 2024 · 4 comments · May be fixed by #36056

Comments

@CodeWithEmad
Member

When a user tries to register, the Authentication MFE sends a request to LMS_HOST/api/user/v1/validation/registration to get 3 username suggestions. Usernames can only contain letters (A-Z, a-z), but the suggested names do not follow this rule for non-English names.

The generate_username_suggestions function in edx-platform/openedx/core/djangoapps/user_authn/utils.py is responsible for generating these names.

def generate_username_suggestions(name):
    """ Generate 3 available username suggestions """
    username_suggestions = []
    max_length = USERNAME_MAX_LENGTH
    names = name.split(' ')
    if names:
        first_name = remove_special_characters_from_name(names[0].lower())
        last_name = remove_special_characters_from_name(names[-1].lower())

        if first_name != last_name and first_name:
            # username combination of first and last name
            suggestion = f'{first_name}{last_name}'[:max_length]
            if not username_exists_or_retired(suggestion):
                username_suggestions.append(suggestion)

            # username is combination of first letter of first name and last name
            suggestion = f'{first_name[0]}-{last_name}'[:max_length]
            if not username_exists_or_retired(suggestion):
                username_suggestions.append(suggestion)

        if len(first_name) >= 2:
            short_username = first_name[:max_length - 6] if max_length is not None else first_name
            short_username = short_username.replace('_', '').replace('-', '')

            int_ranges = [
                {'min': 0, 'max': 9},
                {'min': 10, 'max': 99},
                {'min': 100, 'max': 999},
                {'min': 1000, 'max': 9999},
                {'min': 10000, 'max': 99999},
            ]
            for int_range in int_ranges:
                for _ in range(10):
                    random_int = random.randint(int_range['min'], int_range['max'])
                    suggestion = f'{short_username}_{random_int}'
                    if not username_exists_or_retired(suggestion):
                        username_suggestions.append(suggestion)
                        break

                if len(username_suggestions) == 3:
                    break

    return username_suggestions

@CodeWithEmad
Member Author

IDK what's the best solution here.

  • We can use the unidecode package:
from unidecode import unidecode

name = 'عماد'  # Persian word for "Emad"
transliterated_name = unidecode(name)
print(transliterated_name)  # Output: 'mad

There are other packages like slugify but I guess they're using something similar under the hood.

  • We can check on the frontend and send the request only if the name consists of A-Z, a-z, 0-9, - and _
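
A rough sketch of that check, written in Python only to match the snippets above (the real check would live in the Authentication MFE). The pattern and the `should_request_suggestions` name are assumptions for illustration, not existing platform code:

```python
import re

# Assumed allowed character set: ASCII letters, digits, hyphen, underscore.
ALLOWED_NAME_PART = re.compile(r'[A-Za-z0-9_-]+')


def should_request_suggestions(name):
    """Return True only if every whitespace-separated part of the name is ASCII-safe."""
    parts = name.split()
    return bool(parts) and all(ALLOWED_NAME_PART.fullmatch(part) for part in parts)


print(should_request_suggestions('Emad Rad'))     # True
print(should_request_suggestions('عماد راد'))      # False
print(should_request_suggestions('Régis Behmo'))  # False
```

With this approach, names in other scripts (and names with accented letters) simply get no suggestions.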

Do you have any ideas, @kdmccormick? Maybe this was fixed somewhere else in the platform before, but this piece was left out.

@kdmccormick
Member

Hmm, interesting, thanks for the report @CodeWithEmad .

As a user of the platform in a non-ASCII language, what would be your preferred behavior? If you enter your name in Arabic script, would you like the platform to try to suggest a phonetically similar ASCII username, or would you rather it simply suggest no usernames?

@regisb
Contributor

regisb commented Dec 10, 2024

My 2 cents: when I enter "Régis Behmo" the suggested username is "régisb" and it would make sense to convert that into "regisb". ("é" is not a supported character)
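
For accented Latin letters like this, the conversion could be sketched with the standard library alone. `strip_accents` is a hypothetical helper name, not something that exists in edx-platform:

```python
import unicodedata


def strip_accents(text):
    """Decompose characters (NFKD), then drop the combining marks: 'é' becomes 'e'."""
    decomposed = unicodedata.normalize('NFKD', text)
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents('régisb'))  # regisb
```

This only helps when a letter decomposes into an ASCII base plus marks. Arabic or Persian letters pass through unchanged, so on its own it would not cover the case in the original report.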

The year is 2024, and I think that the "right" fix for this would be to support unicode characters in user names.

@CodeWithEmad
Member Author

@kdmccormick I usually enter my name in English everywhere so I never faced this issue, but I tested a couple of services like x.com and they're using a similar solution to the unidecode approach I mentioned above.
I entered "عماد راد", which is "Emad Rad" in Persian, and the suggested username was "mdrd1264007" (not my actual account; I'll delete it later).

@regisb I like the idea of supporting Unicode characters in usernames, but I'm afraid it would introduce complexity later: some package or service under the hood, small or big, might not support Unicode, and something would break.

Suggesting a phonetically similar ASCII username is a reasonable workaround for this issue, but I am open to any solution.
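
A minimal sketch of how that workaround could sit in front of the suggestion logic, with "suggest nothing" as the fallback. Everything here is an assumption for illustration: `ascii_name_parts` and `nfkd_to_ascii` are hypothetical names, the allowed character set is a guess, and the default converter is a stdlib stand-in rather than a real transliterator:

```python
import re
import unicodedata

# Assumed allowed characters for a (lowercased) username part.
DISALLOWED = re.compile(r'[^a-z0-9_-]')


def nfkd_to_ascii(text):
    """Stdlib stand-in for a transliterator: keep whatever survives NFKD + ASCII encoding."""
    return unicodedata.normalize('NFKD', text).encode('ascii', 'ignore').decode('ascii')


def ascii_name_parts(name, transliterate=nfkd_to_ascii):
    """Return lowercase ASCII-safe name parts, or [] if any part cannot be converted."""
    parts = []
    for raw in name.split():
        cleaned = DISALLOWED.sub('', transliterate(raw).lower())
        if not cleaned:
            return []  # the caller would then suggest no usernames
        parts.append(cleaned)
    return parts


print(ascii_name_parts('Régis Behmo'))  # ['regis', 'behmo']
print(ascii_name_parts('عماد راد'))      # []
```

Passing a real transliteration function as `transliterate` (for example `unidecode` from the package mentioned earlier, if it is installed) is where the phonetic approximation would come from; the stdlib default only handles accented Latin letters.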
