Option to ignore diacritics (accents) #7

Open
euu2021 opened this issue Aug 10, 2022 · 6 comments
Labels
enhancement New feature or request

Comments

@euu2021
Contributor

euu2021 commented Aug 10, 2022

Option to ignore diacritics (accents).

@euu2021
Contributor Author

euu2021 commented Aug 10, 2022

I noticed that you asked for feature requests to be posted at #1.

@euu2021 euu2021 closed this as completed Aug 10, 2022
@lilive
Owner

lilive commented Feb 7, 2023

> I noticed that you asked for feature requests to be posted at #1.

Thank you for your attention. But it now seems to me that opening issues for feature requests is a convenient practice. Let's do that 😉

> Option to ignore diacritics (accents).

Thanks for your suggestion. I did some research and it seems possible, but it would require some fairly significant changes to the Jumper code. Could you tell me more about your use case? I'm currently working a little on Jumper, and your use case may give me the motivation boost I need 😄

@lilive lilive reopened this Feb 7, 2023
@lilive lilive added the enhancement New feature or request label Feb 7, 2023
@euu2021
Contributor Author

euu2021 commented Feb 9, 2023

Interesting. I thought it would simply be a matter of activating a flag, as with case-sensitive search.

Well, "ignore diacritics" is a convenience for users of languages that have accents, because sometimes the content was typed without an accent that the word should actually have. This can happen for several reasons (all of them apply to me):

  • laziness (typing accents is a bit annoying)
  • being in a hurry (typing accents slows you down)
  • a keyboard with an English layout, which makes typing accents even more inconvenient and aggravates the two problems above (the user, now, regrets buying that keyboard)
  • plain ignorance of the correct orthography. There are no clear rules for when accents should be used. I tried to complain to the Portuguese language developers, but they don't seem to care.
  • avoiding compatibility problems with the English language (for example, commands in Java can't have accents). More than once, I've seen software break completely for the sole reason that my username has an accent.

So, when searching, the user has to perform two searches to make sure both spellings of the word are covered.

Thanks,
Long Life To The Jumper Add-on!!

@lilive
Owner

lilive commented Feb 9, 2023

Thank you for this answer. You've made your point clear, and I agree this feature could be useful.

> the user, now, regrets buying that keyboard

😆

> I thought it would simply be a matter of activating a flag, as with case-sensitive search.

I wish! But this isn't the case. Jumper uses regular expressions to compare the node text with the search terms, and they have no option to ignore accents. So Jumper would have to create an unaccented version of the text before running the regular expression search.
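To illustrate that idea, here is a minimal sketch in Python (not Jumper's actual code; `strip_accents` and `search` are hypothetical names). It assumes that decomposing the text with Unicode NFD and dropping the combining marks is an acceptable definition of "unaccented", which covers accented Latin letters such as "é" or "ç" but not letters like "ø" that have no decomposition.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining diacritical marks removed."""
    # NFD splits "é" into "e" followed by U+0301 (combining acute accent).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, then recompose whatever is left.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def search(pattern: str, node_text: str, ignore_accents: bool = False):
    """Case-insensitive regex search, optionally on unaccented copies."""
    if ignore_accents:
        # Strip both sides so an accented search term also matches.
        pattern = strip_accents(pattern)
        node_text = strip_accents(node_text)
    return re.search(pattern, node_text, flags=re.IGNORECASE)
```

With this sketch, `search("acao", "Ação")` finds nothing, while `search("acao", "Ação", ignore_accents=True)` returns a match.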

You wrote:

> I hope in the future I will be skilled enough to implement them

So I'll share with you, FYI, some details about Jumper and the changes needed to implement an "accent-insensitive search" (AIS for the rest of this post):

For now, because map nodes may contain HTML or markdown formatting, Jumper generates, when it starts, a "clone" of the map whose nodes hold plain text without formatting. Jumper then searches this cloned map.
If we want AIS, the clone map should also contain the node text without accents. Jumper could generate the unaccented text at the same time as the plain text, but that would slow it down for nothing if the user does not want AIS. Conversely, if the user only wants AIS, generating the plain text in the clone map is useless.
Jumper already has performance issues with large maps, which is why I'm careful about performance.
A solution would be to generate the text in the clone nodes when they are scanned. If the user asks for AIS, the unaccented text would be created; if not, the plain text would be created. Both texts should be retained for the next search, because each node is scanned multiple times in a single Jumper invocation as the user types or modifies the search options.
On reflection, this is not that many changes. There is more to do if I want Jumper to use the time between the moment the user invokes it and the moment the user starts typing to begin generating the plain (or unaccented) text. But it may not be so difficult.
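The lazy, cached generation described above could be sketched roughly as follows, in Python for illustration only. `CloneNode`, `to_plain_text`, and `strip_accents` are hypothetical names, not Jumper's API, and the tag-stripping regex is only a crude stand-in for real HTML or markdown handling.

```python
import html
import re
import unicodedata
from typing import Dict


def to_plain_text(raw: str) -> str:
    """Crude stand-in for real HTML/markdown stripping (assumption)."""
    return html.unescape(re.sub(r"<[^>]+>", "", raw))


def strip_accents(text: str) -> str:
    """Remove combining marks after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


class CloneNode:
    """Clone of a map node that builds its searchable text on demand."""

    def __init__(self, raw: str) -> None:
        self.raw = raw
        # Maps ignore_accents (True/False) to the text built for that mode.
        self._cache: Dict[bool, str] = {}

    def text(self, ignore_accents: bool) -> str:
        # Build only the variant the current search needs, and keep it
        # for later scans of the same node within one invocation.
        if ignore_accents not in self._cache:
            plain = to_plain_text(self.raw)
            self._cache[ignore_accents] = (
                strip_accents(plain) if ignore_accents else plain
            )
        return self._cache[ignore_accents]
```

In this sketch a node scanned only with AIS enabled never stores the accented plain text, and toggling the option later adds the second variant without discarding the first.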

BTW: if, at some point, you are curious about the Jumper code and want further explanations (there are already some comments in the code, but not enough, I guess), you're welcome to ask me.

@euu2021
Contributor Author

euu2021 commented Feb 12, 2023

Interesting. Thanks for the explanation. Yes, I was expecting that it would slow down the search a bit. In the Anki flashcard software, AIS slows the search down by a fraction of a second (but I keep it on anyway, because it gives me peace of mind). They even warn about it in the settings:
[image: screenshot of the warning in the Anki settings]

> BTW: if, at some point, you are curious about the Jumper code and want further explanations (there are already some comments in the code, but not enough, I guess), you're welcome to ask me.

Thanks. Right now, I'm learning how scripting works in the Freeplane code, so that I can write more powerful scripts than my usual 5-liners (I'm very much an amateur). After that, I will try to learn from the Jumper code. But I think it is going to take months until I acquire the necessary skills.

@euu2021
Contributor Author

euu2021 commented Feb 13, 2023

I was just using Jumper and noticed a situation that may actually be the most relevant one for "ignore accents": sometimes the user has in mind a group of similar words to be found through their common root (stem), but, for some reason, some of them have an accent in the root.

This was my situation: I wanted to search for words like "compete", "competência", and "competente". Ideally, I would search for "compete", which matches all three if "ignore accents" is activated.
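As a small worked example of that case, the following Python sketch (hypothetical helper names, plain substring matching instead of Jumper's regular expressions) compares the two behaviours on those three words:

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matching_words(term, words, ignore_accents):
    """Return the words that contain term, optionally ignoring accents."""
    def prepare(text):
        if ignore_accents:
            return strip_accents(text).casefold()
        # Normalize to NFC so "ê" is always compared as a single character.
        return unicodedata.normalize("NFC", text).casefold()

    needle = prepare(term)
    return [word for word in words if needle in prepare(word)]


words = ["compete", "competência", "competente"]
print(matching_words("compete", words, ignore_accents=False))
# ['compete', 'competente']
print(matching_words("compete", words, ignore_accents=True))
# ['compete', 'competência', 'competente']
```

Without the option, "competência" is missed because its root contains "ê" rather than "e".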
