supporting arabic data generation #86

Closed
walaa-akram opened this issue Mar 25, 2024 · 9 comments

@walaa-akram

Are you willing to support arabic generated data?

@emanlove
Member

@walaa-akram Yes, we definitely are. I'll note that I have no expertise and only a little knowledge of Arabic, so we would need others to share their knowledge and language skills. I am willing to work with you and others to help bring in Arabic, although as I type this I realize we may need to do this down at the Faker Python package level.

[I know Arabic is one of the right-to-left languages, although as I double-check myself I see others talking about nuances I was not aware of. I know one can use kashidas, which I would call "spaces" although I am not sure that is the right word (?), to justify or align the letters, giving almost a monospace or fixed-width look (which I find to be super cool). A couple of years ago I was going through some video tutorials, learning Arabic lettering at a very introductory level. And finally, I have an interest in tooling around Robot Framework and RTL languages that, although dormant at times, keeps coming back. So right-to-left languages are an interest of mine.]

@emanlove
Member

emanlove commented Mar 26, 2024

Sorry, I didn't have time yesterday, but I went in to check just now. There are some pieces of data that the underlying Faker library already provides for Arabic. I found these:

./faker/providers/automotive/ar_BH
./faker/providers/automotive/ar_JO
./faker/providers/automotive/ar_PS
./faker/providers/automotive/ar_SA
./faker/providers/color/ar_PS
./faker/providers/date_time/ar_AA
./faker/providers/date_time/ar_EG
./faker/providers/internet/ar_AA
./faker/providers/job/ar_AA
./faker/providers/lorem/ar_AA
./faker/providers/person/ar_AA
./faker/providers/person/ar_PS
./faker/providers/person/ar_SA
./faker/providers/phone_number/ar_AE
./faker/providers/phone_number/ar_JO
./faker/providers/phone_number/ar_PS
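
To see at a glance which locale covers which kind of data, the listing above can be regrouped by locale. This is only a quick sketch in plain Python: the helper name `providers_by_locale` is made up for illustration and is not part of Faker or FakerLibrary, and the paths are simply copied from the listing.

```python
from collections import defaultdict

# Paths copied from the listing above (relative to the Faker source tree).
PROVIDER_PATHS = [
    "./faker/providers/automotive/ar_BH",
    "./faker/providers/automotive/ar_JO",
    "./faker/providers/automotive/ar_PS",
    "./faker/providers/automotive/ar_SA",
    "./faker/providers/color/ar_PS",
    "./faker/providers/date_time/ar_AA",
    "./faker/providers/date_time/ar_EG",
    "./faker/providers/internet/ar_AA",
    "./faker/providers/job/ar_AA",
    "./faker/providers/lorem/ar_AA",
    "./faker/providers/person/ar_AA",
    "./faker/providers/person/ar_PS",
    "./faker/providers/person/ar_SA",
    "./faker/providers/phone_number/ar_AE",
    "./faker/providers/phone_number/ar_JO",
    "./faker/providers/phone_number/ar_PS",
]


def providers_by_locale(paths):
    """Map each locale code to the sorted provider names that ship data for it."""
    grouped = defaultdict(set)
    for path in paths:
        # Each path has the shape ./faker/providers/<provider>/<locale>
        provider, locale = path.rstrip("/").split("/")[-2:]
        grouped[locale].add(provider)
    return {locale: sorted(names) for locale, names in sorted(grouped.items())}


if __name__ == "__main__":
    for locale, names in providers_by_locale(PROVIDER_PATHS).items():
        print(f"{locale}: {', '.join(names)}")
```

Grouped this way, `ar_AA` is the locale with the broadest coverage in the listing (date_time, internet, job, lorem, person).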

A sample usage of this within Robot Framework would be:

*** Settings ***
Library    FakerLibrary    locale=ar_AA

*** Test Cases ***
FakerLibrary Name (Person) Generation
    ${name}=    FakerLibrary.Name
    Log To Console   \nname: ${name}

I don't know how good the Arabic data is. For example, I know some Germans note that the "Faker German names" aren't very German. Also, as shown above, there are not many Arabic data files. I suspect, but haven't tried this myself yet, that one can add their own files and data as a provider, expanding upon what comes with the packaged data.

@walaa-akram
Author

Dear Emanlove,

I tried it and it works. As you said, the quality of the names is not ideal, but it is OK for now.
I need to ask: if I want to generate data with different locales in the same test case, how do I do that?
Also, how do I specify that a generated user name shouldn't contain special characters?
Would you please add more documentation and details to your methods?

@emanlove
Member

emanlove commented Mar 30, 2024

The addition of data is done through the underlying Faker Python package. It has documentation which can be found here. In particular, as I recall, this addition of data is done through what they call "Providers". It's been a while since I have taken a deep look into this, so I may be a bit wrong there. The Faker documentation would be a good place to start. I am coming back from the Robot Framework conference, where we had conversations on what we might call FakerLibrary 2.0, so I hope to get a much better understanding over the next couple of months. Please feel free to come back and ask more questions after you get a chance to see what there is in the Faker documentation, or sooner if you want to discuss anything for clarification.

@emanlove
Member

@walaa-akram I realized as I re-read your question that I didn't really address what you asked. If you want to use two languages in the same script, you could currently do something like:

*** Settings ***
Library    FakerLibrary    locale=ar_AA    AS    Arabic
Library    FakerLibrary    locale=fr_FR    AS    French


*** Test Cases ***
FakerLibrary Name (Person) Generation
    ${a_name}=    Arabic.Name
    ${f_name}=    French.Name
    Log To Console   \nArabic name: ${a_name}
    Log To Console   \nFrench name: ${f_name}

where you import FakerLibrary with the locale you want and give it a different name, using <import library> AS <another_name>. This allows you to have multiple languages in one script. Making it easier to switch locales or have specific "contexts" is one of the issues we want to address in FakerLibrary 2.0.

As for a name not having any special characters, for now I would assume you would need to remove them from the provider dataset.
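
Another possible workaround, short of editing the dataset, is to filter the generated value afterwards. The following is only a sketch in plain Python: the function `strip_special_characters` and its `allow_digits` and `allow_uppercase` parameters are hypothetical and are not part of FakerLibrary or Faker. It keeps letters (including Arabic letters) and optionally digits, and drops everything else.

```python
def strip_special_characters(username, allow_digits=True, allow_uppercase=True):
    """Return username with everything except letters (and optionally digits) removed.

    Hypothetical helper for illustration; not a FakerLibrary or Faker API.
    """
    kept = [
        ch for ch in username
        # str.isalpha and str.isdigit are Unicode-aware, so Arabic letters survive
        if ch.isalpha() or (allow_digits and ch.isdigit())
    ]
    cleaned = "".join(kept)
    # Lowercasing has no effect on scripts without letter case, such as Arabic
    return cleaned if allow_uppercase else cleaned.lower()


if __name__ == "__main__":
    print(strip_special_characters("Sara.Ali_86"))
    print(strip_special_characters("Sara.Ali_86", allow_digits=False))
    print(strip_special_characters("Sara.Ali_86", allow_uppercase=False))
```

Saved as a Python module, a function like this can be imported into a Robot Framework suite as a library and called as a keyword on the value returned by FakerLibrary.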

@emanlove
Member

emanlove commented Mar 30, 2024

And as for adding providers, I am seeing there isn't a way to do this through FakerLibrary as currently implemented. You can change the underlying code if you have an immediate need. I don't think such additional functionality needs to wait for the next generation of FakerLibrary. (I realize saying "FakerLibrary 2.0" could be confusing, as the current version is 5.0, so maybe "next generation" is a better term to use here.) That might be a next-release type of change.

I take that back. As I am making some changes I see it is possible to add providers as is. Working out how to do that and will document!

@emanlove
Member

So I know this needs to be explained, but I did want to share a sample here. Please try not to get frustrated if this is not clear; this is more about preserving the sample, as one would in source control, than fully explaining what I am doing yet.

I created a basic provider in Python and was then able to bring it into the robot script through the providers argument on the library import. Because the providers have to be passed as a list, as currently implemented, I felt the only way to do this was to import the library using the Import Library keyword instead of the *** Settings *** section.

cake_provider.py

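# Cakes Provider
from faker.providers import BaseProvider


class Provider(BaseProvider):
    """Minimal custom Faker provider that exposes a single `cakes` method."""

    def cakes(self) -> str:
        # Always returns the same value; enough to prove the provider is loaded.
        return 'German Chocolate Cake'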

add-provider.robot

*** Test Cases ***
Try Out Adding A Provider Through FakerLibrary
    @{providers}=  Create List  cake_provider
    Import Library    FakerLibrary    providers=${providers}
    ${a_cake}=  FakerLibrary.Cakes
    Log To Console    \n${a_cake}

@walaa-akram
Author

Thanks for answering the question about supporting different locales; it was helpful.
My second question is:
The Robot Framework Faker documentation allows generating usernames, but without any arguments. For example, boolean arguments could let the user choose whether the generated username may contain special characters, numbers, and capital letters. That would be a great addition in FakerLibrary version 2.

@emanlove
Member

emanlove commented Apr 1, 2024

Thanks for pointing this out. I created an issue [#88] to specifically look into this.
