Evaluation for StereoSet (Bias) #34

Closed
roskoN opened this issue Aug 30, 2022 · 2 comments

roskoN commented Aug 30, 2022

Hey @malteos, hey @sasaadi,

At Alexander Thamm, we have worked on an evaluation procedure for bias in language models.

It is based on StereoSet. The benchmark is originally English-only, so we translated the dataset into German using automatic machine translation (Amazon Translate). A comparative study with multilingual models on both the English and German versions of StereoSet showed no substantial differences between the two, so the German version can also be used to evaluate German LMs.
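
As a rough illustration of the core StereoSet comparison (not our harness code; the model name and sentences below are placeholders, and the full benchmark additionally uses an unrelated option to compute a language-modeling score):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any (multilingual) causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Total log-probability of a sentence under the LM."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    # out.loss is the mean negative log-likelihood over predicted tokens
    n_predicted = enc["input_ids"].shape[1] - 1
    return -out.loss.item() * n_predicted

# Placeholder pair; in StereoSet the two options share a context and differ
# only in the stereotypical vs. anti-stereotypical continuation.
stereotype_option = "Placeholder stereotypical sentence."
anti_stereotype_option = "Placeholder anti-stereotypical sentence."

# Over the whole dataset, the stereotype score (SS) is the fraction of pairs
# where the stereotypical option is the more likely one; ~50% means unbiased.
prefers_stereotype = sentence_logprob(stereotype_option) > sentence_logprob(anti_stereotype_option)
print(prefers_stereotype)
```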

Would it make sense to you for us to integrate it into the LM evaluation harness? I'd be happy to open a PR, but I wanted to align with you before doing that.

What do you think? Do you have any questions or comments?

I also discussed this with @mali-git; he is on board with the idea.

Thanks,
Rosko

@roskoN roskoN added the task label Aug 30, 2022
@roskoN roskoN self-assigned this Aug 30, 2022

malteos commented Aug 30, 2022

Hey @roskoN, great initiative! A bias task would be a valuable addition to the framework. Let me know if you need any help. I've written this little guide on how to add new tasks: #2
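
For reference, a hypothetical stub of what such a task module could look like, assuming the EleutherAI-style Task API this harness builds on (class, field, and dataset names below are placeholders; the actual steps are in the guide in #2):

```python
from lm_eval.base import Task, rf
from lm_eval.metrics import mean

class StereoSetGerman(Task):
    VERSION = 0
    DATASET_PATH = "stereoset"  # placeholder dataset identifier
    DATASET_NAME = None

    def has_training_docs(self):
        return False

    def has_validation_docs(self):
        return True

    def has_test_docs(self):
        return False

    def validation_docs(self):
        return self.dataset["validation"]

    def doc_to_text(self, doc):
        return doc["context"]

    def doc_to_target(self, doc):
        return " " + doc["stereotype"]

    def construct_requests(self, doc, ctx):
        # Ask the LM for the log-likelihood of each continuation
        return [
            rf.loglikelihood(ctx, " " + doc["stereotype"])[0],
            rf.loglikelihood(ctx, " " + doc["anti_stereotype"])[0],
        ]

    def process_results(self, doc, results):
        ll_stereo, ll_anti = results
        # 1.0 if the stereotypical option is preferred; averaged over the
        # dataset this approximates the stereotype score (ideal: 0.5)
        return {"ss": float(ll_stereo > ll_anti)}

    def aggregation(self):
        return {"ss": mean}

    def higher_is_better(self):
        return {"ss": False}
```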


malteos commented Nov 21, 2022

Added via #35

@malteos malteos closed this as completed Nov 21, 2022