forked from EleutherAI/lm-evaluation-harness
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'yaml_multilingual_tasks'
Showing 2,042 changed files with 20,012 additions and 25 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@@ -0,0 +1,47 @@
# Multilingual ARC

### Paper

Title: `Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback`

Abstract: https://arxiv.org/abs/2307.16039

A key technology for the development of large language models (LLMs) involves instruction tuning that helps align the models' responses with human expectations to realize impressive learning abilities. Two major approaches for instruction tuning characterize supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), which are currently applied to produce the best commercial LLMs (e.g., ChatGPT). To improve the accessibility of LLMs for research and development efforts, various instruction-tuned open-source LLMs have also been introduced recently, e.g., Alpaca, Vicuna, to name a few. However, existing open-source LLMs have only been instruction-tuned for English and a few popular languages, thus hindering their impacts and accessibility to many other languages in the world. Among a few very recent work to explore instruction tuning for LLMs in multiple languages, SFT has been used as the only approach to instruction-tune LLMs for multiple languages. This has left a significant gap for fine-tuned LLMs based on RLHF in diverse languages and raised important questions on how RLHF can boost the performance of multilingual instruction tuning. To overcome this issue, we present Okapi, the first system with instruction-tuned LLMs based on RLHF for multiple languages. Okapi introduces instruction and response-ranked data in 26 diverse languages to facilitate the experiments and development of future multilingual LLM research. We also present benchmark datasets to enable the evaluation of generative LLMs in multiple languages. Our experiments demonstrate the advantages of RLHF for multilingual instruction over SFT for different base models and datasets. Our framework and resources are released at https://github.com/nlp-uoregon/Okapi.

Homepage: `https://github.com/nlp-uoregon/Okapi`

### Citation

```
@article{dac2023okapi,
  title={Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback},
  author={Dac Lai, Viet and Van Nguyen, Chien and Ngo, Nghia Trung and Nguyen, Thuat and Dernoncourt, Franck and Rossi, Ryan A and Nguyen, Thien Huu},
  journal={arXiv e-prints},
  pages={arXiv--2307},
  year={2023}
}
```

### Groups and Tasks

#### Groups

- arc_multilingual

#### Tasks

- `arc_{ar,bn,ca,da,de,es,eu,fr,gu,hi,hr,hu,hy,id,it,kn,ml,mr,ne,nl,pt,ro,ru,sk,sr,sv,ta,te,uk,vi,zh}`

### Checklist

For adding novel benchmarks/datasets to the library:
* [x] Is the task an existing benchmark in the literature?
* [x] Have you referenced the original paper that introduced the task?
* [x] If yes, does the original paper provide a reference implementation? If so, have you checked against the reference implementation and documented how to run such a test?

If other tasks on this dataset are already supported:
* [ ] Is the "Main" variant of this task clearly denoted?
* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
* [ ] Have you noted which, if any, published evaluation setups are matched by this variant?
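
The brace pattern under Tasks is shorthand for one task name per language code. As a minimal sketch, the expansion can be written out in Python; the helper name `expand_task_names` is made up for illustration and is not part of the harness. The comma-joined form is shown on the assumption that task lists are passed as a comma-separated string, as with the upstream `--tasks` option.

```python
# Language codes copied from the Tasks entry of the README.
LANGS = (
    "ar,bn,ca,da,de,es,eu,fr,gu,hi,hr,hu,hy,id,it,kn,"
    "ml,mr,ne,nl,pt,ro,ru,sk,sr,sv,ta,te,uk,vi,zh"
).split(",")


def expand_task_names(prefix: str = "arc") -> list[str]:
    """Hypothetical helper: turn `arc_{ar,bn,...}` into explicit task names."""
    return [f"{prefix}_{lang}" for lang in LANGS]


task_names = expand_task_names()
print(len(task_names))             # number of per-language tasks
print(",".join(task_names[:3]))    # comma-separated form for a task list
```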
@@ -0,0 +1,23 @@
group:
  - arc_multilingual
dataset_path: null
dataset_name: null
output_type: multiple_choice
training_split: train
validation_split: validation
test_split: test
process_docs: !function utils.process_docs
doc_to_text: "query"
doc_to_target: "gold"
doc_to_choice: "choices"
should_decontaminate: true
doc_to_decontamination_query: "query"
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
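
The config above points `process_docs` at `utils.process_docs` and reads the `query`, `choices`, and `gold` fields from each processed document, but `utils.py` is not shown in this part of the diff. The following is only a sketch of what such a function could look like. The raw field names (`instruction`, `option_a` to `option_e`, `answer`) and the "Question: ... Answer:" prompt wording are assumptions, not taken from the commit.

```python
import re

# ASSUMPTION: these raw field names are illustrative guesses; the real
# dataset schema is not visible in this diff.
OPTION_KEYS = ["option_a", "option_b", "option_c", "option_d", "option_e"]
LABELS = ["A", "B", "C", "D", "E"]


def _clean(text: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def process_doc(doc: dict) -> dict:
    """Map one raw row to the fields the YAML config reads."""
    # Missing or empty options are skipped; this assumes any unused
    # options come after the used ones, so label indices stay aligned.
    choices = [_clean(doc[k]) for k in OPTION_KEYS if doc.get(k)]
    return {
        "query": "Question: " + _clean(doc["instruction"]) + "\nAnswer:",
        "choices": choices,
        "gold": LABELS.index(doc["answer"]),  # integer index into choices
    }


def process_docs(dataset):
    """Apply process_doc row by row; `dataset` must expose a `.map` method."""
    return dataset.map(process_doc)
```

With `doc_to_text: "query"`, `doc_to_choice: "choices"`, and `doc_to_target: "gold"`, a multiple-choice task would then score each entry of `choices` as a continuation of `query` and compare the best one against the index in `gold`.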
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_ar | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: ar | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
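
Each per-language file sets `include: _arc_yaml` and then overrides the `dataset_path` and `dataset_name` values that the shared config leaves as `null`. A rough sketch of the effect, assuming `include` resolves as a shallow merge in which the including file's keys take precedence; `resolve` is a hypothetical helper, not the harness's loader.

```python
# Subset of the shared base config (the file referenced by `include`).
BASE = {
    "group": ["arc_multilingual"],
    "dataset_path": None,
    "dataset_name": None,
    "output_type": "multiple_choice",
    "test_split": "test",
}

# Per-language config for Arabic, without the `include` key itself.
ARC_AR = {
    "task": "arc_ar",
    "dataset_path": "alexandrainst/m_arc",
    "dataset_name": "ar",
    "test_split": "test",
}


def resolve(base: dict, child: dict) -> dict:
    """Hypothetical helper: shallow merge where keys in `child` win."""
    return {**base, **child}


merged = resolve(BASE, ARC_AR)
print(merged["dataset_path"], merged["dataset_name"], merged["output_type"])
```

Under this assumption, the split names repeated in every language file are redundant with the base config but harmless, since both sides carry the same values.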
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_bn | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: bn | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_ca | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: ca | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_da | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: da | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_de | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: de | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_es | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: es | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_eu | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: eu | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_fr | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: fr | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_gu | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: gu | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_hi | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: hi | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_hr | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: hr | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_hu | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: hu | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_hy | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: hy | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_id | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: id | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_it | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: it | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_kn | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: kn | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_ml | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: ml | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_mr | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: mr | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_ne | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: ne | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_nl | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: nl | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_pt | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: pt | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_ro | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: ro | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_ru | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: ru | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_sk | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: sk | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_sr | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: sr | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_sv | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: sv | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_ta | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: ta | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_te | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: te | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
@@ -0,0 +1,7 @@
include: _arc_yaml | ||
task: arc_uk | ||
dataset_path: alexandrainst/m_arc | ||
dataset_name: uk | ||
training_split: train | ||
validation_split: validation | ||
test_split: test
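
The per-language files above differ only in the language code, so they lend themselves to being generated from a template. The sketch below shows one way to do that. The output filename pattern `arc_<lang>.yaml` is hypothetical because file names are not visible in this view of the diff, and the language list is taken from the README's Tasks entry.

```python
from pathlib import Path

# Language codes from the README's Tasks entry.
LANGS = (
    "ar,bn,ca,da,de,es,eu,fr,gu,hi,hr,hu,hy,id,it,kn,"
    "ml,mr,ne,nl,pt,ro,ru,sk,sr,sv,ta,te,uk,vi,zh"
).split(",")

# Mirrors the seven-line per-language config shown in the diff.
TEMPLATE = """include: _arc_yaml
task: arc_{lang}
dataset_path: alexandrainst/m_arc
dataset_name: {lang}
training_split: train
validation_split: validation
test_split: test
"""


def render_config(lang: str) -> str:
    """Fill the template for one language code."""
    return TEMPLATE.format(lang=lang)


def write_configs(out_dir: Path, langs=LANGS) -> list[Path]:
    """Write one config per language and return the created paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for lang in langs:
        path = out_dir / f"arc_{lang}.yaml"  # hypothetical filename pattern
        path.write_text(render_config(lang), encoding="utf-8")
        paths.append(path)
    return paths
```

Calling `write_configs(Path("some_output_dir"))` would write one file per language code; the directory name here is a placeholder.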