This section explains how to prepare the data.
jaster is the Japanese evaluation dataset managed by llm-jp in llm-jp-eval. llm-jp-eval/nejumi3-data was created to prepare this data for Nejumi Leaderboard3.
The artifact has already been registered, and its path is set in the default config file (configs/base-config.yaml).
If you need to register the artifact yourself, follow the steps below:
- Download the dataset from the artifact above by logging in to wandb multi-tenant SaaS (see the sketch after this list for a programmatic alternative), or create the jaster dataset by following the instructions in llm-jp-eval/nejumi3-data.
- Upload the dataset with
python3 scripts/data_uploader/upload_dataset.py -e <your wandb entity> -p <your wandb project> -d <path of jaster dataset> -n jaster -v <version>
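As an alternative to downloading through the wandb UI, the registered artifact can also be fetched programmatically. The following is a minimal sketch using the wandb public API; the artifact path is a placeholder and should be replaced with the actual path from configs/base-config.yaml.

```python
# Minimal sketch: fetch a registered dataset artifact via the wandb public API.
# The artifact path is a placeholder; use the actual path from configs/base-config.yaml.
import wandb

api = wandb.Api()
artifact = api.artifact("<your wandb entity>/<your wandb project>/jaster:<version>")
local_dir = artifact.download()  # files are copied into a local directory and its path is returned
print("jaster dataset downloaded to:", local_dir)
```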
For MT-Bench, Nejumi Leaderboard3 uses the data in Stability-AI/FastChat (jp-stable branch).
The artifacts have already been registered, and their paths are set in the default config file (configs/base-config.yaml).
If you need to register the artifacts yourself, the process of registering data to wandb Artifacts is described below for reference:
python3 scripts/upload_mtbench_question.py -e <wandb/entity> -p <wandb/project> -f "your question path"
python3 scripts/upload_mtbench_prompt.py -e <wandb/entity> -p <wandb/project> -f "your prompt path"
python3 scripts/upload_mtbench_referenceanswer.py -e <wandb/entity> -p <wandb/project> -f "your reference answer path"
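For reference, each of these scripts essentially wraps a wandb Artifact upload. The sketch below shows that underlying operation under assumed names; the actual scripts may use different artifact names and add metadata, so treat it only as an illustration.

```python
# Sketch of the wandb Artifact upload that the scripts above perform.
# The artifact name "mtbench_ja_question" is an assumption for illustration only.
import wandb

with wandb.init(entity="<wandb/entity>", project="<wandb/project>", job_type="upload") as run:
    artifact = wandb.Artifact(name="mtbench_ja_question", type="dataset")
    artifact.add_file("your question path")  # e.g. the MT-Bench question file
    run.log_artifact(artifact)
```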
Please adhere to the LCTG terms of use regarding data utilization.
The artifact has already been registered, and its path is set in the default config file (configs/base-config.yaml).
If you need to register the artifact yourself, follow the steps below:
- Download the dataset from the artifact path in configs/base-config.yaml by logging in to wandb multi-tenant SaaS. Once the official LCTG Bench repository is released, you will also be able to download the file from there.
- Upload the dataset with
python3 scripts/data_uploader/upload_dataset.py -e <your wandb entity> -p <your wandb project> -d <path of LCTG dataset> -n lctg
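After the upload finishes, you can optionally confirm that the LCTG artifact is registered and inspect its contents with the wandb public API. The entity, project, and artifact name below mirror the placeholders used in the upload command.

```python
# Optional check: confirm the uploaded LCTG artifact exists and list its files.
# Replace the placeholders with the values used in the upload command above.
import wandb

api = wandb.Api()
artifact = api.artifact("<your wandb entity>/<your wandb project>/lctg:latest")
print(artifact.name, artifact.version)
for f in artifact.files():
    print(f.name)
```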
Please adhere to the JBBQ terms of use regarding data utilization.
Manual upload is required for all users because redistribution of the dataset is prohibited.
- The dataset can be downloaded from the JBBQ GitHub repository.
- Upload the dataset with
python3 scripts/uploader/upload_jbbq.py -e <wandb/entity> -p <wandb/project> -d <jbbq dataset path> -n jbbq
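Once the JBBQ artifact has been uploaded to your own entity and project, an evaluation run can consume it like any other registered dataset. The sketch below only illustrates the generic wandb pattern; check the leaderboard's config files for how the artifact path is actually specified.

```python
# Sketch: consuming a manually uploaded JBBQ artifact from a wandb run.
# Entity, project, and version are placeholders for illustration only.
import wandb

run = wandb.init(entity="<wandb/entity>", project="<wandb/project>", job_type="evaluation")
jbbq_dir = run.use_artifact("<wandb/entity>/<wandb/project>/jbbq:latest").download()
print("JBBQ files available under:", jbbq_dir)
run.finish()
```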
Please adhere to the "LINE Yahoo Inappropriate Speech Evaluation Dataset" terms of use regarding data utilization.