This section explains how to prepare the data.
jaster is the Japanese evaluation dataset managed by llm-jp in llm-jp-eval. llm-jp-eval/nejumi3-data was created to prepare this data for Nejumi Leaderboard3.
The artifact has already been registered, and its path is set in the default config file (configs/base-config.yaml).
If you need to register the artifact yourself, follow the steps below:
- Download the dataset from the artifact above by logging in to wandb multi-tenant SaaS (see the sketch after this list for a programmatic alternative), or create the jaster dataset by following the instructions in llm-jp-eval/nejumi3-data.
- Upload the dataset with
python3 scripts/data_uploader/upload_dataset.py -e <your wandb entity> -p <your wandb project> -d <path of jaster dataset> -n jaster -v <version>
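As an alternative to downloading through the wandb UI, the registered artifact can also be fetched programmatically. The following is a minimal sketch using the wandb public API; the artifact path is a placeholder and should be replaced with the actual path from configs/base-config.yaml.

```python
# Minimal sketch: fetch a registered dataset artifact via the wandb public API.
# The artifact path is a placeholder; use the actual path from configs/base-config.yaml.
import wandb

api = wandb.Api()
artifact = api.artifact("<your wandb entity>/<your wandb project>/jaster:<version>")
local_dir = artifact.download()  # files are copied into a local directory and its path is returned
print("jaster dataset downloaded to:", local_dir)
```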
For MT-Bench, Nejumi Leaderboard3 uses the data in Stability-AI/FastChat (jp-stable branch).
The artifacts have already been registered, and their paths are set in the default config file (configs/base-config.yaml).
If you need to register the artifacts yourself, the process of registering data to wandb Artifacts is described below for reference:
python3 scripts/upload_mtbench_question.py -e <wandb/entity> -p <wandb/project> -f "your question path"
python3 scripts/upload_mtbench_prompt.py -e <wandb/entity> -p <wandb/project> -f "your prompt path"
python3 scripts/upload_mtbench_referenceanswer.py -e <wandb/entity> -p <wandb/project> -f "your reference answer path"
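For reference, each of these scripts essentially wraps a wandb Artifact upload. The sketch below shows that underlying operation under assumed names; the actual scripts may use different artifact names and add metadata, so treat it only as an illustration.

```python
# Sketch of the wandb Artifact upload that the scripts above perform.
# The artifact name "mtbench_ja_question" is an assumption for illustration only.
import wandb

with wandb.init(entity="<wandb/entity>", project="<wandb/project>", job_type="upload") as run:
    artifact = wandb.Artifact(name="mtbench_ja_question", type="dataset")
    artifact.add_file("your question path")  # e.g. the MT-Bench question file
    run.log_artifact(artifact)
```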
Please adhere to the LCTG terms of use regarding data utilization.
The artifact has already been registered, and its path is set in the default config file (configs/base-config.yaml).
If you need to register the artifact yourself, follow the steps below:
- Download the dataset from the artifact path in configs/base-config.yaml by logging in to wandb multi-tenant SaaS. Once the official LCTG Bench repository is released, you will also be able to download the file from there.
- Upload the dataset with
python3 scripts/data_uploader/upload_dataset.py -e <your wandb entity> -p <your wandb project> -d <path of LCTG dataset> -n lctg
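After the upload finishes, you can optionally confirm that the LCTG artifact is registered and inspect its contents with the wandb public API. The entity, project, and artifact name below mirror the placeholders used in the upload command.

```python
# Optional check: confirm the uploaded LCTG artifact exists and list its files.
# Replace the placeholders with the values used in the upload command above.
import wandb

api = wandb.Api()
artifact = api.artifact("<your wandb entity>/<your wandb project>/lctg:latest")
print(artifact.name, artifact.version)
for f in artifact.files():
    print(f.name)
```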
Please adhere to the JBBQ terms of use regarding data utilization.
Manual upload is required for all users because redistribution of the dataset is prohibited.
- The dataset can be downloaded from the JBBQ GitHub repository.
- Upload the dataset with
python3 scripts/uploader/upload_jbbq.py -e <wandb/entity> -p <wandb/project> -d <jbbq dataset path> -n jbbq
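Once the JBBQ artifact has been uploaded to your own entity and project, an evaluation run can consume it like any other registered dataset. The sketch below only illustrates the generic wandb pattern; check the leaderboard's config files for how the artifact path is actually specified.

```python
# Sketch: consuming a manually uploaded JBBQ artifact from a wandb run.
# Entity, project, and version are placeholders for illustration only.
import wandb

run = wandb.init(entity="<wandb/entity>", project="<wandb/project>", job_type="evaluation")
jbbq_dir = run.use_artifact("<wandb/entity>/<wandb/project>/jbbq:latest").download()
print("JBBQ files available under:", jbbq_dir)
run.finish()
```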
Please adhere to the "LINE Yahoo Inappropriate Speech Evaluation Dataset" terms of use regarding data utilization.