This repository contains the official code and associated datasets for the paper.
We provide our train/val/test data splits in the folder `data/[dataset-name]`. The `instruct-llama-3-8b` and `gpt-4-turbo` subfolders contain the sampled test-set responses from Llama-3-8B-Instruct and GPT-4-Turbo, respectively.
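To take a quick look at a split, the csv files can be loaded directly with pandas. This is a minimal sketch; the column layout is whatever the csv defines, so inspect it rather than assuming a schema:

```python
import pandas as pd

# Load one of the provided splits (imdb62 shown as an example).
train_df = pd.read_csv("data/imdb62/train_i2ro.csv")

# The column layout is defined by the csv itself; inspect rather than assume.
print(train_df.columns.tolist())
print(train_df.head())
```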
```bash
cd llama3
```
- To train a single-dataset CAVE:

```bash
python train_lora.py \
    --data_path "../data/imdb62/train_i2ro.csv" \
    --train_batch_size 1 \
    --epochs 10 \
    --lora_r 128 \
    --lora_alpha 256 \
    --output_dir save_models
```
- To train a multi-dataset CAVE, pass the training csvs as a comma-separated list:

```bash
python train_lora_combined.py \
    --data_path "../data/imdb62/train_i2ro.csv,../data/blog-auth/train_i2ro.csv,../data/pan20-fanfic/train_i2ro.csv" \
    --train_batch_size 1 \
    --epochs 10 \
    --lora_r 128 \
    --lora_alpha 256 \
    --output_dir save_models
```
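For reference, the `--lora_r` and `--lora_alpha` flags presumably map onto a PEFT `LoraConfig` roughly as sketched below. This is an assumption for illustration: the actual target modules, dropout, and base-model loading in `train_lora.py` may differ.

```python
# Sketch of the LoRA setup the flags above likely correspond to.
# Assumptions: target modules and task type are typical defaults,
# not necessarily what train_lora.py uses.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

lora_config = LoraConfig(
    r=128,                                # --lora_r
    lora_alpha=256,                       # --lora_alpha
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```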
- The command below runs the model on the test set and saves the generations as a csv file in the checkpoint folder itself. The `--prefix` argument serves as a naming convention to identify which dataset's test set is being evaluated. The csv file is best opened with pandas so that the document and answer structures are retained (see the sketch below).

```bash
python inference_llama3.py --model_path /path/to/model/checkpoint --dataset-val ../data/imdb62/test_i2ro.csv --do_val 0 --prefix="imdb_"
```
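For example, the generations can be reloaded like this. Note the output filename here is hypothetical, derived from the `--prefix` value; check the checkpoint folder for the actual name:

```python
import pandas as pd

# Hypothetical filename built from the --prefix value ("imdb_");
# check the checkpoint folder for the actual csv name.
gens = pd.read_csv("/path/to/model/checkpoint/imdb_generations.csv")
print(gens.head())
```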
- To obtain the metrics ACCURACY and CONSISTENCY (as defined in the paper), and to produce a csv file that can be opened with Excel/Google Sheets for easy analysis:

```bash
python check_metrics.py --pred_data /path/to/csv/file --gold_data ../data/imdb62/test_i2ro.csv --human /path/to/human-eval/csv
```
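The resulting csv can also be inspected or converted programmatically. This is a minimal sketch; the output filename below is hypothetical, so check where `check_metrics.py` actually writes it:

```python
import pandas as pd

# Hypothetical output name; check_metrics.py's actual filename may differ.
metrics = pd.read_csv("metrics.csv")
print(metrics.head())

# Optional: export to xlsx for Excel/Google Sheets (requires openpyxl).
metrics.to_excel("metrics.xlsx", index=False)
```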