This repository contains the code for the paper https://arxiv.org/abs/2005.00115, to appear at ACL 2020.
conda create -n fresh python=3.8
conda activate fresh
pip install -r requirements.txt
python -m spacy download en
- Datasets: Folder to store datasets. For each dataset, run the processing code in the Process.ipynb file in its folder. To do so, launch Jupyter with PYTHONPATH=$(pwd) jupyter lab, then navigate to each dataset folder.
- Rationale_Analysis/models: Folder to store allennlp models.
  - classifiers: Models that do the actual learning.
  - saliency_scorer: Takes a trained model and returns saliency scores for inputs.
  - rationale_extractors: Models that take saliency scores and generate rationales by thresholding.
  - rationale_generators: Models that take in thresholded rationales and train an extractor model.
  - base_predictor.py: Simple predictor to use with the allennlp predict command as needed.
- plugins: Subcommands to run the saliency scorers and rationale extractors, since the existing allennlp command semantics don't map well onto what we want to do.
- Rationale_Analysis/training_config: jsonnet training configs to use with allennlp for the models described above.
- Rationale_Analysis/commands: The bash scripts that actually run the experiments.
- Rationale_Analysis/data/dataset_readers: Dataset readers to work with allennlp.
  - base_reader.py: Code to load the actual datasets (jsonl with 4 fields: document, query, label, Optional[rationale]).
  - saliency_reader.py: Reads the output of the saliency scorer to pass into the rationale extractors.
  - extractor_reader.py: Reads thresholded rationales to train the extractor model.
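As a rough illustration of the jsonl layout that base_reader.py loads, the sketch below parses two records with the four fields named above. The helper name read_jsonl, the example values, and the way the rationale is encoded (a list of [start, end] token spans) are assumptions made for this example only; check the Process.ipynb notebooks for the real encoding.

```python
import io
import json
from typing import Any, Dict, Iterator, TextIO


def read_jsonl(handle: TextIO) -> Iterator[Dict[str, Any]]:
    """Yield one record per non-empty line (hypothetical helper, not repo code)."""
    for line in handle:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # 'rationale' is the only optional field; default it to None when absent.
        record.setdefault("rationale", None)
        yield record


# Made-up records: the values and the rationale encoding are placeholders.
example = io.StringIO(
    json.dumps({"document": "a gripping , well acted film",
                "query": "What is the sentiment?",
                "label": "POS"}) + "\n"
    + json.dumps({"document": "dull and overlong",
                  "query": "What is the sentiment?",
                  "label": "NEG",
                  "rationale": [[0, 1]]}) + "\n"
)
records = list(read_jsonl(example))
```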
In the run scripts below, the environment variables can take the following values:
- DATASET_NAME in {evinf, movies, SST, agnews, multirc}
- SALIENCY in {wrapper, simple_gradient} (note that wrapper is just another name for attention-based saliency)
- THRESHOLDER in {top_k, contiguous}
- MAX_LENGTH_RATIO in [0, 1]: desired length of rationales
- BERT_TYPE in {bert-base-uncased, roberta-base, allenai/scibert_scivocab_uncased}. We use bert-base-uncased for {SST, agnews, movies}, roberta-base for multirc, and scibert for evinf.
- BSIZE = batch size (our default values are in Rationale_Analysis/default_values.json)
- HUMAN_PROB in [0, 1]: amount of human supervision to use for rationales
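To make the THRESHOLDER and MAX_LENGTH_RATIO settings listed above more concrete, here is a minimal sketch of the two thresholding strategies applied to a list of per-token saliency scores. It is not the repository's implementation: the function names, the rounding rule (round up, keep at least one token), and the tie-breaking are assumptions of this example.

```python
import math
from typing import List


def rationale_length(num_tokens: int, max_length_ratio: float) -> int:
    # Assumption: round up and keep at least one token (capped at the input length).
    return min(num_tokens, max(1, math.ceil(max_length_ratio * num_tokens)))


def top_k_mask(scores: List[float], max_length_ratio: float) -> List[int]:
    """Keep the k highest-scoring tokens, wherever they occur."""
    k = rationale_length(len(scores), max_length_ratio)
    # Sort indices by descending score; ties go to the earlier position.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(scores))]


def contiguous_mask(scores: List[float], max_length_ratio: float) -> List[int]:
    """Keep the single span of k adjacent tokens with the largest total score."""
    k = rationale_length(len(scores), max_length_ratio)
    window = sum(scores[:k])
    best_sum, best_start = window, 0
    # Slide the window one token at a time, updating the running sum.
    for start in range(1, len(scores) - k + 1):
        window += scores[start + k - 1] - scores[start - 1]
        if window > best_sum:
            best_sum, best_start = window, start
    return [1 if best_start <= i < best_start + k else 0
            for i in range(len(scores))]


scores = [0.05, 0.40, 0.05, 0.10, 0.30, 0.10]  # made-up saliency scores
print(top_k_mask(scores, 0.5))
print(contiguous_mask(scores, 0.5))
```

With these example scores, top_k picks scattered tokens while contiguous is forced to pick one span, which is the practical difference between the two settings.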
CUDA_DEVICE=0 \
DATASET_NAME=${DATASET_NAME} \
CLASSIFIER=bert_classification \
BERT_TYPE=${BERT_TYPE} \
EXP_NAME=fresh \
MAX_LENGTH_RATIO=${MAX_LENGTH_RATIO} \
SALIENCY=${SALIENCY} \
THRESHOLDER=${THRESHOLDER} \
EPOCHS=20 \
BSIZE=${BSIZE} \
bash Rationale_Analysis/commands/fresh/fresh_script.sh
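If you prefer to launch the command above from Python, the sketch below assembles the same environment variables and checks them against the allowed values. The helper fresh_env and the constants are hypothetical names introduced for this example, and the batch size and length ratio in the usage line are placeholders (the values we used are in Rationale_Analysis/default_values.json). The dataset-to-encoder mapping assumes scibert means allenai/scibert_scivocab_uncased.

```python
import os
from typing import Dict

# Dataset -> encoder pairing, as described in this README.
BERT_TYPE_FOR_DATASET = {
    "SST": "bert-base-uncased",
    "agnews": "bert-base-uncased",
    "movies": "bert-base-uncased",
    "multirc": "roberta-base",
    "evinf": "allenai/scibert_scivocab_uncased",
}
ALLOWED_SALIENCY = {"wrapper", "simple_gradient"}
ALLOWED_THRESHOLDER = {"top_k", "contiguous"}
FRESH_SCRIPT = "Rationale_Analysis/commands/fresh/fresh_script.sh"


def fresh_env(dataset_name: str, saliency: str, thresholder: str,
              max_length_ratio: float, bsize: int,
              cuda_device: int = 0, epochs: int = 20) -> Dict[str, str]:
    """Build the environment for fresh_script.sh (hypothetical helper)."""
    if dataset_name not in BERT_TYPE_FOR_DATASET:
        raise ValueError(f"unknown DATASET_NAME: {dataset_name}")
    if saliency not in ALLOWED_SALIENCY:
        raise ValueError(f"unknown SALIENCY: {saliency}")
    if thresholder not in ALLOWED_THRESHOLDER:
        raise ValueError(f"unknown THRESHOLDER: {thresholder}")
    if not 0.0 <= max_length_ratio <= 1.0:
        raise ValueError("MAX_LENGTH_RATIO must lie in [0, 1]")
    env = dict(os.environ)  # keep PATH and friends for the child process
    env.update({
        "CUDA_DEVICE": str(cuda_device),
        "DATASET_NAME": dataset_name,
        "CLASSIFIER": "bert_classification",
        "BERT_TYPE": BERT_TYPE_FOR_DATASET[dataset_name],
        "EXP_NAME": "fresh",
        "MAX_LENGTH_RATIO": str(max_length_ratio),
        "SALIENCY": saliency,
        "THRESHOLDER": thresholder,
        "EPOCHS": str(epochs),
        "BSIZE": str(bsize),
    })
    return env


# Placeholder values for illustration only.
env = fresh_env("multirc", "wrapper", "top_k", 0.2, bsize=16)
# To launch: subprocess.run(["bash", FRESH_SCRIPT], env=env, check=True)
```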
CUDA_DEVICE=0 \
DATASET_NAME=$DATASET_NAME \
CLASSIFIER=bert_classification \
BERT_TYPE=$BERT_TYPE \
EXP_NAME=fresh \
MAX_LENGTH_RATIO=$MAX_LENGTH_RATIO \
SALIENCY=$SALIENCY \
THRESHOLDER=$THRESHOLDER \
EPOCHS=20 \
BSIZE=$BSIZE \
HUMAN_PROB=$HUMAN_PROB \
bash Rationale_Analysis/commands/fresh/fresh_with_extractor_script.sh
MU and LAMBDA are hyperparameters for the regularizer. The values we used after hyperparameter search are in Rationale_Analysis/default_values.json.
CUDA_DEVICE=0 \
DATASET_NAME=$DATASET_NAME \
CLASSIFIER=bert_encoder_generator \
BERT_TYPE=$BERT_TYPE \
EXP_NAME=fresh \
MAX_LENGTH_RATIO=$MAX_LENGTH_RATIO \
EPOCHS=20 \
BSIZE=$BSIZE \
MU=$MU \
LAMBDA=$LAMBDA \
bash Rationale_Analysis/commands/encgen/experiment_script.sh
CUDA_DEVICE=0 \
DATASET_NAME=$DATASET_NAME \
CLASSIFIER=bert_kuma_encoder_generator \
BERT_TYPE=$BERT_TYPE \
EXP_NAME=fresh \
MAX_LENGTH_RATIO=$MAX_LENGTH_RATIO \
EPOCHS=20 \
BSIZE=$BSIZE \
LAMBDA_INIT=1e-5 \
bash Rationale_Analysis/commands/encgen/experiment_script.sh
- For Lei et al,
CUDA_DEVICE=0 \
EPOCHS=20 \
CLASSIFIER=bert_encoder_generator \
python Rationale_Analysis/experiments/run_for_random_seeds.py \
--script-type encgen/experiment_script.sh \
--all-data;
python Rationale_Analysis/experiments/random_seeds_results.py --output-dir outputs/ --lei
- For Bastings et al,
CUDA_DEVICE=0 \
EPOCHS=20 \
CLASSIFIER=bert_kuma_encoder_generator \
python Rationale_Analysis/experiments/run_for_random_seeds.py \
--script-type encgen/experiment_script.sh \
--all-data;
python Rationale_Analysis/experiments/random_seeds_results.py --output-dir outputs/ --kuma
- For Fresh,
CUDA_DEVICE=0 \
EPOCHS=20 \
CLASSIFIER=bert_classification \
python Rationale_Analysis/experiments/run_for_random_seeds.py \
--script-type fresh/experiment_script.sh \
--all-data;
python Rationale_Analysis/experiments/random_seeds_results.py --output-dir outputs/
- For Lei et al,
CUDA_DEVICE=0 \
EPOCHS=20 \
CLASSIFIER=bert_encoder_generator \
python Rationale_Analysis/experiments/run_for_random_seeds.py \
--script-type encgen/experiment_script.sh \
--all-data \
--defaults-file Rationale_Analysis/second_cut_point.json
- For Fresh,
CUDA_DEVICE=0 \
EPOCHS=20 \
CLASSIFIER=bert_classification \
python Rationale_Analysis/experiments/run_for_random_seeds.py \
--script-type fresh/experiment_script.sh \
--all-data \
--defaults-file Rationale_Analysis/second_cut_point.json
Results:
python Rationale_Analysis/rationale_lengths_results.py --output-dir outputs/ --min-scale 0.3 --max-scale 1.0;
- For Lei et al Model,
for human_prob in 0.0 0.2 0.5 1.0;
do
CUDA_DEVICE=0 \
EPOCHS=20 \
DATASET_NAME=$DATASET_NAME \
HUMAN_PROB=$human_prob \
CLASSIFIER=bert_encoder_generator_human \
python Rationale_Analysis/experiments/run_for_random_seeds.py \
--script-type encgen/supervised_experiment_script.sh;
done;
- For Fresh Model,
for human_prob in 0.0 0.2 0.5 1.0;
do
CUDA_DEVICE=0 \
EPOCHS=20 \
DATASET_NAME=$DATASET_NAME \
HUMAN_PROB=$human_prob \
CLASSIFIER=bert_classification \
python Rationale_Analysis/experiments/run_for_random_seeds.py \
--script-type fresh/fresh_with_extractor_script.sh;
done;
Results:
python Rationale_Analysis/supervised_rationale_plot.py --output-dir outputs/ --dataset $DATASET_NAME --min-scale 0.0 --max-scale 1.0;
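For intuition about what sweeping HUMAN_PROB over 0.0, 0.2, 0.5 and 1.0 means, the sketch below shows one way a fraction of human rationales could be retained during training. This is an assumption, not a description of the scripts above: it supposes that each example keeps its human rationale independently with probability HUMAN_PROB, and the function name and record layout are hypothetical.

```python
import random
from typing import Any, Dict, List


def subsample_human_rationales(examples: List[Dict[str, Any]],
                               human_prob: float,
                               seed: int = 0) -> List[Dict[str, Any]]:
    """Return copies of the examples, dropping some human rationales."""
    if not 0.0 <= human_prob <= 1.0:
        raise ValueError("HUMAN_PROB must lie in [0, 1]")
    rng = random.Random(seed)  # seeded so the subsample is reproducible
    result = []
    for example in examples:
        kept = dict(example)
        # Assumption: supervision is dropped independently per example.
        if rng.random() >= human_prob:
            kept["rationale"] = None
        result.append(kept)
    return result


# Made-up examples; only the presence of 'rationale' matters here.
examples = [{"document": f"doc {i}", "label": "POS", "rationale": [[0, 1]]}
            for i in range(10)]
none_kept = subsample_human_rationales(examples, 0.0)
all_kept = subsample_human_rationales(examples, 1.0)
```

Under this assumption, HUMAN_PROB=0.0 trains with no rationale supervision at all and HUMAN_PROB=1.0 uses every available human rationale.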
If you use this code, please cite the following paper:
@inproceedings{jain-etal-2020-learning,
title = "{L}earning to Faithfully Rationalize by Construction",
author = "Jain, Sarthak and
Wiegreffe, Sarah and
Pinter, Yuval and
Wallace, Byron C.",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2020.acl-main.409",
doi = "10.18653/v1/2020.acl-main.409",
pages = "4459--4473",
}