This repository contains the implementation of the paper:
Can Question Rewriting Help Conversational Question Answering?. Etsuko Ishii, Yan Xu, Samuel Cahyawijaya, Bryan Wilie. Insights@ACL2022 [PDF]
If you use any source code included in this toolkit in your work, please cite the following paper:
```
@inproceedings{ishii-etal-2022-question,
    title = "Can Question Rewriting Help Conversational Question Answering?",
    author = "Ishii, Etsuko and Xu, Yan and Cahyawijaya, Samuel and Wilie, Bryan",
    booktitle = "Proceedings of the Third Workshop on Insights from Negative Results in NLP",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.insights-1.13",
    pages = "94--99",
}
```
Our implementation uses Python 3.7 and PyTorch 1.10.0; the other required packages are listed in requirements.txt.
Please install the dependencies with pip install -r requirements.txt.
If you want to log your training with wandb, first install it with pip install wandb.
Then, create an account on wandb and log in with wandb login.
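As a quick sanity check that wandb logging works, here is a minimal sketch (the project and run names below are hypothetical, not the ones used by our training scripts):

```python
import wandb

# Hypothetical project/run names; the repo's training scripts
# configure their own wandb runs.
wandb.init(project="question-rewriting", name="sanity-check")
for step in range(3):
    wandb.log({"loss": 1.0 / (step + 1)}, step=step)
wandb.finish()
```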
You can download all the datasets used in our experiments here.
Please unzip the archive and place it under the data directory.
We use the QReCC dataset (Anantha et al., 2021) and the CANARD dataset (Elgohary et al., 2019) as the question rewriting datasets.
We use CoQA (Reddy et al., 2019) and QuAC (Choi et al., 2018) as the conversational question answering (CQA) datasets. Since the CoQA test set is not publicly available, for the CoQA experiments we randomly sample 5% of the dialogues in the training set as our validation set and report test results on the original development set. Note that the QuAC split follows EXCORD (Kim et al., 2021), while the CoQA split was done by ourselves.
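For reference, a minimal sketch of this dialogue-level 5% split for CoQA, assuming the official coqa-train-v1.0.json layout (one entry per dialogue under the "data" key); the seed and output file names are arbitrary:

```python
import json
import random

# Split the official CoQA training file into train/validation
# at the dialogue level (5% of dialogues become the validation set).
with open("data/coqa/coqa-train-v1.0.json") as f:
    corpus = json.load(f)

dialogues = corpus["data"]
random.seed(42)  # arbitrary seed
random.shuffle(dialogues)

n_val = int(0.05 * len(dialogues))
splits = {"validation": dialogues[:n_val], "train": dialogues[n_val:]}

for name, split in splits.items():
    with open(f"data/coqa/coqa-{name}-split.json", "w") as f:
        json.dump({"version": corpus.get("version"), "data": split}, f)
```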
- Train a QR model with:
CUDA_VISIBLE_DEVICES=0 sh run_qrewrite.sh <dataset: canard|qrecc> <output_dir: save/gpt2-canard|save/gpt2-qrecc> <master_port: 10000>
- Evaluate the QR model with:
python infer_qrewrite.py --dataset [canard/qrecc] --exp [gpt2-canard/gpt2-qrecc] --split validation -cu 0 --overwrite_cache --pretrained_model gpt2 --batchify --eval_bsz 16
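For intuition about what infer_qrewrite.py produces: the QR model is a GPT-2 that generates a self-contained rewrite conditioned on the dialogue history. A minimal generation sketch with transformers (the prompt format and separator below are illustrative; the actual formatting lives in the repo's data utilities):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # replace with a trained QR checkpoint

# Illustrative prompt: dialogue history followed by the question to rewrite.
history = ["Who wrote Hamlet?", "William Shakespeare."]
question = "When did he write it?"
prompt = "\n".join(history + [question]) + "\n"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32,
                         pad_token_id=tokenizer.eos_token_id)
rewrite = tokenizer.decode(outputs[0, inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True)
print(rewrite)  # ideally: "When did William Shakespeare write Hamlet?"
```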
- Train a QA model with:
CUDA_VISIBLE_DEVICES=0,1 sh run_coqa.sh
or, for QuAC, train with:
CUDA_VISIBLE_DEVICES=0 sh run_quac.sh # quac is runnable with only one GPU
- Evaluate the QA model with:
sh eval_convqa.sh
Our code for Proximal Policy Optimization (PPO) is modified from trl.
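For orientation, a minimal sketch of the PPO loop, assuming a recent trl release rather than our modified copy (the reward is the downstream QA model's F1, as in the paper; compute_qa_f1 is a hypothetical stand-in for running that QA model):

```python
import torch
from transformers import GPT2Tokenizer
from trl import (AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer,
                 create_reference_model)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")  # a trained QR checkpoint in practice
ref_model = create_reference_model(model)
ppo_trainer = PPOTrainer(PPOConfig(batch_size=1, mini_batch_size=1),
                         model, ref_model, tokenizer)

def compute_qa_f1(rewrite: str) -> float:
    return 0.5  # hypothetical stub: F1 of the QA model's answer given this rewrite

query = tokenizer("Who wrote Hamlet? When did he write it?\n",
                  return_tensors="pt").input_ids[0]
response = model.generate(query.unsqueeze(0), max_new_tokens=32,
                          pad_token_id=tokenizer.eos_token_id)[0, query.shape[0]:]
reward = torch.tensor(compute_qa_f1(tokenizer.decode(response)))

# One PPO step: push the QR model toward rewrites that raise QA F1
# while staying close to the reference model.
stats = ppo_trainer.step([query], [response], [reward])
```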
- To train the QR model with PPO, run:
sh run_ppo.sh
You can download the trained models: QReCC+CoQA, CANARD+CoQA, QReCC+QuAC, CANARD+QuAC.
- To evaluate the trained model, run:
sh eval_ppo.sh
- If you want to evaluate models using the metrics reported in the leaderboards, run:
python src/modules/convqa_evaluator.py --data coqa --pred-file <path to the model folder>/predictions_test.json --data-file data/coqa/coqa-dev-v1.0.json --out-file <path to the model folder>/all_test_results.txt
python src/modules/convqa_evaluator.py --data quac --pred-file <path to the model folder>/predictions_test.json --data-file data/quac/val-v0.2.json --out-file <path to the model folder>/all_test_results.txt
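Both evaluators score answers with the standard SQuAD-style token-level F1. A stripped-down re-implementation for reference (the official scripts above additionally normalize answers and aggregate over multiple references):

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(token_f1("in 1600 or 1601", "around 1600"))  # ≈ 0.33
```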
- We also support the REINFORCE algorithm to train the QR model. You can run:
sh run_reinforce.sh
Note that even if the QR model is trained with REINFORCE, you can evaluate it by simply modifying the model paths in eval_ppo.sh.
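For reference, REINFORCE simply weights the negative log-likelihood of a sampled rewrite by its reward (again the QA F1). A sketch of a single update, with a hypothetical compute_qa_f1 and no baseline term:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # a trained QR checkpoint in practice
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

def compute_qa_f1(rewrite: str) -> float:
    return 0.5  # hypothetical stub: F1 of the QA model's answer given this rewrite

query = tokenizer("Who wrote Hamlet? When did he write it?\n",
                  return_tensors="pt").input_ids
sample = model.generate(query, do_sample=True, top_k=50, max_new_tokens=32,
                        pad_token_id=tokenizer.eos_token_id)
reward = compute_qa_f1(tokenizer.decode(sample[0, query.shape[1]:]))

# REINFORCE update: scale the NLL of the sampled tokens by the reward.
labels = sample.clone()
labels[:, :query.shape[1]] = -100  # do not score the conditioning context
loss = model(sample, labels=labels).loss * reward
loss.backward()
optimizer.step()
```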
We also evaluate a simple supervised learning approach that uses the rewrites provided by CANARD. You can download the QuAC subset that has the CANARD annotations here.
- To evaluate the CANARD annotations with the QA model trained on QuAC, simply change the dataset paths in src/data_utils/quac.py and run:
sh eval_convqa.sh
- To train another QA model with the CANARD annotations, change the dataset paths in src/data_utils/quac.py in the same way as above, and run:
sh run_quac.sh
First, we generate 10 possible rewrites using top-k sampling for every question in the CQA datasets. To guarantee the quality of the rewrites, we select the best-F1-scoring one from each set of 10 candidates and use it to teach another QR model how to reformulate questions.
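A sketch of this generate-then-select step (compute_qa_f1 is again a hypothetical stand-in for scoring a rewrite with the QA model; see run_augmentation.sh for the actual pipeline):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # a trained QR checkpoint in practice

def compute_qa_f1(rewrite: str) -> float:
    return 0.5  # hypothetical stub: F1 of the QA model's answer given this rewrite

prompt = "Who wrote Hamlet? When did he write it?\n"
inputs = tokenizer(prompt, return_tensors="pt")

# Draw 10 candidate rewrites with top-k sampling ...
candidates = model.generate(**inputs, do_sample=True, top_k=50,
                            num_return_sequences=10, max_new_tokens=32,
                            pad_token_id=tokenizer.eos_token_id)
rewrites = [tokenizer.decode(c[inputs["input_ids"].shape[1]:],
                             skip_special_tokens=True)
            for c in candidates]

# ... and keep the highest-F1 one as the new annotation.
best = max(rewrites, key=compute_qa_f1)
```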
- To generate annotations, run:
sh run_augmentation.sh
- Then, train a QR model with run_qrewrite.py (refer to run_qrewrite.sh), changing the dataset paths accordingly.