This repository is an implementation of the summarization model presented in "Question Answering as an Automatic Evaluation Metric for News Article Summarization".
This repository started as a fork of OpenNMT-py. You can find helpful discussions and explanations in the OpenNMT-py repository.
Pretrain the model as explained in the links above, or download one of the pretrained models offered in those explanations. I used ada6_bridge_oldcopy_tagged_acc_54.17_ppl_11.17_e20.pt as my pretrained model.
To fine-tune the model as described in the paper, run:
python train.py -save_model models/entities_attn -data path/to/data/with/filenames -train_from path/to/model/ada6_bridge_oldcopy_tagged_acc_54.17_ppl_11.17_e20.pt -copy_attn -global_attention mlp -word_vec_size 128 -rnn_size 512 -layers 1 -encoder_type brnn -epochs 20 -max_grad_norm 2 -dropout 0. -batch_size 16 -optim adagrad -learning_rate 0.15 -adagrad_accumulator_init 0.1 -reuse_copy_attn -copy_loss_by_seqlength -bridge -seed 777 -gpuid 0 > entities_attn.txt 2>&1
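When the data and model paths change between runs, it can help to assemble the fine-tuning command in a small script instead of editing one long line. The following is a minimal sketch, not part of this repository: `build_finetune_command` and `run_and_log` are hypothetical helper names, every flag and value is copied from the command above, and the sketch assumes it is executed from the repository root so that `train.py` is found.

```python
import shlex
import subprocess
import sys


def build_finetune_command(data_prefix, pretrained_model,
                           save_prefix="models/entities_attn"):
    """Return the train.py argument list (hypothetical helper).

    All flags and values are copied from the fine-tuning command above;
    only the three paths are parameters.
    """
    return [
        sys.executable, "train.py",
        "-save_model", save_prefix,
        "-data", data_prefix,
        "-train_from", pretrained_model,
        "-copy_attn",
        "-global_attention", "mlp",
        "-word_vec_size", "128",
        "-rnn_size", "512",
        "-layers", "1",
        "-encoder_type", "brnn",
        "-epochs", "20",
        "-max_grad_norm", "2",
        "-dropout", "0.",
        "-batch_size", "16",
        "-optim", "adagrad",
        "-learning_rate", "0.15",
        "-adagrad_accumulator_init", "0.1",
        "-reuse_copy_attn",
        "-copy_loss_by_seqlength",
        "-bridge",
        "-seed", "777",
        "-gpuid", "0",
    ]


def run_and_log(command, log_path):
    """Run command, writing stdout and stderr to log_path (like `> file 2>&1`)."""
    with open(log_path, "w") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Build the command with the placeholder paths from the README and show it.
command = build_finetune_command(
    "path/to/data/with/filenames",
    "path/to/model/ada6_bridge_oldcopy_tagged_acc_54.17_ppl_11.17_e20.pt",
)
print(shlex.join(command))
```

Calling `run_and_log(command, "entities_attn.txt")` would then start training and write the log to the same file as the shell redirection above.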
Generate summaries with the translate.py script:
python translate.py -gpu 0 -src_seq_length_trunc 400 -batch_size 20 -beam_size 5 -model path/to/model/ada6_bridge_oldcopy_tagged_acc_54.17_ppl_11.17_e20.pt -src path/to/data/test.txt.src -output testout/file.out -min_length 35 -verbose -stepwise_penalty -coverage_penalty summary -length_penalty wu -alpha 0.9 -beta 0.5 -gamma 0.5 -block_ngram_repeat 3 -ignore_when_blocking "." "</t>" "<t>" > translating.txt 2>&1 &
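The `-ignore_when_blocking` option above names the `<t>` and `</t>` tokens, which suggests that sentences in the data are wrapped in these tags. Assuming the generated file contains one summary per line with sentences delimited this way (an assumption, not something this README states), a sketch like the following could split each summary into plain sentences before evaluation. `split_tagged_summary` and `read_summaries` are hypothetical helper names.

```python
import re

# Assumption: sentences in the output are delimited by "<t>" and "</t>".
SENTENCE_TAG = re.compile(r"\s*</?t>\s*")


def split_tagged_summary(line):
    """Return the sentences of one output line with the tags removed."""
    parts = SENTENCE_TAG.split(line.strip())
    # Splitting leaves empty strings between adjacent tags; drop them.
    return [part for part in parts if part]


def read_summaries(path):
    """Read one summary per line from path and split each into sentences."""
    with open(path, encoding="utf-8") as handle:
        return [split_tagged_summary(line) for line in handle]


example = "<t> the cat sat . </t> <t> it purred . </t>"
print(split_tagged_summary(example))
# prints ['the cat sat .', 'it purred .']
```

With the command above, `read_summaries("testout/file.out")` would return one list of sentences per generated summary. A line without any tags is returned as a single sentence.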
BibTeX entry for the paper:
@inproceedings{eyal-etal-2019-question,
    title = "Question Answering as an Automatic Evaluation Metric for News Article Summarization",
    author = "Eyal, Matan and
      Baumel, Tal and
      Elhadad, Michael",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/N19-1395",
    pages = "3938--3948",
}