BUT-FIT at SemEval-2019 Task 7: Determining the Rumour Stance with Pre-Trained Deep Bidirectional Transformers
Authors:
- Martin Fajčík
- Lukáš Burget
- Pavel Smrž
In case of any questions, please email [email protected].
This is the official implementation we used in SemEval-2019 Task 7. Our publication is available here. All models were trained on an RTX 2080 Ti (with 11 GB of memory).
```
@inproceedings{fajcik2019but,
  title={BUT-FIT at SemEval-2019 Task 7: Determining the Rumour Stance with Pre-Trained Deep Bidirectional Transformers},
  author={Fajcik, Martin and Smrz, Pavel and Burget, Lukas},
  booktitle={Proceedings of the 13th International Workshop on Semantic Evaluation},
  pages={1097--1104},
  year={2019}
}
```
Since each trained model checkpoint is 1.3 GB in size, we do not provide them online. To replicate the ensemble results from the paper, we instead provide pre-calculated predictions from these trained models for the validation and test sets, saved as numpy arrays in the `predictions` folder.
Running `replicate_ensemble_results.py` replicates the ensemble results directly.
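As a rough illustration of what ensembling from saved predictions looks like, here is a minimal sketch that averages per-model class distributions; the file naming and array shapes are assumptions, and the actual logic lives in `replicate_ensemble_results.py`.

```python
import glob
import numpy as np

# Minimal sketch of distribution averaging; the file naming below is a
# hypothetical assumption -- the actual ensembling logic lives in
# replicate_ensemble_results.py.
pred_files = sorted(glob.glob("predictions/val_*.npy"))
# Each array is assumed to hold per-example class probabilities,
# with shape (n_examples, n_classes).
preds = np.stack([np.load(f) for f in pred_files])  # (n_models, n_examples, n_classes)
ensemble = preds.mean(axis=0)                       # average the distributions
labels = ensemble.argmax(axis=1)                    # final class per example
print(labels[:10])
```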
- Make sure the value of `"active_model"` in `configurations/config.json` is set to `"BERT_textonly"` (a snippet for switching this value from code is shown below)
- Run `solver.py`
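If you prefer to switch the active model programmatically rather than editing the file by hand, a plain JSON round-trip is enough; only the key `"active_model"` and the config path come from this repo, the rest is a generic sketch.

```python
import json

CONFIG_PATH = "configurations/config.json"

# Load the repo's config, switch the active model, and write it back.
with open(CONFIG_PATH) as f:
    config = json.load(f)
config["active_model"] = "BERT_textonly"  # or "self_att_with_bert_tokenizer"
with open(CONFIG_PATH, "w") as f:
    json.dump(config, f, indent=2)
```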
Note: BERT often gets stuck in local minima. In our experiments, we kept only runs reaching an F1 of 55 or better on the validation data.
For the sake of convenience, you may want to modify the last line of the `create_model` method in `solutionsA.py` to call `modelframework.fit_multiple` instead of `modelframework.fit`, so that model training runs multiple times (a toy sketch of this restart pattern is shown after the duration note below).
Duration of one training run: ~30 minutes
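To make the restart rationale concrete, here is a self-contained toy sketch of a `fit_multiple`-style wrapper; `fit` here is a random stand-in for a real training run, not the repo's actual framework code.

```python
import random

def fit():
    """Stand-in for one training run; returns a validation F1 score."""
    return random.uniform(40.0, 60.0)  # real runs land in a similarly wide range

def fit_multiple(n_runs=10, min_f1=55.0):
    """Restart training n_runs times and keep only runs above min_f1,
    mirroring the 'F1 of 55 or better on validation' filter above."""
    results = [fit() for _ in range(n_runs)]
    return max(results), [r for r in results if r >= min_f1]

best, kept = fit_multiple()
print(f"best F1: {best:.2f}, runs kept: {len(kept)}")
```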
- Change the value of `"active_model"` in `configurations/config.json` to `"self_att_with_bert_tokenizer"`
- Run `solver.py`
Duration of one training run: ~2.7 minutes
A `tsv` file containing predictions, ground truth, confidence and model inputs of the trained `BiLSTM+SelfAtt` model is available HERE.
A `tsv` file containing predictions, ground truth, confidence and model inputs of the trained TOP-N_s ensemble (our best published result) is available HERE.
Images of multi-head attention from all heads and layers of the trained BERT model, for a fixed data point, are available for download HERE.
An `xlsx` file containing attention visualisations for each input of the validation set in the trained `BiLSTM+SelfAtt` model is available HERE.
The column descriptions are shown in its first row.
For each example, the column 'text' contains the numerical attention values and visualisations of the average over all attention "heads", followed by the attention of each individual "head" (in this row order). Note that by the time attention is applied, the input has already been passed through a 1-layer BiLSTM (see the original paper for more details).
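For readers who want to see the shape of this architecture in code, below is a minimal PyTorch sketch of multi-head self-attentive pooling over a 1-layer BiLSTM (in the spirit of Lin et al., 2017); all dimensions and the head count are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class BiLSTMSelfAtt(nn.Module):
    """Sketch of multi-head self-attentive pooling over a 1-layer BiLSTM.
    Dimensions and head count are illustrative, not the paper's values."""

    def __init__(self, emb_dim=300, hidden=128, heads=4, att_dim=64, n_classes=4):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden, num_layers=1,
                              bidirectional=True, batch_first=True)
        self.att_proj = nn.Linear(2 * hidden, att_dim)   # per-token projection
        self.att_heads = nn.Linear(att_dim, heads, bias=False)
        self.classifier = nn.Linear(heads * 2 * hidden, n_classes)

    def forward(self, x):                                # x: (batch, seq, emb_dim)
        h, _ = self.bilstm(x)                            # (batch, seq, 2*hidden)
        scores = self.att_heads(torch.tanh(self.att_proj(h)))
        att = torch.softmax(scores, dim=1)               # (batch, seq, heads)
        pooled = torch.einsum("bsh,bsd->bhd", att, h)    # weighted sum per head
        return self.classifier(pooled.flatten(1)), att   # logits + weights

logits, att = BiLSTMSelfAtt()(torch.randn(2, 10, 300))   # smoke test
```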
The table below shows the relative F1 difference per single sample for each class, i.e., the increase in F1 score if one more example of that class is classified correctly:
| Class | F1 difference in % |
|---|---|
| Query | 0.219465 |
| Support | 0.1746285 |
| Deny | 0.2876426 |
| Comment | 0.0849897 |
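The effect can be reproduced on synthetic data: with macro-averaged F1, correcting one example of a rare class (such as Deny) moves the score more than correcting one example of a frequent class (such as Comment). The labels and counts below are made up purely for illustration, not taken from the real RumourEval data.

```python
from sklearn.metrics import f1_score

# Synthetic labels purely for illustration -- not the real RumourEval data.
y_true = ["deny"] * 10 + ["query"] * 20 + ["support"] * 20 + ["comment"] * 50
y_pred = list(y_true)
y_pred[0] = "comment"                       # misclassify one 'deny' example

before = f1_score(y_true, y_pred, average="macro")
y_pred[0] = "deny"                          # correct that single example
after = f1_score(y_true, y_pred, average="macro")
print(f"macro-F1 gain from one corrected 'deny': {100 * (after - before):.4f} %")
```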
- Install the environment from `environment.yml`:
  `conda env create -f environment.yml`
- Activate the environment:
  `source activate Rumoureval2019`
- Download the English models for spaCy:
  `python -m spacy download en`
- Download the data and change the paths accordingly; all paths can be changed in `data_preprocessing/paths.py`
- Run everything from the root directory (so the working directory is the root directory) and set `PYTHONPATH` to the root directory:
  `export PYTHONPATH=<your_project_root_directory>`
- Process the data:
  `python data_preprocessing/prep_pipeline.py`
The preprocessed data should be available in `data_preprocessing/saved_data_RumEval2019`.
Then you should be able to run the `BERT_textonly` model:
`python solver.py`