MuMiN Baselines

This repository contains implementations of baseline models on the MuMiN dataset, introduced in Nielsen and McConville, "MuMiN: A Large-Scale Multilingual Multimodal Fact-Checked Misinformation Social Network Dataset" (2021).

Reproducing model baselines

To run the baselines, use the src/train.py script, which centralises all the training scripts. It accepts many parameters, of which the following are mandatory:

model_type: The type of model you want to benchmark. Can be 'claim', 'tweet', 'image' or 'graph'.
size: The size of the MuMiN dataset to run the benchmark on.
task: Only relevant if model_type=='graph', in which case it determines whether you want to benchmark the graph model on the claim classification task or the tweet classification task.

Call python src/train.py --help for a more detailed list of all the arguments that can be used.
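As a rough illustration of how the three parameters above fit together, the sketch below builds a command line for src/train.py and checks the combination before running anything. The helper name build_train_command, the --model_type/--size/--task flag spellings, and the 'claim'/'tweet' task values are assumptions made for this example, not taken from the script itself; check them against the output of --help before relying on them.

```python
from typing import List, Optional

# Model types listed in this README.
MODEL_TYPES = {"claim", "tweet", "image", "graph"}
# Assumed spellings for the two graph tasks (claim vs. tweet classification).
GRAPH_TASKS = {"claim", "tweet"}


def build_train_command(
    model_type: str, size: str, task: Optional[str] = None
) -> List[str]:
    """Build an argument list for src/train.py (flag names are assumptions)."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"model_type must be one of {sorted(MODEL_TYPES)}")
    # The task only matters for the graph model, so it is only required there.
    if model_type == "graph" and task not in GRAPH_TASKS:
        raise ValueError(f"graph models need a task in {sorted(GRAPH_TASKS)}")

    cmd = ["python", "src/train.py", "--model_type", model_type, "--size", size]
    if task is not None:
        cmd += ["--task", task]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True), for example with build_train_command("graph", "small", task="claim"), where "small" is an assumed size value. If the script requires a task for every model type, pass one explicitly for the non-graph models as well.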

Random/majority baselines

The random and majority baselines are calculated based on the proportion of misinformation labels in the dataset. See the random_majority_macro_f1.ipynb notebook for details.
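To make the idea concrete, here is a minimal sketch of how macro-F1 scores for such baselines can be derived from the label proportion alone. It is not the notebook's code: the function names are hypothetical, the random baseline is assumed to predict each class with probability 0.5, and its score is computed from expected counts, which approximates the expected F1 rather than matching it exactly.

```python
def f1_from_counts(tp: float, fp: float, fn: float) -> float:
    """F1 score from (expected) true positives, false positives, false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0


def majority_macro_f1(p_misinfo: float) -> float:
    """Macro-F1 when every sample is assigned the more frequent class."""
    q = max(p_misinfo, 1 - p_misinfo)  # share of the majority class
    # Majority class: all of its samples are hits, the rest are false positives.
    f1_majority = f1_from_counts(tp=q, fp=1 - q, fn=0.0)
    # Minority class is never predicted, so its F1 is zero.
    f1_minority = 0.0
    return (f1_majority + f1_minority) / 2


def uniform_random_macro_f1(p_misinfo: float) -> float:
    """Approximate macro-F1 of a coin-flip classifier (assumed random baseline)."""
    scores = []
    for prevalence in (p_misinfo, 1 - p_misinfo):
        tp = 0.5 * prevalence        # half of the class is guessed correctly
        fn = 0.5 * prevalence        # the other half is missed
        fp = 0.5 * (1 - prevalence)  # half of the other class is mislabelled
        scores.append(f1_from_counts(tp, fp, fn))
    return sum(scores) / 2
```

Under these assumptions, a heavily imbalanced dataset pushes the majority baseline towards a macro-F1 of 0.5 from below, since one class always scores zero, which is why macro-F1 is a stricter reference point than accuracy here.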

Related Repositories

MuMiN website, the central place for the MuMiN ecosystem, containing tutorials, leaderboards and links to the paper and related repositories.
MuMiN, containing the paper in PDF and LaTeX form.
MuMiN-build, containing the scripts for the Python package mumin, used to compile the dataset and export it to various graph machine learning frameworks.
MuMiN-trawl, containing all the scripts to build MuMiN from scratch.
