This repository contains the code for my Master's thesis "Generating Counterfactual Images for VQA by Editing Question Critical Objects" at WeST, University of Koblenz-Landau. The thesis was supervised by Prof. Dr. Matthias Thimm and Dr. Zeyd Boukhers. The code was written by Timo Hartmann.
We developed this code to generate counterfactual images for a VQA model in order to improve its interpretability. Specifically, the proposed method, CountEx-VQA, uses a Generative Adversarial Network (GAN) to translate an image into a minimally different counterfactual such that the prediction of the VQA model for a given question changes.
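The following is a minimal, self-contained sketch of that objective (illustration only: the toy modules, loss weights, and the way the terms are combined are assumptions made for exposition; the actual architectures and training loop live in train.py and config.py). It combines an adversarial term for realism, an L1 term for minimality, and a term that rewards changing the VQA model's answer:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins for the real networks; the actual generator, discriminator,
# and MUTAN VQA model are defined in this repository.
class ToyGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, 3, padding=1)

    def forward(self, image, question_emb):
        # The real generator conditions on the question; this toy one ignores it.
        return torch.tanh(self.conv(image))

class ToyDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 1, 4, stride=2)

    def forward(self, image):
        return self.conv(image).mean(dim=(1, 2, 3))  # one realism logit per image

class ToyVQA(nn.Module):
    def __init__(self, num_answers=10):
        super().__init__()
        self.fc = nn.Linear(3, num_answers)

    def forward(self, image, question_emb):
        return self.fc(image.mean(dim=(2, 3)))  # answer logits

def generator_loss(G, D, vqa, image, question_emb, lambda_dist=10.0):
    counterfactual = G(image, question_emb)
    # 1) Realism: the discriminator should accept the edited image.
    adv = F.binary_cross_entropy_with_logits(
        D(counterfactual), torch.ones(image.size(0)))
    # 2) Minimality: the counterfactual should stay close to the original image.
    dist = F.l1_loss(counterfactual, image)
    # 3) Prediction change: penalise keeping the (frozen) VQA model's original answer.
    with torch.no_grad():
        original_answer = vqa(image, question_emb).argmax(dim=-1)
    flip = -F.cross_entropy(vqa(counterfactual, question_emb), original_answer)
    return adv + lambda_dist * dist + flip

if __name__ == "__main__":
    G, D, vqa = ToyGenerator(), ToyDiscriminator(), ToyVQA()
    image = torch.rand(2, 3, 64, 64)
    question_emb = torch.rand(2, 16)  # placeholder question embedding
    print(generator_loss(G, D, vqa, image, question_emb))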
The code in this repository requires Python 3. We recommend running it inside an Anaconda environment:
conda create --name countex-vqa python=3.8
conda activate countex-vqa
Next, clone the repository and install the requirements:
cd $HOME
git clone https://github.com/tihartmann/CountEx-VQA.git
cd CountEx-VQA
pip install -r requirements.txt
The data used in the thesis' experiments and the pre-trained MUTAN VQA model can be downloaded as follows:
cd tools/
./download.sh
To reproduce the results from the experiments described in the thesis, run the following code. Note that, by default, the training script uses CUDA if available. If you want to specify the device manually, you can do so by changing line 3 of the config.py file.
cd $HOME/CountEx-VQA
python train.py
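For reference, the device selection is typically a one-liner; a hypothetical version of that line in config.py could look like the following (the exact variable name in the repository may differ):

import torch
# Use the GPU when present, otherwise fall back to the CPU.
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Or pin the device explicitly, e.g. DEVICE = torch.device("cpu")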
If you want to start the training procedure using the pre-trained CountEx-VQA model, download the weights using the link below and set LOAD_MODEL = True in line 15 of config.py.
The pretrained generator and discriminator can be downloaded here.
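If you want to inspect the downloaded weights outside of train.py, a PyTorch checkpoint can usually be opened as follows (a sketch: the file name, location, and checkpoint layout are assumptions; check the repository code for how the weights are actually restored):

import torch

# Example path only; point this at wherever you stored the downloaded file.
checkpoint = torch.load("model/generator.pth", map_location="cpu")
print(type(checkpoint))  # typically a state dict, or a dict wrapping one
# Once the repository's generator class is instantiated:
# generator.load_state_dict(checkpoint)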
A Flask-based web demo is available inside $HOME/CountEx-VQA/demo. To run the demo, you must download the pretrained generator using the link provided in Pretrained Models and store it inside $HOME/CountEx-VQA/model. Next, you can run the web demo locally:
cd $HOME/CountEx-VQA/demo
export FLASK_APP=webserver
flask run
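By default, flask run serves the demo at http://127.0.0.1:5000/, so you can open that address in your browser once the server has started.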
This project uses a pre-trained MUTAN VQA model as described by Ben-younes et al. (2017) in their paper MUTAN: Multimodal Tucker Fusion for Visual Question Answering. The code and pre-trained models can be found in this repository. For spectral normalization, I used this code.
If you use our code, please cite our paper:
@Article{s22062245,
AUTHOR = {Boukhers, Zeyd and Hartmann, Timo and Jürjens, Jan},
TITLE = {COIN: Counterfactual Image Generation for Visual Question Answering Interpretation},
JOURNAL = {Sensors},
VOLUME = {22},
YEAR = {2022},
NUMBER = {6},
ARTICLE-NUMBER = {2245},
URL = {https://www.mdpi.com/1424-8220/22/6/2245},
ISSN = {1424-8220},
DOI = {10.3390/s22062245}
}