Created by Alberto Tamajo, Jakub Dylag, Alessandro Nerla and Laurin Lanz.
TransTab architecture. Figure from the original paper.
In this work, we verify the reproducibility of TransTab: Learning Transferable Tabular Transformers Across Tables as part of the COMP6258 module.
The ubiquity of tabular data in machine learning led Wang & Sun (2022) to introduce a versatile tabular learning framework, Transferable Tabular Transformer (TransTab), capable of modelling variable-column tables. Furthermore, they proposed a novel technique that enables supervised or self-supervised pretraining on multiple tables, as well as finetuning on the target dataset. Given the potential impact of their work, we aim to verify their claims by trying to reproduce their results. Specifically, we try to corroborate the "methods" and "results" reproducibility of their paper.
The results of our reproducibility study are summarised in Report.pdf.
Our experiment results are saved in this repository as pickle files:
- Supervised learning:
supervised_learning.pickle
- Feature Incremental Learning:
incremental_learning.pickle
- Transfer Learning:
transfer_learning.pickle
- Zero-Shot Learning:
zeroshot_learning.pickle
- Supervised and Self-supervised Pretraining:
across_table_pretraining_finetuning.pickle
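The saved results can be loaded with Python's standard pickle module. A minimal sketch follows; the sample dictionary and metric names are hypothetical and only illustrate the round trip, since the exact structure stored in each file may differ.

```python
import pickle

# Hypothetical results object: a dict mapping dataset names to metrics.
# The repository's pickle files can be loaded the same way.
sample = {"credit-g": {"auc": 0.79}, "credit-approval": {"auc": 0.85}}

# Write a demo pickle, then read it back.
with open("supervised_learning_demo.pickle", "wb") as f:
    pickle.dump(sample, f)

with open("supervised_learning_demo.pickle", "rb") as f:
    results = pickle.load(f)
```

To inspect one of the repository's files, replace the filename with, e.g., supervised_learning.pickle.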
The first step is to clone this project:
git clone https://github.com/COMP6258-Reproducibility-Challenge/TransTab-Reproducibility.git
cd TransTab-Reproducibility/
The second step is to create a conda environment from our environment.yml:
conda env create -f environment.yml
conda activate TranstabReproducibility
The third step is to run the desired reproducibility experiment:
- Supervised learning:
python supervised_learning.py
- Feature Incremental Learning:
python incremental_learning.py
- Transfer Learning:
python transfer_learning.py
- Zero-Shot Learning:
python zeroshot_learning.py
- Supervised and Self-supervised Pretraining:
python supervised_selfsupervised_pretrain_finetuning.py
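To run every experiment in sequence, a simple loop over the script names listed above can be used. This is a convenience sketch, not part of the repository; the actual invocation line is commented out so the loop only prints what it would run.

```shell
# Script names taken from the list above.
scripts="supervised_learning incremental_learning transfer_learning \
zeroshot_learning supervised_selfsupervised_pretrain_finetuning"

for script in $scripts; do
    echo "Running ${script}.py"
    # python "${script}.py"   # uncomment to actually execute each experiment
done
```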
Alternatively, you can upload this repository's files into a Google Colab session and use the Transtab.ipynb file.
We verified TransTab's reproducibility using TransTab's code package v0.0.2. On 05/04/2023, v0.0.5 was released. Below, we list our code and the code retrieved from the original repository.
- Our code:
Rankings.ipynb
Transtab.ipynb
incremental_learning.py
supervised_learning.py
supervised_selfsupervised_pretrain_finetuning.py
transfer_learning.py
zeroshot_learning.py
- Original code:
constants.py
dataset.py
evaluator.py
load.py
modeling_transtab.py
trainer.py
trainer_utils.py
transtab.py