
A Reproducibility Study of TransTab: Learning Transferable Tabular Transformers Across Tables

Created by Alberto Tamajo, Jakub Dylag, Alessandro Nerla and Laurin Lanz.

TransTab architecture (figure from the original paper).

Introduction

In this work, we verify the reproducibility of TransTab: Learning Transferable Tabular Transformers Across Tables as part of the COMP6258 module.

The ubiquity of tabular data in machine learning led Wang & Sun (2022) to introduce a versatile tabular learning framework, the Transferable Tabular Transformer (TransTab), capable of modelling variable-column tables. They also proposed a novel technique that enables supervised or self-supervised pretraining on multiple tables, followed by finetuning on a target dataset. Given the potential impact of their work, we aim to verify their claims by reproducing their results. Specifically, we try to corroborate both the 'methods' and the 'results' reproducibility of their paper.

The results of our reproducibility study are summarised in Report.pdf.

Experiment results

Our experiment results are saved in this repository as pickle files:

  • Supervised learning: supervised_learning.pickle
  • Feature Incremental Learning: incremental_learning.pickle
  • Transfer Learning: transfer_learning.pickle
  • Zero-Shot Learning: zeroshot_learning.pickle
  • Supervised and Self-supervised Pretraining: across_table_pretraining_finetuning.pickle
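The saved results can be inspected with Python's standard pickle module. A minimal sketch (the internal layout of each pickle is an assumption here; we illustrate it as a dictionary of metrics, so adjust the inspection loop to the actual structure):

```python
import pickle

def load_results(path):
    """Load a saved experiment-results pickle and return its contents."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Example usage (assumes the pickle holds a dict, e.g. dataset -> metric):
# results = load_results("supervised_learning.pickle")
# for name, value in results.items():
#     print(name, value)
```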

How to run the reproducibility experiments

Clone this project

The first step is to clone this project:

git clone https://github.com/COMP6258-Reproducibility-Challenge/TransTab-Reproducibility.git
cd TransTab-Reproducibility/

Conda environment

The second step is to create a conda environment from our environment.yml:

conda env create -f environment.yml
conda activate TranstabReproducibility

Run the desired reproducibility experiment

The third step is to run the desired reproducibility experiment:

  • Supervised learning: python supervised_learning.py
  • Feature Incremental Learning: python incremental_learning.py
  • Transfer Learning: python transfer_learning.py
  • Zero-Shot Learning: python zeroshot_learning.py
  • Supervised and Self-supervised Pretraining: python supervised_selfsupervised_pretrain_finetuning.py

Google Colab alternative

Alternatively, you can upload this repository's files into a Google Colab session and use the Transtab.ipynb notebook.

Code

We verified TransTab's reproducibility using version 0.0.2 of the transtab code package; version 0.0.5 was released on 05/04/23. Below, we list our code alongside the files retrieved from the original repository.

  • Our code:
    • Rankings.ipynb
    • Transtab.ipynb
    • incremental_learning.py
    • supervised_learning.py
    • supervised_selfsupervised_pretrain_finetuning.py
    • transfer_learning.py
    • zeroshot_learning.py
  • Original code:
    • constants.py
    • dataset.py
    • evaluator.py
    • load.py
    • modeling_transtab.py
    • trainer.py
    • trainer_utils.py
    • transtab.py
