DSCT

By Xinpeng Li, Teng Wang, Jian Zhao, Shuyi Mao, Jinbao Wang, Feng Zheng, Xiaojiang Peng†, Xuelong Li†

This repository contains the official implementation of the paper Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer (ACM MM 2024).

Introduction

Decoupled Subject-Context Transformer (DSCT) is a single-stage emotion recognition approach for simultaneous subject localization and emotion classification.

(Introduction figure)

License

This project is released under the Apache 2.0 license.

Installation

Acknowledgement

The project is built upon deformable_detr.

Requirements

  • Linux
  • CUDA >= 9.2
  • 5.4 <= GCC <= 9.5
  • Python >= 3.7
  • PyTorch >= 1.5.1
  • torchvision >= 0.6.1

Installation

Clone the repo:

git clone [email protected]:Sampson-Lee/DSCT.git
cd DSCT

Create and activate the environment:

conda create -n dsct python=3.10 pip
conda activate dsct 

Install PyTorch and dependencies:

pip install torch torchvision torchaudio
pip install -r requirements.txt
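
Optionally, you can confirm the PyTorch side of the requirements listed above with a minimal Python sketch (run it inside the activated environment):

import torch, torchvision

print("torch:      ", torch.__version__)        # expect >= 1.5.1
print("torchvision:", torchvision.__version__)  # expect >= 0.6.1
print("CUDA build: ", torch.version.cuda)       # expect >= 9.2
print("GPU visible:", torch.cuda.is_available())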

Compile CUDA operators:

cd ./models/ops
sh ./make.sh

Run the unit test for the operators; all checks should pass (the script may eventually stop with a CUDA out-of-memory error at its largest test case, which is expected):

python test.py

Usage

Dataset

Please download the EMOTIC and CAER-S datasets.

We provide COCO-format annotations, i.e. emotic_{train, val, test}_bi.json and caer_{train, val, test}.json, along with preprocessing scripts, in ./datasets.
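
If you want to inspect one of these annotation files before training, a minimal sketch (the filename and location are assumed from the pattern above, and the keys are the standard COCO ones implied by "COCO-format"):

import json

ann_file = "./datasets/emotic_train_bi.json"  # assumed filename/location

with open(ann_file) as f:
    coco = json.load(f)

# Standard COCO-format top-level keys.
print("images:     ", len(coco["images"]))
print("annotations:", len(coco["annotations"]))
print("categories: ", [c["name"] for c in coco["categories"]])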

Modify the dataset code or your directory layout so that they match. The expected directory tree is shown below (a small layout sanity-check sketch follows it):

emotic
└── images
    ├── ade20k
    │   └── images
    │        ├── xxx.jpg
    ├── framesdb
    │   └── images
    │        ├── xxx.jpg
    ├── mscoco
    │   └── images
    │        ├── xxx.jpg
    └── emodb_small
        └── images
             ├── xxx.jpg
caer
├── train
│   ├── Anger
│   │   ├── xxx.jpg
│   ├── Disgust
│   ├── Fear
│   ├── Happy
│   ├── Neutral
│   ├── Sad
│   └── Surprise
└── test
    ├── Anger
    ├── Disgust
    ├── Fear
    ├── Happy
    ├── Neutral
    ├── Sad
    └── Surprise                        

Running

Please place the pretrained Deformable DETR weights (e.g. r50_deformable_detr-checkpoint.pth, as referenced in the example command below) in the working directory.

Then, follow run.sh to conduct training, testing, or visualization.

Below is an example command. The key options:

  • --dataset_file: choose between emotic and caer
  • --binary_flag: set 1 for multi-label tasks, 0 for multi-class tasks
  • --backbone: select resnet50 or resnet101
  • --batch_size: adjust to fit your GPU memory
  • --cls_loss_coef: coefficient of the classification loss
  • --num_queries: number of queries
  • --pretrained_weights: path to the pretrained Deformable DETR weights

YOUR_DATA_PATH=/home/lxp/data/emotic
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch \
    --nproc_per_node=1 \
    --master_port=29507 \
    --use_env main.py \
    --dataset_file=emotic \
    --binary_flag=1 \
    --backbone=resnet50 \
    --detr=deformable_detr_dsct \
    --model=deformable_transformer_dsct \
    --batch_size=1 \
    --cls_loss_coef=5 \
    --data_path=$YOUR_DATA_PATH \
    --output_dir=$YOUR_DATA_PATH/checkpoints \
    --epochs=50 \
    --lr_drop=40 \
    --num_queries=4 \
    --pretrained_weights=./r50_deformable_detr-checkpoint.pth
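
To train on multiple GPUs, list them in CUDA_VISIBLE_DEVICES and raise --nproc_per_node to match; pick a different --master_port if several jobs share the same machine.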

Citation

If you find this project helpful, please consider citing our paper and starring our repository.

@inproceedings{li2024two,
  title={Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer},
  author={Li, Xinpeng and Wang, Teng and Zhao, Jian and Mao, Shuyi and Wang, Jinbao and Zheng, Feng and Peng, Xiaojiang and Li, Xuelong},
  booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
  pages={9340--9349},
  year={2024}
}
