MODDP: A Multi-modal Open-domain Chinese Dataset for Dialogue Discourse Parsing

This repository contains the implementation of the paper "MODDP: A Multi-modal Open-domain Chinese Dataset for Dialogue Discourse Parsing", which was accepted to Findings of ACL 2024.

Abstract

Dialogue discourse parsing (DDP) aims to capture the relations between utterances in a dialogue. In everyday real-world scenarios, dialogues are typically multi-modal and cover open-domain topics. However, most widely used benchmark datasets for DDP contain only the textual modality and are domain-specific. This makes it challenging to understand a dialogue accurately and comprehensively without multi-modal clues, and prevents these datasets from capturing the discourse structures of the more prevalent daily conversations. This paper proposes MODDP, the first multi-modal Chinese discourse parsing dataset derived from open-domain daily dialogues, consisting of 864 dialogues and 18,114 utterances, accompanied by 12.7 hours of video clips. We present a simple yet effective benchmark approach for multi-modal DDP. Through extensive experiments, we present several benchmark results based on MODDP. The significant improvement in performance from introducing multi-modalities into the original textual unimodal DDP model demonstrates the necessity of integrating multi-modalities into DDP.

Requirements

PyTorch >= 2.1.1

Transformers >= 4.18.0
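
To confirm that an environment meets these minimums, a small check along the following lines may help. It is a sketch and not part of the repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the comparison only looks at the leading numeric part of a version string, so suffixes such as `+cu121` or `rc1` are ignored.

```python
import re
from importlib import metadata

# Minimum versions listed above, keyed by PyPI distribution name.
MINIMUMS = {"torch": "2.1.1", "transformers": "4.18.0"}


def parse_version(text):
    """Turn '2.1.1+cu121' into (2, 1, 1); non-numeric suffixes are ignored."""
    core = re.match(r"\d+(\.\d+)*", text)
    if core is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in core.group(0).split("."))


def meets_minimum(installed, minimum):
    """Return True when `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment():
    """Map each required package to True, False, or None (not installed)."""
    report = {}
    for name, minimum in MINIMUMS.items():
        try:
            report[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    labels = {True: "OK", False: "too old", None: "missing"}
    for name, status in check_environment().items():
        print(name, labels[status])
```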

Data Preparation

The text data can be loaded directly from the dataset folder. The image and audio features are distributed separately as a download, all_features.pkl.

If the link is broken or you need the original video data, please contact [email protected].
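
The internal layout of all_features.pkl is not documented here, so the following sketch only loads the file and summarises its top-level object without assuming a schema. The function names and the file location are hypothetical. Two general facts about pickle apply: only unpickle files from a source you trust, and any library whose objects are stored in the file must be importable when it is loaded.

```python
import pickle
from pathlib import Path


def load_features(path):
    """Load the pickled feature object stored at `path`."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def summarize(obj, max_items=5):
    """Describe the top-level container without assuming its schema."""
    lines = [f"type: {type(obj).__name__}"]
    if hasattr(obj, "__len__"):
        lines.append(f"length: {len(obj)}")
    if isinstance(obj, dict):
        # Show a few keys and the type of each value.
        for key in list(obj)[:max_items]:
            lines.append(f"  {key!r}: {type(obj[key]).__name__}")
    return lines


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever the file was downloaded.
    feature_file = Path("all_features.pkl")
    if feature_file.exists():
        print("\n".join(summarize(load_features(feature_file))))
    else:
        print(f"{feature_file} not found")
```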

Training

python main.py \
    --config_file ./config.cfg \
    --seed 42 \
    --postfix experiments/train \
    --text_plm_name_or_path /path/to/roberta \
    --vision_plm_name_or_path /path/to/vit \
    --audio_plm_name_or_path /path/to/wav2vec2 \
    --bert_path /path/to/bert

Or run directly

bash run.sh
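
When several seeds are compared, the training command can also be assembled programmatically. The sketch below only rebuilds the flags shown in this section. The helper names, the seed list, and the per-seed `--postfix` naming are illustrative assumptions (how `main.py` uses `--postfix` is not described here), and the model paths remain placeholders.

```python
import shlex
import subprocess
import sys


def build_train_command(seed, paths, postfix="experiments/train"):
    """Return the argv list for one training run with the given seed."""
    return [
        sys.executable, "main.py",
        "--config_file", "./config.cfg",
        "--seed", str(seed),
        # Assumption: a distinct postfix per seed keeps runs apart.
        "--postfix", f"{postfix}_seed{seed}",
        "--text_plm_name_or_path", paths["text"],
        "--vision_plm_name_or_path", paths["vision"],
        "--audio_plm_name_or_path", paths["audio"],
        "--bert_path", paths["bert"],
    ]


def run_seeds(seeds, paths, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for seed in seeds:
        command = build_train_command(seed, paths)
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, check=True)


# Placeholders copied from the command above; replace with real paths.
PLACEHOLDER_PATHS = {
    "text": "/path/to/roberta",
    "vision": "/path/to/vit",
    "audio": "/path/to/wav2vec2",
    "bert": "/path/to/bert",
}

if __name__ == "__main__":
    run_seeds([42, 43, 44], PLACEHOLDER_PATHS)  # dry run: prints only
```

With `dry_run=True` nothing is executed, which makes it easy to inspect the commands before starting long training runs.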

Prediction and Evaluation

python main.py \
    --config_file ./config.cfg \
    --seed 42 \
    --postfix experiments/predict \
    --text_plm_name_or_path /path/to/roberta \
    --vision_plm_name_or_path /path/to/vit \
    --audio_plm_name_or_path /path/to/wav2vec2 \
    --bert_path /path/to/bert \
    --ckpt_path /path/to/best/model \
    --train False \
    --predict True
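
The command passes `--train False` and `--predict True` as strings. How `main.py` converts them is not shown here. As a general caution, `argparse` with `type=bool` treats any non-empty string, including `False`, as true, so boolean flags of this form need an explicit converter. A minimal sketch follows; `str2bool`, `make_parser`, and the default values are hypothetical and not taken from the repository.

```python
import argparse


def str2bool(value):
    """Convert common true/false spellings to a bool; reject anything else."""
    lowered = value.strip().lower()
    if lowered in {"true", "t", "yes", "y", "1"}:
        return True
    if lowered in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def make_parser():
    """Hypothetical parser with the two mode flags used in the command above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--train", type=str2bool, default=True)
    parser.add_argument("--predict", type=str2bool, default=False)
    return parser


if __name__ == "__main__":
    args = make_parser().parse_args(["--train", "False", "--predict", "True"])
    print(args.train, args.predict)
```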

Citation

If you find this repo helpful, please cite the following paper:

@inproceedings{gong-etal-2024-moddp,
    title = "{MODDP}: A Multi-modal Open-domain {C}hinese Dataset for Dialogue Discourse Parsing",
    author = "Gong, Chen  and
      Kong, DeXin  and
      Zhao, Suxian  and
      Li, Xingyu  and
      Fu, Guohong",
    editor = "Ku, Lun-Wei  and
      Martins, Andre  and
      Srikumar, Vivek",
    booktitle = "Findings of the Association for Computational Linguistics ACL 2024",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand and virtual meeting",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-acl.628",
    doi = "10.18653/v1/2024.findings-acl.628",
    pages = "10561--10573",
}
