DPSL-ASR (Dual-Path Style Learning for End-to-End Noise-Robust Automatic Speech Recognition)

Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition

Interactive Feature Fusion for End-to-End Noise-Robust Speech Recognition

Introduction

DPSL-ASR is a novel method for end-to-end noise-robust speech recognition. It extends our prior work, IFF-Net (Interactive Feature Fusion Network), with dual-path inputs and style learning, and achieves better ASR performance on the RATS Channel-A dataset and the CHiME-4 1-Channel Track dataset.

Left figure: (a) joint SE-ASR approach, (b) IFF-Net baseline, (c) the proposed DPSL-ASR approach.

Right figure: the back-end ASR module with style learning and consistency loss in DPSL-ASR. The dashed lines denote shared parameters.

If you find DPSL-ASR useful in your research, please cite it using the following BibTeX entries:

@article{hu2022dualpath,
  title={Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition}, 
  author={Hu, Yuchen and Hou, Nana and Chen, Chen and Chng, Eng Siong},
  journal={arXiv preprint arXiv:2203.14838},
  year={2022}
}

@article{hu2021interactive,
  title={Interactive Feature Fusion for End-to-End Noise-Robust Speech Recognition},
  author={Hu, Yuchen and Hou, Nana and Chen, Chen and Chng, Eng Siong},
  journal={arXiv preprint arXiv:2110.05267},
  year={2021}
}

Usage

Our implementation is based on ESPnet. You can install it directly from the ESPnet (v.0.9.6) folder provided in this repo, or install ESPnet from the official website and then add the files from our repo. Use the command pip install -e . to install ESPnet.

In our folder, the running scripts are at egs2/rats_chA/asr_with_enhancement/{run_rats_chA_dpsl_asr, rats_chA_dpsl_asr}.sh, and the network code is at espnet2/{asr/, enh/, layers/}.

Tips:

  1. To go over the entire project, please start from the script egs2/rats_chA/asr_with_enhancement/run_rats_chA_dpsl_asr.sh
  2. To read the network code only, please start from the script espnet2/asr/dpsl_asr.py
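
If you install ESPnet from the official website and copy our files in afterwards, a quick sanity check can confirm that the paths mentioned above are in place. The following is only a minimal sketch, not part of the repo: the helper name `missing_paths` and the list `EXPECTED_PATHS` are hypothetical, and the list covers only the paths named in this README. Note that espnet2/enh and espnet2/layers also exist in a stock ESPnet checkout, so finding them does not prove our files were copied.

```python
from pathlib import Path

# Paths named in this README, relative to the ESPnet root folder.
# (Hypothetical list for illustration; the repo may contain more files.)
EXPECTED_PATHS = [
    "egs2/rats_chA/asr_with_enhancement/run_rats_chA_dpsl_asr.sh",
    "egs2/rats_chA/asr_with_enhancement/rats_chA_dpsl_asr.sh",
    "espnet2/asr/dpsl_asr.py",
    "espnet2/enh",
    "espnet2/layers",
]


def missing_paths(espnet_root):
    """Return the README-listed paths that do not exist under espnet_root."""
    root = Path(espnet_root)
    # Keep the original order so the report matches the list above.
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]
```

Calling `missing_paths(".")` from the ESPnet root folder returns an empty list when every listed path is present, and otherwise returns the paths that still need to be added.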