From f42716d709245b6243bef229322883803d4071aa Mon Sep 17 00:00:00 2001
From: Ziyang Ma
Date: Sun, 17 Nov 2024 22:51:08 +0800
Subject: [PATCH] update README

---
 README.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/README.md b/README.md
index c3c7b45d..cd69a9b7 100644
--- a/README.md
+++ b/README.md
@@ -28,6 +28,7 @@ developers to train custom multimodal large language model (MLLM), focusing on <
 6. [Citation](#citation)
 
 # News
+- [Update Nov. 17, 2024] Recipes for [LLM-Based Contextual ASR](examples/contextual_asr/README.md) have been supported.
 - [Update Nov. 5, 2024] Recipes for [speech emotion captioning (SEC)](examples/sec_emotioncaps/README.md) with [emotion2vec](https://github.com/ddlBoJack/emotion2vec) as the encoder has been supported.
 - [Update Oct. 12, 2024] Recipes for [SLAM-AAC](examples/slam_aac/README.md) with [EAT](https://github.com/cwx-worst-one/EAT) as the encoder have been supported.
 - [Update Sep. 28, 2024] Recipes for [CoT-ST](examples/st_covost2/README.md) have been supported.
@@ -84,6 +85,7 @@ We provide reference implementations of various LLM-based speech, audio, and mus
 
 - Contextual Automatic Speech Recognition (CASR)
     - [ Mala-ASR](examples/mala_asr_slidespeech/README.md)
+    - [LLM-Based Contextual ASR](examples/contextual_asr/README.md)
 
 - [Visual Speech Recognition (VSR)](examples/vsr_LRS3/README.md)
 - Speech-to-Text Translation (S2TT)
@@ -142,6 +144,15 @@ Mala-ASR:
   year={2024}
 }
 ```
+LLM-Based Contextual ASR:
+```
+@article{yang2024ctc,
+  title={CTC-Assisted LLM-Based Contextual ASR},
+  author={Yang, Guanrou and Ma, Ziyang and Gao, Zhifu and Zhang, Shiliang and Chen, Xie},
+  journal={Proc. SLT},
+  year={2024}
+}
+```
 CoT-ST:
 ```
 @article{du2024cot,