diff --git a/egs2/README.md b/egs2/README.md
index 3044916214c..28b95808dc4 100644
--- a/egs2/README.md
+++ b/egs2/README.md
@@ -78,6 +78,7 @@ See: https://espnet.github.io/espnet/espnet2_tutorial.html#recipes-using-espnet2
 | iwslt21_low_resource | ALFFA, IARPA Babel, Gamayun, IWSLT 2021 | ASR | SWA | http://www.openslr.org/25/ https://catalog.ldc.upenn.edu/LDC2017S05 https://gamayun.translatorswb.org/data/ https://iwslt.org/2021/low-resource | |
 | iwslt22_dialect | IWSLT2022 dialectal speech translation shared task | ASR/ST | ARA->Tunisian ARA | https://github.com/kevinduh/iwslt22-dialect.git | |
 | iwslt22_low_resource | IWSLT2022 Low-resource speech translation track task | ST | Tamasheq->FrenchPermalink | https://github.com/mzboito/IWSLT2022_Tamasheq_data.git |
+| iwslt24_indic | IWSLT2024 Indic speech translation track | ST | ENG -> HIN, BEN, TAM | https://iwslt.org/2024/indic | |
 | jdcinal | Japanese Dialogue Corpus of Information Navigation and Attentive Listening Annotated with Extended ISO-24617-2 Dialogue Act Tags | SLU | JPN | http://www.lrec-conf.org/proceedings/lrec2018/pdf/464.pdf http://tts.speech.cs.cmu.edu/awb/infomation_navigation_and_attentive_listening_0.2.zip | |
 | jkac | J-KAC: Japanese Kamishibai and audiobook corpus | TTS | JPN | https://sites.google.com/site/shinnosuketakamichi/research-topics/j-kac_corpus | |
 | jmd | JMD: Japanese multi-dialect corpus for speech synthesis | TTS | JPN | https://sites.google.com/site/shinnosuketakamichi/research-topics/jmd_corpus | |
diff --git a/egs2/TEMPLATE/asr1/db.sh b/egs2/TEMPLATE/asr1/db.sh
index 5f4417b642e..6d778cf84e1 100755
--- a/egs2/TEMPLATE/asr1/db.sh
+++ b/egs2/TEMPLATE/asr1/db.sh
@@ -169,6 +169,7 @@ CMU_INDIC=downloads
 INDIC_SPEECH=downloads
 IWSLT22_DIALECT=
 IWSLT22_LOW_RESOURCE=downloads
+IWSLT24_INDIC=
 JKAC=
 MUCS_SUBTASK1=downloads
 MUCS_SUBTASK2=downloads
diff --git a/egs2/iwslt24_indic/st1/RESULTS.md b/egs2/iwslt24_indic/st1/RESULTS.md
new file mode 100644
index 00000000000..f592efef06e
--- /dev/null
+++ b/egs2/iwslt24_indic/st1/RESULTS.md
@@ -0,0 +1,67 @@
+# RESULTS
+
+## En-Hi
+
+### Environments
+- date: `Thu Apr 18 01:34:53 JST 2024`
+- python version: `3.10.14 (main, Mar 21 2024, 16:24:04) [GCC 11.2.0]`
+- espnet version: `espnet 202402`
+- pytorch version: `pytorch 2.1.0`
+- Git hash: `83c179ab842987cf01642df2db372aaae260df55`
+  - Commit date: `Wed Apr 17 00:28:29 2024 +0900`
+
+### Model config
+
+- training: [./conf/tuning/train_st_conformer.yaml](./conf/tuning/train_st_conformer.yaml)
+- decoding: [./conf/tuning/decode_st_conformer.yaml](./conf/tuning/decode_st_conformer.yaml)
+- model url: [https://huggingface.co/espnet/iwslt24_indic_en_hi_bpe_tc4000](https://huggingface.co/espnet/iwslt24_indic_en_hi_bpe_tc4000)
+
+### BLEU
+
+|dataset|score|verbose_score|
+|---|---|---|
+|decode_st_conformer_st_model_valid.acc.ave/dev.en-hi|37.1|64.8/44.9/34.2/26.2 (BP = 0.924 ratio = 0.927 hyp_len = 195297 ref_len = 210636)|
+
+## En-Bn
+
+### Environments
+- date: `Wed Apr 17 02:51:38 JST 2024`
+- python version: `3.10.14 (main, Mar 21 2024, 16:24:04) [GCC 11.2.0]`
+- espnet version: `espnet 202402`
+- pytorch version: `pytorch 2.1.0`
+- Git hash: `83c179ab842987cf01642df2db372aaae260df55`
+  - Commit date: `Wed Apr 17 00:28:29 2024 +0900`
+
+### Model config
+
+- training: [./conf/tuning/train_st_conformer.yaml](./conf/tuning/train_st_conformer.yaml)
+- decoding: [./conf/tuning/decode_st_conformer.yaml](./conf/tuning/decode_st_conformer.yaml)
+- model url: [https://huggingface.co/espnet/iwslt24_indic_en_bn_bpe_tc4000](https://huggingface.co/espnet/iwslt24_indic_en_bn_bpe_tc4000)
+
+### BLEU
+
+|dataset|score|verbose_score|
+|---|---|---|
+|decode_st_conformer_st_model_valid.acc.ave/dev.en-bn|2.1|19.7/3.6/1.0/0.3 (BP = 1.000 ratio = 1.185 hyp_len = 46094 ref_len = 38883)|
+
+## En-Ta
+
+### Environments
+- date: `Thu Apr 18 01:03:59 JST 2024`
+- python version: `3.10.14 (main, Mar 21 2024, 16:24:04) [GCC 11.2.0]`
+- espnet version: `espnet 202402`
+- pytorch version: `pytorch 2.1.0`
+- Git hash: `83c179ab842987cf01642df2db372aaae260df55`
+  - Commit date: `Wed Apr 17 00:28:29 2024 +0900`
+
+### Model config
+
+- training: [./conf/tuning/train_st_conformer.yaml](./conf/tuning/train_st_conformer.yaml)
+- decoding: [./conf/tuning/decode_st_conformer.yaml](./conf/tuning/decode_st_conformer.yaml)
+- model url: [https://huggingface.co/espnet/iwslt24_indic_en_ta_bpe_tc4000](https://huggingface.co/espnet/iwslt24_indic_en_ta_bpe_tc4000)
+
+### BLEU
+
+|dataset|score|verbose_score|
+|---|---|---|
+|decode_st_conformer_st_model_valid.acc.ave/dev.en-ta|6.3|46.5/9.4/4.7/1.9 (BP = 0.798 ratio = 0.816 hyp_len = 66168 ref_len = 81059)|
diff --git a/egs2/iwslt24_indic/st1/cmd.sh b/egs2/iwslt24_indic/st1/cmd.sh
new file mode 100644
index 00000000000..2aae6919fef
--- /dev/null
+++ b/egs2/iwslt24_indic/st1/cmd.sh
@@ -0,0 +1,110 @@
+# ====== About run.pl, queue.pl, slurm.pl, and ssh.pl ======
+# Usage: <cmd>.pl [options] JOB=1:<nj> <log> <command...>
+# e.g.
+#   run.pl --mem 4G JOB=1:10 echo.JOB.log echo JOB
+#
+# Options:
+#   --time