From b22c06ea55d5e42783505a358f41fe5582c45ca2 Mon Sep 17 00:00:00 2001
From: ftshijt
Date: Thu, 25 Jan 2024 11:02:48 -0500
Subject: [PATCH] add tables

---
 ...eech2024-Discrete-Speech-Unit-Challenge.md | 82 +++++++++++++++++--
 1 file changed, 76 insertions(+), 6 deletions(-)

diff --git a/_posts/2024-01-19-Interspeech2024-Discrete-Speech-Unit-Challenge.md b/_posts/2024-01-19-Interspeech2024-Discrete-Speech-Unit-Challenge.md
index bd36d9d9..9f9a484a 100644
--- a/_posts/2024-01-19-Interspeech2024-Discrete-Speech-Unit-Challenge.md
+++ b/_posts/2024-01-19-Interspeech2024-Discrete-Speech-Unit-Challenge.md
@@ -36,16 +36,86 @@ Participation is open to all. Each team can participate in any task. This challe
   * Results
     * WER is computed on English test sets (dev-clean / dev-other / test-clean / test-other)
     * CER is computed on the multi-lingual test set (test_1h)
-    * Wavlm-large-layer21 results:
-      * Librispeech: dev-clean (4.5), dev-other (8.1), test-clean (4.4), test-other (8.3)
-      * ML-SUPERB: test_1h (72.6)
+<table>
+<thead>
+<tr>
+<th>Model</th>
+<th>dev-clean (LS)</th>
+<th>dev-other (LS)</th>
+<th>test-clean (LS)</th>
+<th>test-other (LS)</th>
+<th>test-1h (ML-SUPERB)</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>Wavlm-large-layer21</td>
+<td>4.5</td>
+<td>8.1</td>
+<td>4.4</td>
+<td>8.3</td>
+<td>72.6</td>
+</tr>
+</tbody>
+</table>
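 - [Text-to-speech (TTS)](https://github.com/espnet/espnet/tree/tts2/egs2/ljspeech/tts2)
   * Results
-    * Full LJSpeech with HuBERT-large units: MCD (7.19), F0 RMSE (0.26), WER (8.1), UTMOS (3.73)
+<table>
+<thead>
+<tr>
+<th>Model</th>
+<th>MCD</th>
+<th>Log F0 RMSE</th>
+<th>WER</th>
+<th>UTMOS</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>HuBERT-base-layer6</td>
+<td>7.19</td>
+<td>0.26</td>
+<td>8.1</td>
+<td>3.73</td>
+</tr>
+</tbody>
+</table>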
 - [Singing voice synthesis (SVS)](https://github.com/A-Quarter-Mile/espnet/tree/tmp_muskit/egs2/opencpop/svs2)
-  * Opencpop with WavLM-large units: MCD (8.47), F0 RMSE (0.18)
+<table>
+<thead>
+<tr>
+<th>Model</th>
+<th>MCD</th>
+<th>Log F0 RMSE</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>WavLM-large-layer6</td>
+<td>8.47</td>
+<td>0.18</td>
+</tr>
+</tbody>
+</table>
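 - [Discrete vocoder training](https://github.com/kan-bayashi/ParallelWaveGAN)
-  * Expresso with HuBERT-large units: MCD (8.37), F0 RMSE (0.34), UTMOS (3.65)
+<table>
+<thead>
+<tr>
+<th>Model</th>
+<th>MCD</th>
+<th>Log F0 RMSE</th>
+<th>UTMOS</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>HuBERT-base-layer6</td>
+<td>8.37</td>
+<td>0.34</td>
+<td>3.65</td>
+</tr>
+</tbody>
+</table>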
### Track-specific dataset