diff --git a/_posts/2024-01-01-owsm.md b/_posts/2024-01-01-owsm.md
index 5170ce9..ccbce19 100644
--- a/_posts/2024-01-01-owsm.md
+++ b/_posts/2024-01-01-owsm.md
@@ -17,7 +17,7 @@ comments: false
## Pre-trained models
-We publicly release a series of pre-trained models. The training logs are also available for major models. We recommend using OWSM v3.1 or later versions for better performance and efficiency.
+We publicly release a series of [pre-trained models](https://huggingface.co/collections/pyf98/open-whisper-style-speech-models-owsm-66d5312c1c9a1508189192cd). The training logs are also available for major models. We recommend using OWSM v3.1 or later versions for better performance and efficiency.
@@ -76,7 +76,7 @@ We publicly release a series of pre-trained models. The training logs are also a
180k |
E-Branchformer |
367M |
- Coming soon |
+ espnet/owsm_v3.1_ebf_small |
egs2/owsm_v3.1/s2t1 |
@@ -88,11 +88,27 @@ We publicly release a series of pre-trained models. The training logs are also a
egs2/owsm_v3.1/s2t1 |
- OWSM v3.1 medium license-free |
+ OWSM v3.1 small low-restriction |
70k |
E-Branchformer |
- 1.02B |
- Coming soon |
+ 367M |
+ espnet/owsm_v3.1_ebf_small_lowrestriction |
+ egs2/owsm_v3.1/s2t1 |
+
+
+ OWSM-CTC v3.1 medium |
+ 180k |
+ E-Branchformer |
+ 1.01B |
+ pyf98/owsm_ctc_v3.1_1B |
+ Check model page |
+
+
+ OWSM v3.2 small |
+ 180k |
+ E-Branchformer |
+ 367M |
+ espnet/owsm_v3.2 |
Coming soon |
@@ -133,9 +149,9 @@ The latest OWSM v3.1 models are trained on a diverse combination of public datas
-The license-free model is trained on a subset of the above data with "free licenses".
+The low-restriction model is trained on a subset of the above data with more flexible licenses.
-OWSM v3.1 license-free data
+OWSM v3.1 low-restriction data
- AMI: CC-BY-4.0
- Common Voice: CC0-1.0
@@ -240,14 +256,16 @@ result = s2t.decode_long(speech)
## Fine-tuning on custom data
-Coming soon!
+Our latest work, "ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration" (accepted to SLT 2024), provides an easier way to fine-tune pre-trained models. We are preparing demos and notebooks. Please stay tuned!
## Papers
-Please cite our papers if you use OWSM in your project.
+Please cite our papers if you find OWSM helpful.
-- Preprint: [OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer](https://arxiv.org/abs/2401.16658)
+- ACL 2024: [OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification](https://aclanthology.org/2024.acl-long.549/)
+- INTERSPEECH 2024: [On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models](https://arxiv.org/abs/2406.09282)
+- INTERSPEECH 2024: [OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer](https://arxiv.org/abs/2401.16658)
- ASRU 2023: [Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data](https://arxiv.org/abs/2309.13876)