Merge pull request espnet#5657 from wanchichen/patch-1
Add E-Branchformer model for FLEURS
mergify[bot] authored Feb 16, 2024
2 parents 7ab5e42 + 77b6c0d commit d8b53fd
Showing 2 changed files with 125 additions and 0 deletions.
31 changes: 31 additions & 0 deletions egs2/fleurs/asr1/README.md
@@ -44,6 +44,37 @@ Finally, the model needs to be configured to use the data.
# RESULTS

<!-- Generated by scripts/utils/show_asr_result.sh -->

# Multilingual ASR - SSL + E-Branchformer + Self-condition [XLS-R, E-Branchformer, utt_mvn, 6500 BPE](conf/tuning/train_asr_ebf_scctc.yaml)

## Environments
- date: `Tue Feb 7 05:54:39 CST 2023`
- python version: `3.8.16 (default, Jan 17 2023, 23:13:24) [GCC 11.2.0]`
- espnet version: `espnet 202211`
- pytorch version: `pytorch 1.13.1+cu116`
- Git hash: `7f37bf7270017eede7d77701b389d1412f30078c`
- Commit date: `Sun Jan 1 13:06:01 2023 -0500`
- Pre-trained model: https://huggingface.co/espnet/wanchichen_fleurs_asr_ebf_scctc

## asr_train_asr_branchformer_scctc_raw_all_bpe6500_sp
### WER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|lm0.4_pen0.0_lm_lm_train_lm_all_bpe6500_valid.loss.ave_asr_model_valid.acc.ave/test_all|77809|1669969|75.2|22.3|2.5|3.0|27.8|95.7|

### CER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|lm0.4_pen0.0_lm_lm_train_lm_all_bpe6500_valid.loss.ave_asr_model_valid.acc.ave/test_all|77809|10235271|93.1|4.3|2.6|2.3|9.2|95.6|

### TER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|lm0.4_pen0.0_lm_lm_train_lm_all_bpe6500_valid.loss.ave_asr_model_valid.acc.ave/test_all|77809|9622352|92.2|5.1|2.6|2.4|10.2|95.5|
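In these tables, every reference token is counted as correct, substituted, or deleted, and the error rate adds insertions on top. A quick sanity check of the WER row above (a sketch; the percentages are copied from the table):

```python
# Sanity-check the WER row: Corr/Sub/Del partition the reference words,
# while Err = Sub + Del + Ins.
corr, sub, dele, ins, err = 75.2, 22.3, 2.5, 3.0, 27.8

# All reference words are either correct, substituted, or deleted.
assert round(corr + sub + dele, 1) == 100.0

# The error rate counts substitutions, deletions, and insertions.
assert round(sub + dele + ins, 1) == err
```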

# Multilingual ASR - SSL + Conformer + Hierarchical LID Self-condition [XLS-R, Conformer, utt_mvn, 6500 BPE](conf/train_asr_conformer_hier_lid_utt.yaml)

## Environments
94 changes: 94 additions & 0 deletions egs2/fleurs/asr1/conf/tuning/train_asr_ebf_scctc.yaml
@@ -0,0 +1,94 @@
batch_type: numel
batch_bins: 40000000
accum_grad: 4
max_epoch: 20
patience: none
# The initialization method for model parameters
init: xavier_uniform
best_model_criterion:
-   - valid
    - acc
    - max
keep_nbest_models: 3

encoder: e_branchformer
encoder_conf:
    output_size: 512
    attention_heads: 8
    attention_layer_type: rel_selfattn
    pos_enc_layer_type: rel_pos
    rel_pos_type: latest
    cgmlp_linear_units: 3072
    cgmlp_conv_kernel: 31
    use_linear_after_conv: false
    gate_activation: identity
    num_blocks: 12
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    attention_dropout_rate: 0.1
    input_layer: conv2d2
    layer_drop_rate: 0.1
    linear_units: 1024
    positionwise_layer_type: linear
    macaron_ffn: true
    use_ffn: true
    merge_conv_kernel: 31
    interctc_layer_idx: [3, 6, 9]
    interctc_use_conditioning: true

decoder: transformer
decoder_conf:
    attention_heads: 8
    linear_units: 2048
    num_blocks: 6
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    self_attention_dropout_rate: 0.1
    src_attention_dropout_rate: 0.1

model_conf:
    ctc_weight: 0.3
    lsm_weight: 0.1
    interctc_weight: 0.5
    length_normalized_loss: false
    extract_feats_in_collect_stats: false

optim: adam
optim_conf:
    lr: 0.002
scheduler: warmuplr
scheduler_conf:
    warmup_steps: 25000

specaug: specaug
specaug_conf:
    apply_time_warp: true
    time_warp_window: 5
    time_warp_mode: bicubic
    apply_freq_mask: true
    freq_mask_width_range:
    - 0
    - 30
    num_freq_mask: 2
    apply_time_mask: true
    time_mask_width_range:
    - 0
    - 40
    num_time_mask: 2

freeze_param: [
    "frontend.upstream"
]

frontend: s3prl
frontend_conf:
    frontend_conf:
        upstream: wav2vec2_url  # Note: If the upstream is changed, please change the input_size in the preencoder.
        path_or_url: https://huggingface.co/s3prl/converted_ckpts/resolve/main/xlsr2_300m.pt
    download_dir: ./hub
    multilayer_feature: True

preencoder: linear
preencoder_conf:
    input_size: 1024  # Note: If the upstream is changed, please change this value accordingly.
    output_size: 80
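With `ctc_weight: 0.3` and `interctc_weight: 0.5` in `model_conf` above, training interpolates the attention loss with a blend of the final CTC loss and the intermediate (self-conditioning) CTC losses from the layers in `interctc_layer_idx`. A sketch of the combination, based on espnet2's hybrid CTC/attention model; the exact form and the loss values below are illustrative assumptions:

```python
def combined_loss(loss_att, loss_ctc, loss_interctc,
                  ctc_weight=0.3, interctc_weight=0.5):
    """Hybrid CTC/attention loss with intermediate CTC self-conditioning.

    loss_interctc stands for the mean CTC loss over the layers listed in
    interctc_layer_idx ([3, 6, 9] in the config above).
    """
    # Blend the final-layer CTC loss with the intermediate-layer CTC losses.
    loss_ctc = (1 - interctc_weight) * loss_ctc + interctc_weight * loss_interctc
    # Standard hybrid CTC/attention interpolation.
    return ctc_weight * loss_ctc + (1 - ctc_weight) * loss_att

# Placeholder loss values, just to show the arithmetic:
print(combined_loss(loss_att=2.0, loss_ctc=4.0, loss_interctc=6.0))
```

When `interctc_weight` is 0, this reduces to the usual hybrid CTC/attention objective.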
