rasr release: add decoding results
Summary:
- add links
- add decoding results (except swb for now)
- add other decoder params

Reviewed By: xuqiantong

Differential Revision: D25700435

fbshipit-source-id: 0ebd104eae8c4bd1d773e04dfaf9a6ab46901ad7
Tatiana Likhomanenko authored and facebook-github-bot committed Dec 24, 2020
1 parent 2f86068 commit 9747f42
Showing 1 changed file with 16 additions and 15 deletions.
31 changes: 16 additions & 15 deletions recipes/rasr/README.md
@@ -5,13 +5,14 @@ This is a repository sharing pre-trained acoustic models and language models for

## Dependencies

* [flashlight](https://github.com/facebookresearch/flashlight)
* [`Flashlight`](https://github.com/facebookresearch/flashlight)
* [`Flashlight` ASR app](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr)

## Models

### Acoustic Model

All the acoustic models are retrained using flashlight, into which [wav2letter++](https://github.com/facebookresearch/wav2letter) has been consolidated. `Tedlium` is not used as training data here due to licensing issues. All the training data uses the more standard 16kHz sample rate rather than the 8kHz used in the paper.
All the acoustic models are retrained using `Flashlight`, into which [wav2letter++](https://github.com/facebookresearch/wav2letter) has been consolidated. `Tedlium` is not used as training data here due to licensing issues. All the training data uses the more standard 16kHz sample rate rather than the 8kHz used in the paper.

Here, we are releasing models with different architectures and sizes. Note that the models may not fully reproduce the results in the paper because of discrepancies in both the data and the toolkit implementation.

@@ -44,32 +45,32 @@ The perplexities of the LMs on different development sets are listed below.

### WER

Here we summarize the decoding WER for all released models. All the numbers in the table are in the format `viterbi WER -> beam search WER`.
Here we summarize the decoding WER for all released models. All the numbers in the table are in the format `viterbi WER -> beam search WER (small beam/large beam)`.

|Architecture |# Param |nov92 |TL-test |CV-test |LS-test-clean |LS-test-other |Hub05-SWB |Hub05-CH |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|Transformer |300 mil |3.4 → 2.9 |7.6 → 5.5 |15.5 → 11.6 |3.0 → 3.2 |7.2 → 6.4 |6.8 |11.6 |
|Transformer |70 mil |4.5 |9.4 |19.8 |4 |9.7 |7.5 |13 |
|Conformer |300 mil |3.5 |8.4 |17 |3.2 |8 |7 |11.9 |
|Conformer |87 mil |4.3 |8.7 |18.2 |3.7 |8.6 |7.3 |12.2 |
|Conformer |28 mil |5 |10.5 |22.2 |4.7 |11.1 |8.8 |13.7 |
|Transformer |300 mil |3.4 → 2.9/2.9 |7.6 → 5.5/5.4 |15.5 → 11.6/11.2 |3.0 → 3.2/3.2 |7.2 → 6.4/6.4 |6.8 → 6.2/6.2 |11.6 → 10.8/10.7 |
|Transformer |70 mil |4.5 → 3.7/3.5 |9.4 → 6.2/6.1 |19.8 → 13.8/13.0 |4 → 3.6/3.6 |9.7 → 7.7/7.5 |7.5 → 6.6/6.5 |13 → 11.8/11.7 |
|Conformer |300 mil |3.5 → 3.3/3.3 |8.4 → 6.2/6.0 |17 → 12.7/12.0 |3.2 → 3.4/3.4 |8 → 7/6.8 |7 → 6.4/6.5 |11.9 → 10.7/10.5 |
|Conformer |87 mil |4.3 → 3.3/3.3 |8.7 → 6.1/5.9 |18.2 → 13.1/12.4 |3.7 → 3.5/3.5 |8.6 → 7.4/7.2 |7.3 → 6.7/6.7 |12.2 → 11.7/11.5 |
|Conformer |28 mil |5 → 3.9/3.8 |10.5 → 6.9/6.6 |22.2 → 15.4/14.4 |4.7 → 4/3.9 |11.1 → 8.9/8.6 |8.8 → 7.8/7.7 |13.7 → 12.4/12.2 |
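
To compare rows, it can help to turn a cell such as `7.6 → 5.5/5.4` into numbers and compute the relative WER change from Viterbi to beam-search decoding. The snippet below is a minimal sketch, not part of the release: the helper names `parse_wer_cell` and `relative_reduction` are hypothetical, and it assumes every cell has the three-number `viterbi → small/large` shape.

```python
import re

# Matches "viterbi → small/large"; accepts "→" or "->" and tolerates
# uneven spacing around the arrow and the slash.
CELL_PATTERN = re.compile(
    r"^\s*([\d.]+)\s*(?:→|->)\s*([\d.]+)\s*/\s*([\d.]+)\s*$"
)


def parse_wer_cell(cell):
    """Return (viterbi, small_beam, large_beam) WER values from one table cell."""
    match = CELL_PATTERN.match(cell)
    if match is None:
        raise ValueError(f"unrecognized WER cell: {cell!r}")
    viterbi, small_beam, large_beam = (float(g) for g in match.groups())
    return viterbi, small_beam, large_beam


def relative_reduction(baseline, improved):
    """Relative WER reduction in percent; negative means WER got worse."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    # Transformer 300 mil on TL-test, copied from the table.
    viterbi, small_beam, large_beam = parse_wer_cell("7.6 → 5.5/5.4")
    print(f"small beam: {relative_reduction(viterbi, small_beam):.1f}% relative")
    print(f"large beam: {relative_reduction(viterbi, large_beam):.1f}% relative")
```

A negative result flags cells where beam search is worse than Viterbi, as in the `3.0 → 3.2/3.2` LS-test-clean entry.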

Decoding is done with a lexicon-based beam-search decoder using the 200k Common Crawl lexicon and the small Common Crawl LM.
* [tokens](https://dl.fbaipublicfiles.com/wav2letter/rasr/tutorial/tokens.txt)
* [inference lexicon](https://dl.fbaipublicfiles.com/wav2letter/rasr/tutorial/lexicon.txt)
* Decoding parameters:
* Decoding parameters (`beamthreshold=100, beamsizetoken=30`):

|Architecture |# Param |LM Weight |Word Score |Beam Size |
| :---: | :---: | :---: | :---: | :---: |
|Transformer |300 mil |1.5 |0 |50 |
|Transformer |70 mil | | | |
|Conformer |300 mil | | | |
|Conformer |87 mil | | | |
|Conformer |28 mil |2 |0 |50 |
|Transformer |300 mil |1.5 |0 |50/500 |
|Transformer |70 mil |1.7 |0 |50/500 |
|Conformer |300 mil |1.8 |2 |50/500 |
|Conformer |87 mil |2 |0 |50/500 |
|Conformer |28 mil |2 |0 |50/500 |
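
The per-model parameters above can be collected into a small lookup and rendered as a flag string. This is only a sketch under stated assumptions: `beamthreshold` and `beamsizetoken` are the names used in this README, while `lmweight`, `wordscore`, and `beamsize` are assumed flag names for the table columns and should be checked against the `Flashlight` ASR decoder documentation. The mapping of `50/500` to the small and large beam is inferred from the WER table format, and `decoder_flags` is a hypothetical helper.

```python
# Values shared by all models, as stated in the README.
SHARED_FLAGS = {"beamthreshold": 100, "beamsizetoken": 30}

# Assumed reading of "50/500": small beam / large beam.
BEAM_SIZES = {"small": 50, "large": 500}

# (architecture, size) -> (LM weight, word score), copied from the table.
MODEL_PARAMS = {
    ("Transformer", "300 mil"): (1.5, 0),
    ("Transformer", "70 mil"): (1.7, 0),
    ("Conformer", "300 mil"): (1.8, 2),
    ("Conformer", "87 mil"): (2, 0),
    ("Conformer", "28 mil"): (2, 0),
}


def decoder_flags(architecture, size, beam="small"):
    """Build a flag string for one model; flag names are assumptions."""
    lm_weight, word_score = MODEL_PARAMS[(architecture, size)]
    flags = {
        "lmweight": lm_weight,      # assumed flag name for "LM Weight"
        "wordscore": word_score,    # assumed flag name for "Word Score"
        "beamsize": BEAM_SIZES[beam],  # assumed flag name for "Beam Size"
        **SHARED_FLAGS,
    }
    return " ".join(f"--{name}={value}" for name, value in flags.items())


if __name__ == "__main__":
    print(decoder_flags("Conformer", "300 mil", beam="large"))
```

Keeping the table in one dictionary makes it easy to loop over all five models and both beam sizes when reproducing the WER numbers.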

## Tutorial

To simply serialize all the model and interact with them, please refer to the Flashlight tutorials as in [here](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr/tutorial).
To simply serialize all the models and interact with them, please refer to the [`Flashlight` ASR app tutorials](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr/tutorial).


