Improve Readme, make consolidation to Flashlight clearer (#929)
Summary: Some people still seem to think wav2letter is being actively maintained; make this clearer and update the readme.

Pull Request resolved: #929

Test Plan: visual inspection

Reviewed By: xuqiantong

Differential Revision: D25715232

Pulled By: jacobkahn

fbshipit-source-id: a406ced626d67132476751c85bde4e94c45c50be
1 parent 9747f42, commit 64e54f8

Showing 1 changed file with 19 additions and 15 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,38 +3,42 @@ | |
[![CircleCI](https://circleci.com/gh/facebookresearch/wav2letter.svg?style=svg)](https://circleci.com/gh/facebookresearch/wav2letter) | ||
[![Join the chat at https://gitter.im/wav2letter/community](https://badges.gitter.im/wav2letter/community.svg)](https://gitter.im/wav2letter/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) | ||
|
||
wav2letter++ is a [highly efficient](https://arxiv.org/abs/1812.07625) end-to-end automatic speech recognition (ASR) toolkit written entirely in C++, leveraging [ArrayFire](https://github.com/arrayfire/arrayfire) and [flashlight](https://github.com/facebookresearch/flashlight). | ||
## Important Note: | ||
### wav2letter has been moved and consolidated [into Flashlight](https://github.com/facebookresearch/flashlight) in the [ASR application](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr). | ||
|
||
The toolkit started from models predicting letters directly from the raw waveform and has since evolved into an all-purpose end-to-end ASR research toolkit, supporting a wide range of models and learning techniques. It also includes a very efficient modular beam-search decoder, for both structured learning (CTC, ASG) and seq2seq approaches. | ||
Future wav2letter development will occur in Flashlight. | ||
|
||
**Important disclaimer**: as a number of models from this repository could be used for other modalities, we moved most of the code to flashlight. | ||
*To build the old, pre-consolidation version of wav2letter*, check out the [wav2letter v0.2](https://github.com/facebookresearch/wav2letter/releases/tag/v0.2) release, which depends on the old [Flashlight v0.2](https://github.com/facebookresearch/flashlight/releases/tag/v0.2) release. The [`wav2letter-lua`](https://github.com/facebookresearch/wav2letter/tree/wav2letter-lua) project can be found on the `wav2letter-lua` branch. | ||
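As a rough sketch of fetching the pre-consolidation sources, the snippet below assembles a `git clone` command pinned to the `v0.2` tag. The repository URL and tag name come from the release link above; the shallow-clone flags and the target directory name `wav2letter-v0.2` are assumptions made for illustration, not part of the README.

```shell
# Repository URL and tag name are taken from the release link above.
REPO_URL="https://github.com/facebookresearch/wav2letter.git"
TAG="v0.2"

# Assumption: a shallow clone of just the tagged release is enough.
# The target directory name is hypothetical; pick any name you like.
CLONE_CMD="git clone --depth 1 --branch $TAG $REPO_URL wav2letter-$TAG"

# Print the command so it can be inspected before it is run.
echo "$CLONE_CMD"
```

Running the printed command should leave a detached checkout of the tagged release. The matching Flashlight v0.2 release would need to be fetched and installed separately.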
|
||
For more information on wav2letter++, see or cite [this arXiv paper](https://arxiv.org/abs/1812.07625). | ||
|
||
## Recipes | ||
This repository includes recipes to reproduce the following research papers as well as **pre-trained** models: | ||
- [NEW] [Pratap et al. (2020): Scaling Online Speech Recognition Using ConvNets](recipes/streaming_convnets/) | ||
- [NEW SOTA] [Synnaeve et al. (2020): End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures](recipes/sota/2019) | ||
- [Pratap et al. (2020): Scaling Online Speech Recognition Using ConvNets](recipes/streaming_convnets/) | ||
- [Synnaeve et al. (2020): End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures](recipes/sota/2019) | ||
- [Kahn et al. (2020): Self-Training for End-to-End Speech Recognition](recipes/self_training) | ||
- [Likhomanenko et al. (2019): Who Needs Words? Lexicon-free Speech Recognition](recipes/lexicon_free/) | ||
- [Hannun et al. (2019): Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions](recipes/seq2seq_tds/) | ||
|
||
Data preparation for our training and evaluation can be found in the [data](data) folder. | ||
Data preparation for training and evaluation can be found in the [data](data) directory. | ||
|
||
The previous iteration of wav2letter can be found in the: | ||
- (before merging codebases for wav2letter and flashlight) [wav2letter-v0.2](https://github.com/facebookresearch/wav2letter/tree/v0.2) branch. | ||
- (written in Lua) [`wav2letter-lua`](https://github.com/facebookresearch/wav2letter/tree/wav2letter-lua) branch. | ||
### Building the Recipes | ||
|
||
## Build recipes | ||
First, install [flashlight](https://github.com/facebookresearch/flashlight) with all its dependencies. Then | ||
First, install [Flashlight](https://github.com/facebookresearch/flashlight) with the [ASR application](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr). Then, after cloning the project source: | ||
```shell | ||
mkdir build && cd build | ||
cmake .. && make -j8 | ||
``` | ||
mkdir build && cd build && cmake .. && make -j8 | ||
If Flashlight or ArrayFire are installed in nonstandard paths via a custom `CMAKE_INSTALL_PREFIX`, they can be found by passing | ||
```shell | ||
-Dflashlight_DIR=[PREFIX]/usr/share/flashlight/cmake/ -DArrayFire_DIR=[PREFIX]/usr/share/ArrayFire/cmake | ||
``` | ||
If flashlight or ArrayFire are installed in nonstandard paths via `CMAKE_INSTALL_PREFIX`, they can be found by passing `-Dflashlight_DIR=[PREFIX]/usr/share/flashlight/cmake/ -DArrayFire_DIR=[PREFIX]/usr/share/ArrayFire/cmake` when running `cmake`. | ||
when running `cmake`. | ||
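To make the nonstandard-path case concrete, here is a minimal sketch that assembles the full configure command from a single prefix. The directory layout under the prefix follows the flags quoted above; the default prefix `$HOME/opt` is purely hypothetical and should be replaced with the `CMAKE_INSTALL_PREFIX` you actually used.

```shell
# PREFIX is a hypothetical install prefix; substitute your own CMAKE_INSTALL_PREFIX.
PREFIX="${PREFIX:-$HOME/opt}"

# Package config directories, following the layout given in the README text.
FLASHLIGHT_CMAKE_DIR="$PREFIX/usr/share/flashlight/cmake/"
ARRAYFIRE_CMAKE_DIR="$PREFIX/usr/share/ArrayFire/cmake"

# Assemble the configure command; it is meant to be run from inside build/.
CMAKE_CMD="cmake .. -Dflashlight_DIR=$FLASHLIGHT_CMAKE_DIR -DArrayFire_DIR=$ARRAYFIRE_CMAKE_DIR"

# Print the command so the resolved paths can be checked before running it.
echo "$CMAKE_CMD"
```

Printing the command first makes it easy to confirm that both directories exist before configuring. If either path is wrong, CMake will typically report that it cannot find the corresponding package configuration file.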
|
||
## Join the wav2letter community | ||
* Facebook page: https://www.facebook.com/groups/717232008481207/ | ||
* Google group: https://groups.google.com/forum/#!forum/wav2letter-users | ||
* Contact: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected] | ||
|
||
See the [CONTRIBUTING](CONTRIBUTING.md) file for how to help out. | ||
|
||
## License | ||
wav2letter++ is BSD-licensed, as found in the [LICENSE](LICENSE) file. |