Skip to content

Commit

Permalink
Improve Readme, make consolidation to Flashlight clearer (#929)
Browse files Browse the repository at this point in the history
Summary:
Some people still seem to think wav2letter is being actively maintained — make this clearer and update the readme.

Pull Request resolved: #929

Test Plan: visual inspection

Reviewed By: xuqiantong

Differential Revision: D25715232

Pulled By: jacobkahn

fbshipit-source-id: a406ced626d67132476751c85bde4e94c45c50be
  • Loading branch information
jacobkahn authored and facebook-github-bot committed Dec 28, 2020
1 parent 9747f42 commit 64e54f8
Showing 1 changed file with 19 additions and 15 deletions.
34 changes: 19 additions & 15 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,38 +3,42 @@
[![CircleCI](https://circleci.com/gh/facebookresearch/wav2letter.svg?style=svg)](https://circleci.com/gh/facebookresearch/wav2letter)
[![Join the chat at https://gitter.im/wav2letter/community](https://badges.gitter.im/wav2letter/community.svg)](https://gitter.im/wav2letter/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

wav2letter++ is a [highly efficient](https://arxiv.org/abs/1812.07625) end-to-end automatic speech recognition (ASR) toolkit written entirely in C++, leveraging [ArrayFire](https://github.com/arrayfire/arrayfire) and [flashlight](https://github.com/facebookresearch/flashlight).
## Important Note:
### wav2letter has been moved and consolidated [into Flashlight](https://github.com/facebookresearch/flashlight) in the [ASR application](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr).

The toolkit started from models predicting letters directly from the raw waveform, and now evolved as an all-purpose end-to-end ASR research toolkit, supporting a wide range of models and learning techniques. It also embarks a very efficient modular beam-search decoder, for both structured learning (CTC, ASG) and seq2seq approaches.
Future wav2letter development will occur in Flashlight.

**Important disclaimer**: as a number of models from this repository could be used for other modalities, we moved most of the code to flashlight.
*To build the old, pre-consolidation version of wav2letter*, checkout the [wav2letter v0.2](https://github.com/facebookresearch/wav2letter/releases/tag/v0.2) release, which depends on the old [Flashlight v0.2](https://github.com/facebookresearch/flashlight/releases/tag/v0.2) release. The [`wav2letter-lua`](https://github.com/facebookresearch/wav2letter/tree/wav2letter-lua) project can be fonud on the `wav2letter-lua` branch, accordingly.

For more information on wav2letter++, see or cite [this arXiv paper](https://arxiv.org/abs/1812.07625).

## Recipes
This repository includes recipes to reproduce the following research papers as well as **pre-trained** models:
- [NEW] [Pratap et al. (2020): Scaling Online Speech Recognition Using ConvNets](recipes/streaming_convnets/)
- [NEW SOTA] [Synnaeve et al. (2020): End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures](recipes/sota/2019)
- [Pratap et al. (2020): Scaling Online Speech Recognition Using ConvNets](recipes/streaming_convnets/)
- [Synnaeve et al. (2020): End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures](recipes/sota/2019)
- [Kahn et al. (2020): Self-Training for End-to-End Speech Recognition](recipes/self_training)
- [Likhomanenko et al. (2019): Who Needs Words? Lexicon-free Speech Recognition](recipes/lexicon_free/)
- [Hannun et al. (2019): Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions](recipes/seq2seq_tds/)

Data preparation for our training and evaluation can be found in [data](data) folder.
Data preparation for training and evaluation can be found in [data](data) directory.

The previous iteration of wav2letter can be found in the:
- (before merging codebases for wav2letter and flashlight) [wav2letter-v0.2](https://github.com/facebookresearch/wav2letter/tree/v0.2) branch.
- (written in Lua) [`wav2letter-lua`](https://github.com/facebookresearch/wav2letter/tree/wav2letter-lua) branch.
### Building the Recipes

## Build recipes
First, isntall [flashlight](https://github.com/facebookresearch/flashlight) with all its dependencies. Then
First, install [Flashlight](https://github.com/facebookresearch/flashlight) with the [ASR application](https://github.com/facebookresearch/flashlight/tree/master/flashlight/app/asr). Then, after cloning the project source:
```shell
mkdir build && cd build
cmake .. && make -j8
```
mkdir build && cd build && cmake .. && make -j8
If Flashlight or ArrayFire are installed in nonstandard paths via a custom `CMAKE_INSTALL_PREFIX`, they can be found by passing
```shell
-Dflashlight_DIR=[PREFIX]/usr/share/flashlight/cmake/ -DArrayFire_DIR=[PREFIX]/usr/share/ArrayFire/cmake
```
If flashlight or ArrayFire are installed in nonstandard paths via `CMAKE_INSTALL_PREFIX`, they can be found by passing `-Dflashlight_DIR=[PREFIX]/usr/share/flashlight/cmake/ -DArrayFire_DIR=[PREFIX]/usr/share/ArrayFire/cmake` when running `cmake`.
when running `cmake`.

## Join the wav2letter community
* Facebook page: https://www.facebook.com/groups/717232008481207/
* Google group: https://groups.google.com/forum/#!forum/wav2letter-users
* Contact: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

See the [CONTRIBUTING](CONTRIBUTING.md) file for how to help out.

## License
wav2letter++ is BSD-licensed, as found in the [LICENSE](LICENSE) file.

0 comments on commit 64e54f8

Please sign in to comment.