A wav2letter model that outputs the hiragana or katakana transcript of a Japanese-language utterance.
The KORE and Tatoeba datasets are a simple choice for training and testing the model. They are usually used to create material for learning Japanese, but since Japanese voice recordings paired with transcripts are difficult to obtain, they make a good starting point for Japanese speech recognition.
Here is a list of datasets you may consider using.
Python 3.7+ and PyTorch
sudo apt install ffmpeg
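ffmpeg is used to convert the source recordings into a uniform format before feature extraction. A minimal sketch of that conversion, driven from Python via `subprocess`; the 16 kHz mono WAV target is an assumption (wav2letter-style pipelines commonly use it), so check what `prepare_data.py` actually expects:

```python
import os
import shutil
import subprocess

def ffmpeg_to_wav_cmd(src, dst, sample_rate=16000):
    """Build an ffmpeg command converting `src` to a mono WAV at `sample_rate` Hz.

    The 16 kHz mono target is an assumption, not taken from this repo.
    """
    return [
        "ffmpeg", "-y",           # overwrite the output file without asking
        "-i", src,                # input file (e.g. an .mp3 from the Tatoeba dump)
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the target rate
        dst,
    ]

cmd = ffmpeg_to_wav_cmd("clip.mp3", "clip.wav")
# Only run the conversion when ffmpeg and the input file are actually present.
if shutil.which("ffmpeg") and os.path.exists("clip.mp3"):
    subprocess.run(cmd, check=True)
```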
python prepare_data.py -d [DATA_DIR] -r kore_words.csv -a kore-sound-vocab-munged
python prepare_data.py -d [DATA_DIR] -r kore_sentences.csv -a kore-sound-sentences-munged
python prepare_data.py -d [DATA_DIR] -r tatoeba.csv -a tatoeba_audio
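Since the model emits hiragana or katakana, a preparation step typically normalizes transcripts to a single kana script. A minimal sketch of katakana-to-hiragana folding; whether `prepare_data.py` performs this exact step is an assumption, but the Unicode mapping itself is standard (katakana U+30A1..U+30F6 sits exactly 0x60 above hiragana U+3041..U+3096):

```python
def katakana_to_hiragana(text: str) -> str:
    """Map katakana codepoints onto their hiragana counterparts.

    The two scripts are parallel Unicode blocks offset by 0x60, so a fixed
    subtraction maps one onto the other; all other characters pass through.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if 0x30A1 <= code <= 0x30F6:  # katakana small-a .. small-ke
            out.append(chr(code - 0x60))
        else:
            out.append(ch)
    return "".join(out)

print(katakana_to_hiragana("カタカナ"))  # -> かたかな
```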
python train.py -s1 [DATA_DIR1] -r1 .7
python train.py -s1 [DATA_DIR1] -r1 1.0 -s2 [DATA_DIR2] -r2 .7 -a 280000 -t 39
python train.py -s1 [DATA_DIR1] -r1 1.0 -s2 [DATA_DIR2] -r2 .7 -a 120000 -t 23
python train.py -s1 [DATA_DIR1] -r1 1.0 -s2 [DATA_DIR2] -r2 .9 -s3 [DATA_DIR3] -r3 .3 -a 280000 -t 39
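The `-a` and `-t` flags in the commands above appear to cap audio length and transcript length per example; reading `-a` as a sample count and `-t` as a character count is an assumption, so check `train.py` for the actual semantics. A minimal sketch of that kind of length filtering:

```python
def filter_examples(examples, max_audio_len, max_transcript_len):
    """Keep only examples whose audio and transcript fit the given limits.

    `examples` is a list of (num_audio_samples, transcript) pairs. Treating
    -a as a sample count and -t as a character count is an assumption based
    on the command lines above, not on the repo's source.
    """
    return [
        (n, t) for n, t in examples
        if n <= max_audio_len and len(t) <= max_transcript_len
    ]

data = [
    (120_000, "こんにちは"),            # short clip, short transcript: kept
    (300_000, "とてもながいおんせいです"),  # exceeds the 280 000-sample cap: dropped
]
kept = filter_examples(data, max_audio_len=280_000, max_transcript_len=39)
print(len(kept))  # -> 1
```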