Roboy Soncreo (from Lat. sonus, "sound", and creō, "I create, make, produce") is a library for speech generation based on deep learning models.
A PyTorch implementation that combines Tacotron2 and NV-Wavenet to provide audio synthesis from text. Interfacing via ROS2 is also intended (not implemented yet).
- NVIDIA GPU + CUDA cuDNN
- Pytorch 1.0
- Clone this repo:
git clone https://github.com/Roboy/soncreo
- Initialize submodules:
git submodule init; git submodule update
- Download and extract the LJ Speech dataset
cd nv-wavenet/pytorch
- Update the Makefile with the appropriate ARCH=sm_70. Find your ARCH here: https://developer.nvidia.com/cuda-gpus. For example, NVIDIA Titan V has 7.0 compute capability; therefore, its correct ARCH parameter is sm_70.
- Build nv-wavenet and C-wrapper:
make
- Install the PyTorch extension:
python build.py install
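The ARCH value used in the Makefile step above is the compute capability with the dot removed and an sm_ prefix. A minimal sketch of that mapping follows; the helper name arch_flag is hypothetical and not part of this repo. On a machine with PyTorch and a CUDA device, torch.cuda.get_device_capability() returns the (major, minor) tuple it expects.

```python
def arch_flag(major: int, minor: int) -> str:
    """Map a CUDA compute capability to the Makefile ARCH value.

    Hypothetical helper for illustration; not part of this repo.
    """
    if major < 0 or not 0 <= minor <= 9:
        raise ValueError(f"unexpected compute capability {major}.{minor}")
    return f"sm_{major}{minor}"


if __name__ == "__main__":
    # NVIDIA Titan V has compute capability 7.0.
    print("ARCH=" + arch_flag(7, 0))
    # With PyTorch and a CUDA device available, the tuple can be queried:
    #   import torch
    #   print("ARCH=" + arch_flag(*torch.cuda.get_device_capability()))
```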
- Go into the tacotron2 directory and update the .wav paths:
cd tacotron2
sed -i -- 's,DUMMY,ljs_dataset_folder/wavs,g' filelists/*.txt
- cd into parent Soncreo directory
cd ..
python interface.py --output_directory=output --log_directory=logdir
- (OPTIONAL) Monitor training with TensorBoard:
tensorboard --logdir=output/logdir
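The .wav path update in the steps above replaces the DUMMY placeholder in every filelist. Where sed -i is unavailable or behaves differently (for example on macOS), the same substitution can be sketched in Python. The function name update_wav_paths is hypothetical, and the sketch assumes the filelists are UTF-8 text files.

```python
from pathlib import Path


def update_wav_paths(filelist_dir: str, wav_dir: str, placeholder: str = "DUMMY") -> int:
    """Replace the placeholder in every .txt filelist; return the number of files changed.

    Hypothetical helper for illustration; not part of this repo.
    """
    changed = 0
    for path in sorted(Path(filelist_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        updated = text.replace(placeholder, wav_dir)  # all occurrences, like sed's g flag
        if updated != text:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed


if __name__ == "__main__":
    # Run from inside the tacotron2 directory, like the sed command.
    n = update_wav_paths("filelists", "ljs_dataset_folder/wavs")
    print(f"updated {n} filelist(s)")
```

Running it a second time changes nothing, since no placeholder remains.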
Make a list of the file names to use for training/testing
ls ljs_dataset_folder/*.wav | tail -n+11 > train_files.txt
ls ljs_dataset_folder/*.wav | head -n10 > test_files.txt
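The file lists above can also be produced in Python, which makes it easy to check that no file lands in both lists. This is a minimal sketch: the function names split_wavs and write_list are hypothetical, and it assumes the .wav files sit directly in ljs_dataset_folder. It uses plain string sorting, which may order names differently from ls under some locales.

```python
from pathlib import Path


def split_wavs(wav_dir: str, n_test: int = 10):
    """Return (train, test) lists of .wav paths with no file in both lists."""
    wavs = sorted(str(p) for p in Path(wav_dir).glob("*.wav"))
    # The first n_test files are held out for testing, the rest are for training.
    return wavs[n_test:], wavs[:n_test]


def write_list(paths, out_file: str) -> None:
    """Write one path per line."""
    Path(out_file).write_text("".join(p + "\n" for p in paths), encoding="utf-8")


if __name__ == "__main__":
    train, test = split_wavs("ljs_dataset_folder")
    write_list(train, "train_files.txt")
    write_list(test, "test_files.txt")
    print(f"{len(train)} training files, {len(test)} test files")
```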
Train the model
python interface_wavenet.py -c nv-wavenet/pytorch/config.json
python combine.py --default=False --text='Write your text here' --checkpoint_tac='checkpoints/tac' --checkpoint_wav='checkpoints/wav' --batch=1 --output_directory='./output' --implementation="persistent"
- Download pretrained models here
- Create a folder named checkpoints and copy the tacotron2 and wavenet pretrained models into it:
mkdir checkpoints
- Create a folder called outputs (used to save the produced wav file):
mkdir outputs
- Run the following command:
python combine.py --default=True --text="Write your text here"
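When synthesizing several sentences, text containing apostrophes or quotes is easy to mis-quote on the command line. One way around this is to build the argument list in Python and pass it on without a shell. The sketch below only uses the --default and --text flags shown above; the function name synthesis_command and the example sentences are hypothetical.

```python
import shlex
import sys


def synthesis_command(text: str) -> list:
    """Build the argument list for combine.py with the default pretrained models.

    Hypothetical helper for illustration; not part of this repo.
    """
    return [sys.executable, "combine.py", "--default=True", f"--text={text}"]


if __name__ == "__main__":
    for sentence in ["Hello, I am Roboy.", "It's nice to meet you."]:
        cmd = synthesis_command(sentence)
        # shlex.join shows the shell-quoted form of the command;
        # pass cmd to subprocess.run(cmd, check=True) to actually execute it.
        print(shlex.join(cmd))
```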
This repo contains a ROS2 server (rospy client library) that allows a ROS2 node to communicate with it.
- Start the ROS service:
python3 TTS_srv.py
- Call the service via a client (a simple example client for Roboy is Pyroboy).