A simple distributed version of the famous TTS model Tacotron for scalability on the cloud
This project splits a pretrained Tacotron model into four parts: Encoder, Decoder, Postnet and Vocoder. Each part runs as its own service, and an orchestrator coordinates them to synthesize speech from input text. Because the parts are independent, load balancing can be applied per part, giving a scalable system that needs fewer resources than scaling the entire model as a single unit.
You can build the images by running the following command for each part of the model from the root of the project:
docker build -f Dockerfiles/[model-part].Dockerfile -t [your_username]/[model-part]:latest .
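As a sketch, assuming the four parts are named encoder, decoder, postnet and vocoder and that the Dockerfile names follow the same pattern (adjust both to whatever actually exists in the Dockerfiles directory), all four images can be built in one loop:

# Assumption: part names and Dockerfile names follow this pattern; adjust as needed.
for part in encoder decoder postnet vocoder; do
  docker build -f Dockerfiles/$part.Dockerfile -t your_username/$part:latest .
done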
It's also possible to start all the services with Docker Compose by running this command from the root of the repository:
docker compose up
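Since each part is a separate service, individual stages can be scaled independently, which is where the load balancing mentioned above comes in. As a sketch, assuming the compose file defines a service named decoder and does not pin it to a fixed host port, the following starts two decoder replicas in the background and then lists the services and their status:

docker compose up -d --scale decoder=2
docker compose ps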
To generate speech from input text, assuming all the services are up and running (with the orchestrator listening on port 8080), send the following request:
curl -X POST http://localhost:8080/process -H "Content-Type: application/json" -d '{"text": "Only three stars are born in the Milky Way each year"}' -o result.wav
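The synthesized audio is written to result.wav. If you have ffmpeg installed, you can listen to it directly, for example:

ffplay -autoexit result.wav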