A collection of Persian text-to-speech models built with various implementations and techniques.

Adibian/Persian-TTS-Zoo

Persian-TTS-Zoo

Welcome to Persian-TTS-Zoo, a collection of my Persian text-to-speech (TTS) models, each adapted from a popular architecture to work with the Persian language. This repository links to the individual models, each with its own features and capabilities, all developed and trained by me.

Persian-Tacotron2

  • Description: Tacotron2 is an end-to-end TTS model that synthesizes natural-sounding speech from text. It has been adapted for Persian, maintaining the smooth, natural prosody of the original while handling unique Persian phonemes.
  • Features: Provides single-speaker synthesis with high-quality and expressive speech, trained on Persian datasets for natural intonation.
  • Repository: Persian-Tacotron2 Repository

Persian-MultiSpeaker-Tacotron2

  • Description: An extension of Tacotron2 for multi-speaker TTS, trained with a diverse Persian speaker dataset, allowing the model to generate voices with varying speaker characteristics.
  • Features: Multi-speaker synthesis with speaker embedding support for customizable voice generation.
  • Repository: Persian-MultiSpeaker-Tacotron2 Repository
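One common way to add speaker embedding support is to look up a per-speaker vector and attach it to every encoder frame before decoding. The sketch below illustrates that idea in plain Python. The function name, the speaker IDs, and the concatenation scheme are illustrative assumptions, not code from the linked repository, which may condition on speakers differently (for example by addition, or with vectors from a separate speaker encoder).

```python
def condition_on_speaker(encoder_frames, speaker_table, speaker_id):
    """Append the chosen speaker's embedding to every encoder frame."""
    if speaker_id not in speaker_table:
        raise KeyError(f"unknown speaker: {speaker_id}")
    speaker_vec = speaker_table[speaker_id]
    # List concatenation: each frame grows by len(speaker_vec) dimensions.
    return [frame + speaker_vec for frame in encoder_frames]


# Toy data (hypothetical): two 2-dimensional frames, two 1-dimensional speakers.
frames = [[0.1, 0.2], [0.3, 0.4]]
speakers = {"spk_a": [1.0], "spk_b": [2.0]}
conditioned = condition_on_speaker(frames, speakers, "spk_b")
```

Switching `speaker_id` changes only the appended vector, which is what lets a single trained model produce several voices.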

Persian-FastSpeech2

  • Description: FastSpeech2 is a non-autoregressive TTS model that generates speech faster and more robustly than autoregressive models such as Tacotron2. This model has been adapted for Persian to produce high-quality TTS output with faster inference times.
  • Features: Faster, stable synthesis suitable for real-time applications, with support for multi-speaker and single-speaker configurations.
  • Repository: Persian-FastSpeech2 Repository
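FastSpeech-style models get much of their speed from predicting a duration for each phoneme and expanding the phoneme sequence to frame length in one step, rather than decoding frame by frame. The sketch below shows that expansion step (often called a length regulator) in plain Python. The function name and toy values are illustrative assumptions and are not taken from the linked repository.

```python
def length_regulate(phoneme_states, durations):
    """Repeat each phoneme-level state by its predicted number of frames."""
    if len(phoneme_states) != len(durations):
        raise ValueError("one duration per phoneme is required")
    frames = []
    for state, n_frames in zip(phoneme_states, durations):
        # Negative predictions are clamped to zero frames.
        frames.extend([state] * max(0, int(n_frames)))
    return frames


# Toy example: three phoneme states expanded to 2 + 0 + 3 = 5 frames.
expanded = length_regulate(["s", "a", "l"], [2, 0, 3])
```

Scaling all durations by a constant before the expansion is the usual way such models expose speaking-rate control.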

Persian-FastPitch

  • Description: FastPitch is a pitch-controllable variant of FastSpeech. This version is adapted for Persian, allowing precise control over intonation, pitch, and speaking style.
  • Features: Pitch control, improved stability, and high-quality synthesis with efficient inference.
  • Repository: Persian-FastPitch Repository
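Pitch control in FastPitch-style models generally works by editing the predicted per-frame pitch contour before the spectrogram is generated. The sketch below shifts a contour by a number of semitones. It assumes the contour is given in Hz with 0 marking unvoiced frames; the linked repository may represent pitch differently (for example as normalized values), and the function name is hypothetical.

```python
def shift_pitch(f0_hz, semitones):
    """Shift voiced frames by the given number of semitones; keep unvoiced frames at 0."""
    factor = 2.0 ** (semitones / 12.0)
    return [f * factor if f > 0 else 0.0 for f in f0_hz]


# Toy contour (hypothetical values): one unvoiced frame followed by two voiced frames.
contour = [0.0, 110.0, 220.0]
raised = shift_pitch(contour, 12)  # 12 semitones is one octave, so voiced values double
```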

Persian-AdaSpeech2

  • Description: AdaSpeech2 is an adaptive TTS model that customizes speech synthesis for different speaker styles and prosody. This version has been tailored for the Persian language, focusing on adaptability and nuanced speech generation.
  • Features: Flexible voice style adaptation, suited for generating unique voices or imitating particular speech patterns.
  • Repository: Persian-AdaSpeech Repository

Persian-GST-Tacotron

  • Description: GST-Tacotron is a Tacotron variant with Global Style Tokens (GST), which allow style control in speech generation. Persian-GST-Tacotron enables different speaking styles within Persian, such as formal and casual tones.
  • Features: Style control and expressive speech generation, with support for single- and multi-speaker synthesis.
  • Repository: Persian-GST-Tacotron Repository
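The core idea behind Global Style Tokens is that a style embedding is a weighted combination of a small bank of learned token vectors, so changing the weights changes the speaking style. The sketch below shows a simplified single-head version in plain Python. The token values, scores, and function name are illustrative assumptions; a real model learns the tokens and derives the scores with an attention module, and this is not code from the linked repository.

```python
import math


def style_embedding(tokens, scores):
    """Combine style token vectors using softmax-normalized scores."""
    if len(tokens) != len(scores):
        raise ValueError("one score per style token is required")
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(tokens[0])
    return [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]


# Toy token bank (hypothetical): two 2-dimensional style tokens.
tokens = [[1.0, 0.0], [0.0, 1.0]]
blended = style_embedding(tokens, [0.0, 0.0])  # equal scores give equal weights
```

At inference time the weights can also be set by hand, which is how a style such as a formal or casual tone can be selected without a reference recording.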

Persian-HiFiGAN

  • Description: HiFiGAN is a GAN-based vocoder for high-quality waveform generation from mel-spectrograms. The Persian-HiFiGAN version is trained to convert Persian TTS spectrograms into realistic audio.
  • Features: High-quality, natural-sounding audio output from spectrograms; supports single and multi-speaker TTS.
  • Repositories:

License

This repository links to various open-source projects; check individual repositories for their respective licenses.
