Amphion Visualization Recipe

Quick Start

We provide a beginner recipe that demonstrates how to implement interactive visualization for classic audio, music, and speech generative models. It is also the official implementation of the paper "SingVisio: Visual Analytics of the Diffusion Model for Singing Voice Conversion", which can be accessed via arXiv or Computers & Graphics. SingVisio can be experienced here.

Supported Models

Visualization is a unique feature of Amphion. It provides interactive visual analysis of some classic models for educational purposes, helping newcomers understand their inner workings.

So far, Amphion supports visualization tools for the following models:

  • SVC:
  • TTS:
    • FastSpeech 2 (👨‍💻 developing): A typical transformer-based TTS model.
    • VITS (👨‍💻 developing): A typical flow-based end-to-end TTS model.