- Python == 3.8
- Clone this repository.
- Install the Python requirements; see requirements.txt.
- Download a 48 kHz dataset, such as Genshin or VCTK.
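If a dataset is not already at 48 kHz, it can be resampled first. A minimal sketch using sox (not part of this repository; the directory names are placeholders):

```
# Resample every wav in raw_wavs/ to 48 kHz (requires sox; paths are placeholders).
mkdir -p wav48
for f in raw_wavs/*.wav; do
  sox "$f" -r 48000 "wav48/$(basename "$f")"
done
```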
```
CUDA_VISIBLE_DEVICES="3" python train.py \
  --config config_v2_16k_to_48k.json \
  --input_wavs_dir VCTK-Corpus/wav48/,genshin \
  --checkpoint_path exp/v2_16k_to_48k/
```
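If this fork follows jik876's hifi-gan training script, TensorBoard summaries are written under the checkpoint directory; assuming a `logs` subdirectory (an assumption, not verified here):

```
# Monitor training curves; the "logs" subdirectory is assumed from jik876's hifi-gan.
tensorboard --logdir exp/v2_16k_to_48k/logs
```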
To train 24k_to_48k, replace config_v2_16k_to_48k.json with config_v2_24k_to_48k.json, as in the example below. Checkpoints and a copy of the configuration file are saved in the `checkpoint_path` directory by default. You can change the path with the `--checkpoint_path` option.
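For example, keeping the same 48 kHz training data (the checkpoint directory name here is only a suggestion):

```
# Same data, different config; exp/v2_24k_to_48k/ is a placeholder path.
CUDA_VISIBLE_DEVICES="3" python train.py \
  --config config_v2_24k_to_48k.json \
  --input_wavs_dir VCTK-Corpus/wav48/,genshin \
  --checkpoint_path exp/v2_24k_to_48k/
```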
- HiFi-GAN training may be a bit slow. You can use HiFTNet-sr to accelerate training, and its generated results may sound better.
- The provided pretrained model is at `exp/v2_16k_to_48k/g_00120000`, trained on StarRail_Datasets and VCTK.
- Since I don't have GPU resources, a kind contributor (@Lucy) trained the config_v2_16k_to_48k version and stopped at 120k steps, which is not enough training. The results may therefore sound a little electronic, but they verify that the method is valid.
- You can train with other HiFi-GAN config versions, or use HiFTNet-sr.
- Make a `test_files` directory and copy wav files into it.
- Run the following command.
```
# python inference.py --checkpoint_file [generator checkpoint file path]
CUDA_VISIBLE_DEVICES="-1" python inference.py \
  --checkpoint_file exp/v1_16k_to_48k/g_00120000 \
  --input_wavs_dir LJSpeech-BZNSYP-16k
```
Generated wav files are saved in the `generated_files` directory by default. You can change the path with the `--output_dir` option.
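To confirm the outputs are actually 48 kHz, you can inspect a generated file's sample rate; a sketch using soxi from the sox package (not part of this repository; the filename is a placeholder):

```
# Should print 48000 for a correctly upsampled file.
soxi -r generated_files/example.wav
```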
Our repository is heavily based on jik876's hifi-gan.