How to combine melGAN with feature predictor like FastSpeech or tacotron2? #17

nikawool · 2020-02-25T06:46:06Z

FastSpeech: https://github.com/xcmyz/FastSpeech
How can I combine melGAN with feature predictor like FastSpeech or tacotron2?

Liujingxiu23 · 2020-04-20T01:54:41Z

Have you tried FastSpeech combined with MelGAN? How were the results?

Teravus · 2020-09-29T03:56:39Z

I've been playing with Tacotron2's inference notebook, but so far I only get noise.
I copied the mel2wav folder and my checkpoint log directory into the tacotron2 directory.
I ended up adding a section after the "Remove WaveGlow bias" section of the notebook:

vocoder = MelVocoder(path="logs/baseline14k/", model_name="best_netG")
recons = vocoder.inverse(mel_outputs.float()).squeeze().cpu().numpy()
ipd.Audio(recons, rate=22050)

I've also tried:

vocoder = MelVocoder(path="logs/baseline14k/", model_name="best_netG")
recons = vocoder.inverse(mel_outputs.float()).squeeze().cpu().numpy()

meldata = mel_outputs.float()
meldata.shape   # torch.Size([1, 80, 503])
rev_wav = vocoder.inverse(meldata.float())  # .squeeze().cpu().numpy()
rev_wav.shape   # torch.Size([1, 128768])
rev_wav.dtype   # torch.float32
rev_wav2 = rev_wav.cpu().numpy()
rev_wav2.shape  # (1, 128768)
ipd.Audio((rev_wav2.reshape((-1)) * 2**15).astype(np.int16), rate=22050)

Same results.
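
One thing that might be worth checking, offered as an assumption rather than a confirmed diagnosis: the two projects may not compute mel features the same way. As far as I can tell, NVIDIA's Tacotron2 compresses mel magnitudes with a natural log, while this repo's mel2wav code uses log10, so the vocoder would be fed values on a different scale than it was trained on. Please verify this against both code bases before relying on it. The sketch below uses NumPy only; `ln_mel_to_log10_mel` and `expected_num_samples` are hypothetical helpers, and the hop size of 256 is inferred from the shapes posted above (503 frames, 128768 samples).

```python
import math

import numpy as np

# Assumed hop size, inferred from the shapes above: 503 frames -> 128768 samples.
HOP_LENGTH = 256


def ln_mel_to_log10_mel(mel_ln):
    """Rescale a natural-log mel spectrogram to log10 (hypothetical helper).

    Uses the identity log10(x) = ln(x) / ln(10).
    """
    return np.asarray(mel_ln, dtype=np.float32) / math.log(10.0)


def expected_num_samples(num_frames, hop_length=HOP_LENGTH):
    """Number of audio samples a vocoder with this hop size should produce."""
    return num_frames * hop_length


# Toy stand-in for mel_outputs: (batch, n_mels, frames) of natural-log magnitudes.
magnitudes = np.full((1, 80, 503), 0.5, dtype=np.float32)
mel_ln = np.log(np.clip(magnitudes, 1e-5, None))

mel_log10 = ln_mel_to_log10_mel(mel_ln)
num_samples = expected_num_samples(mel_log10.shape[-1])
```

If the log-base assumption holds, the equivalent change in the notebook would be to pass `mel_outputs.float() / math.log(10.0)` to `vocoder.inverse`. Rescaling only addresses the log base. If the mel filter banks also differ (for example a different `fmax`), the vocoder would likely need to be retrained on mels produced with Tacotron2's own settings.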
