Training multiple models #38
Editing wav_path in hparams.py turned out to be the key. I ran into another problem, however: the training scripts refuse to run at all with my new datasets. For the new ones, the respective train_dataset.pkl files only contain the following:
Got past that error message by converting all wav files to 16-bit, 22050 Hz, but then I ran into another error:
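For reference, a minimal sketch of such a conversion in Python (the dataset path is a placeholder, and librosa/soundfile are just one option; sox or ffmpeg work as well):

```python
# Minimal sketch: convert every wav in a dataset to 16-bit PCM at 22050 Hz.
# The path below is a placeholder; adjust it to your own dataset layout.
from pathlib import Path

import librosa
import soundfile as sf

WAV_DIR = Path('datasets/my_dataset/wavs')  # hypothetical path

for wav_path in sorted(WAV_DIR.glob('*.wav')):
    audio, sr = librosa.load(wav_path, sr=22050)     # resamples to 22050 Hz, mono float32
    sf.write(wav_path, audio, sr, subtype='PCM_16')  # overwrite as 16-bit PCM
```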
The generated train_dataset.pkl, val_dataset.pkl, and text_dict.pkl files don't have line breaks at all.
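These files are binary pickles (as noted further down), so the lack of line breaks is expected. A quick way to inspect them is to load them back with Python's pickle module; a minimal sketch, with the data path being an assumption:

```python
# Minimal sketch: load the pickle files produced by preprocess.py and check their sizes.
import pickle
from pathlib import Path

DATA_PATH = Path('data')  # hypothetical location of the preprocessed files

for name in ('train_dataset.pkl', 'val_dataset.pkl', 'text_dict.pkl'):
    with open(DATA_PATH / name, 'rb') as f:
        obj = pickle.load(f)
    # the train/val datasets should be non-empty lists of (file id, number) tuples,
    # and text_dict should map each file id to its phonemized transcript
    print(f'{name}: {type(obj).__name__} with {len(obj)} entries')
```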
Hi, did you also change the data path in hparams? Otherwise it would probably mix the two datasets. The error message indicates that there is no training file to be loaded. I would double-check whether the wav file names match the ids in metafile.csv (if you run preprocess.py, it should report how many files are used). As for the other questions:
You can switch by using the --hp_file and --tts_weights flags for the corresponding models. If your models differ in hyperparams, you would need to save the different hparams.py files somewhere. If the hparams are the same, just setting --tts_weights to the ***_weights.pyt model should be enough.
No.
If you don't change the tts_model_id in hparams.py, it is going to resume training the previous model; otherwise it creates a new directory with the new tts_model_id under checkpoints.
Just checked the binary pickled files. The training data for my first custom dataset is an empty array, while the training data for my second custom dataset, as well as the validation data and text dictionary for both datasets, look normal: the validation data and the properly generated training data are arrays of tuples, each containing a filename and a three-digit number, and the text dictionary is a large object pairing each filename with the IPA equivalent of its text transcript. Meanwhile, the mel, quant, and raw_pitch folders each have one .npy file for every wav file in the dataset, while the phon_pitch folders for both datasets are empty.
In this case it seems to me that there is a mismatch between the text ids and the wav file names, because only matching files are taken into account. Did you check this? I.e., you could debug preprocess.py and check how many files are filtered out at line 86. The stemmed wav file names should match the ids in the metafile (e.g. 00001|some text. corresponds to 00001.wav).
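One quick way to spot such a mismatch outside of preprocess.py is to compare the stemmed wav file names against the ids in metafile.csv, e.g. with a small script along these lines (the paths and the '|' separator are assumptions based on the example above):

```python
# Minimal sketch: find ids in metafile.csv with no matching wav file, and vice versa.
from pathlib import Path

DATASET_DIR = Path('datasets/my_dataset')  # hypothetical dataset location
WAV_DIR = DATASET_DIR / 'wavs'

wav_ids = {p.stem for p in WAV_DIR.glob('*.wav')}

with open(DATASET_DIR / 'metafile.csv', encoding='utf-8') as f:
    # each line looks like "00001|some text.", so the id is everything before the first '|'
    meta_ids = {line.split('|', 1)[0].strip() for line in f if line.strip()}

print('ids without a wav file:', sorted(meta_ids - wav_ids))
print('wav files without an id:', sorted(wav_ids - meta_ids))
```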
When running preprocess.py, there's no mismatch: the number of files found equals the number of indexed files. However, after I added more clips to the two datasets, running train_tacotron.py went smoothly for one of them (40 minutes split across 370 clips), while I still got an error message with the other (29 minutes split across 250 clips). Perhaps dataset size has something to do with these errors.
Also, I had to change lines 39 and 40 in my copy of train_tacotron.py to point it to my dataset's pickle files. Left unchanged, it kept trying to access the LJSpeech alignment files.
Good point, I will change the scripts to take the hparams setting into account. I honestly mostly leave the data naming the same and make copies of the dataset if I train a new model. Could you solve the issue with the smaller dataset? I'm not sure what you mean by adding clips to the dataset; you would have to preprocess the whole dataset again if you add clips... (otherwise it's not generating the correct train_dataset.pkl file).
I've trained a model on the LJSpeech dataset and found the results quite satisfactory after 25000 steps in ForwardTacotron. I'm now preparing several other datasets on which new models will be based.