Model Overfitting #39

Open
Vadim2S opened this issue Jan 16, 2023 · 2 comments

Comments

@Vadim2S

Vadim2S commented Jan 16, 2023

I have tried 4 different datasets. The biggest one contains 4 languages, with 20600 PNGs of 10-second spectrograms for each language.

No luck. Train accuracy is 0.97, but validation and test accuracy are 0.2 - 0.4. What dataset size should I use?

P.S. I am using your default config. I changed the code (a little) to use Keras 2 and TensorFlow 1.14.

@Themba4Sho

Hey there. 20600 specs should be enough to get you decent results. My two cents would be to: 1) check the language distribution in your datasets: is any one language heavily overrepresented compared to the rest? 2) Is your test data drawn from the same 20600? If it is not, the test set may be too different from your training data, and you might need to try some augmentation techniques to accommodate the uniqueness of your test set.
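The first check suggested above can be sketched in a few lines. This is only an illustration: it assumes the CSV files list one spectrogram per row as `path,label` with no header, which may not match what create_csv.py actually writes, and the function names `label_counts` and `imbalance_ratio` are hypothetical.

```python
import csv
from collections import Counter
from io import StringIO


def label_counts(csv_text):
    """Count rows per label in CSV text whose rows look like `path,label` (assumed layout)."""
    reader = csv.reader(StringIO(csv_text))
    return Counter(row[1] for row in reader if len(row) >= 2)


def imbalance_ratio(counts):
    """Largest class size divided by smallest; 1.0 means perfectly balanced."""
    return max(counts.values()) / min(counts.values())


# Hypothetical rows for illustration; a real file would be read with open(path).read().
example = "a.png,0\nb.png,1\nc.png,0\nd.png,1\n"
counts = label_counts(example)
print(dict(counts), imbalance_ratio(counts))
```

Running the same count on the training, validation, and test CSVs separately would show whether all three splits have a similar label distribution.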

@Vadim2S

Vadim2S commented Jan 24, 2023

The dataset preparation code of this project guarantees an equal number of specs for each language. However, I do not normalize the audio beforehand (neither do the authors).

I just ran wav_to_spectrogram.py, create_csv.py, and train.py with the default config.yaml and the topcoder_crnn_finetune model. Later I also tried the inceptionv3_crnn.py model.

I tried:

  1. Voxforge dataset - 5 languages with 10020 specs each. 100 pixels per second, i.e. 5-second specs with the default input shape
  2. Mozilla_Common_Voice - 4 languages with 16650 specs each. 100 pixels per second, i.e. 5-second specs
  3. MTEDX dataset - 4 languages with 21000 specs each. 50 pixels per second, i.e. 10-second specs
  4. MTEDX dataset - 3 languages with 43000 specs each. 50 pixels per second, i.e. 10-second specs
  5. My own native dataset - 9 languages with 5000 specs each. 100 pixels per second

The MTEDX dataset can be found here: https://www.openslr.org/100/
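The two resolution settings in the list work out to the same image width, which is presumably why the default input shape can be kept for both. A small sketch of that arithmetic (the helper names are hypothetical, not part of the project):

```python
def spectrogram_width(pixels_per_second, seconds):
    """Width in pixels of a spectrogram covering `seconds` of audio."""
    return pixels_per_second * seconds


def ms_per_pixel(pixels_per_second):
    """Audio duration represented by one pixel column, in milliseconds."""
    return 1000 / pixels_per_second


# The two settings quoted in the list above.
for pps, secs in [(100, 5), (50, 10)]:
    print(pps, "px/s x", secs, "s ->", spectrogram_width(pps, secs), "px,",
          ms_per_pixel(pps), "ms per pixel")
```

The width is the same, but at 50 pixels per second each column covers twice as much audio, so the MTEDX runs see a coarser time resolution than the Voxforge and Common Voice runs.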

Results:

Voxforge dataset - 5 languages with 10020 specs each - topcoder_crnn_finetune - Overfitting
loss: 0.4348 - accuracy: 0.9660 - val_loss: 17.8252 - val_accuracy: 0.2000
Epoch 00013: val_accuracy did not improve from 0.28015
Epoch 00013: early stopping

MTEDX dataset - 4 languages with 21000 specs each - topcoder_crnn_finetune - Overfitting
loss: 0.3180 - accuracy: 0.9607 - val_loss: 3.1414 - val_accuracy: 0.4399
Epoch 00013: val_accuracy did not improve from 0.67722
Epoch 00013: early stopping

MTEDX dataset - 3 languages with 43000 specs each - topcoder_crnn_finetune - Overfitting
loss: 0.3966 - accuracy: 0.9733 - val_loss: 6.0934 - val_accuracy: 0.3294
Epoch 00014: val_accuracy did not improve from 0.33373
Epoch 00014: early stopping

MTEDX dataset - 3 languages with 43000 specs each - inceptionv3_crnn - Just a very poor result
loss: 0.9407 - accuracy: 0.5551 - val_loss: 0.9851 - val_accuracy: 0.5081
Epoch 00031: val_accuracy did not improve from 0.53137
Epoch 00031: early stopping

Here is an interesting experiment: I commented out the layer.trainable=False line in the model. The result is much better, but still bad.

MTEDX dataset - 3 languages with 43000 specs each - topcoder_crnn_finetune (layer.trainable=False commented out) - Overfitting
loss: 0.2834 - accuracy: 0.9531 - val_loss: 1.4304 - val_accuracy: 0.6554
Epoch 00016: val_accuracy did not improve from 0.82805
Epoch 00016: early stopping

No luck.
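One way to read the final-epoch val_accuracy values above is against chance level. The sketch below assumes perfectly balanced classes (which the dataset preparation code is said to guarantee), so that random guessing scores 1 / number of languages; the run labels are shorthand for the runs reported above.

```python
def chance_level(num_classes):
    """Expected accuracy of random guessing on balanced classes."""
    return 1.0 / num_classes


# (run description, number of languages, final-epoch val_accuracy) from the logs above.
runs = [
    ("Voxforge, topcoder_crnn_finetune", 5, 0.2000),
    ("MTEDX 4 languages, topcoder_crnn_finetune", 4, 0.4399),
    ("MTEDX 3 languages, topcoder_crnn_finetune", 3, 0.3294),
    ("MTEDX 3 languages, inceptionv3_crnn", 3, 0.5081),
    ("MTEDX 3 languages, topcoder_crnn_finetune, layers trainable", 3, 0.6554),
]

for name, num_classes, val_acc in runs:
    margin = val_acc - chance_level(num_classes)
    print(f"{name}: chance={chance_level(num_classes):.4f}, margin={margin:+.4f}")
```

Under that assumption, the Voxforge run and the frozen 3-language MTEDX run end at or slightly below chance, which may point to a problem beyond ordinary overfitting, for example in how the frozen layers behave after the Keras 2 port.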

P.S. A sample of the model conversion code is below; it is very simple:

Original:

    model.add(Convolution2D(16, 7, 7, W_regularizer=l2(weight_decay), activation="relu", input_shape=input_shape))

My code for Keras 2 and TF 1.14:

    model.add(Conv2D(16, (7, 7), activation="relu", input_shape=input_shape, kernel_regularizer=l2(weight_decay)))
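The conversion above follows the standard Keras 1 to Keras 2 renames. As a rough aid for checking the remaining layers, here is a small lookup of some keyword renames; it is not exhaustive, and the helper `rename_kwargs` is hypothetical rather than part of Keras or this project.

```python
# A few Keras 1 -> Keras 2 keyword renames (not a complete list).
KERAS1_TO_KERAS2_KWARGS = {
    "W_regularizer": "kernel_regularizer",
    "b_regularizer": "bias_regularizer",
    "W_constraint": "kernel_constraint",
    "b_constraint": "bias_constraint",
    "init": "kernel_initializer",
    "border_mode": "padding",
    "subsample": "strides",
}


def rename_kwargs(kwargs):
    """Return a copy of `kwargs` with known Keras 1 names replaced by Keras 2 names."""
    return {KERAS1_TO_KERAS2_KWARGS.get(key, key): value for key, value in kwargs.items()}


# Keyword arguments of the original Convolution2D call, with a placeholder regularizer value.
old_kwargs = {"W_regularizer": "l2(weight_decay)", "activation": "relu"}
print(rename_kwargs(old_kwargs))
```

Renaming keywords covers only part of a port: positional arguments also change (for example, the kernel size becomes a tuple, as in the Conv2D line above), and layer behaviour may differ between versions, so each layer is worth checking individually.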
