Model Overfitting #39

Vadim2S · 2023-01-16T07:57:46Z

I have tried 4 different datasets. The biggest one contains 4 languages, with 20600 PNGs of 10-second spectrograms for each language.

No luck. Train accuracy is 0.97, but validation and test accuracy are 0.2 - 0.4. What dataset size should I use?

P.S. I am using your default config. I changed the code (a little) to use Keras 2 and TensorFlow 1.14.

Themba4Sho · 2023-01-24T09:12:09Z

Hey there. 20600 specs should be enough to get you decent results. My two cents would be to: 1) check the language distribution in your datasets: is any language far more heavily represented than the rest? 2) Is your test data from the same 20600? If it is not, then it's possible that the test set is too different from your training data, and you might need to try some augmentation techniques to accommodate the uniqueness of your test set.
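A quick way to check point 1 is to count the labels in the training CSV. This is only a minimal sketch: it assumes each row has the form `path,label`, which may not match what create_csv.py actually writes, and the file names and labels in the sample are made up.

```python
import csv
import io
from collections import Counter


def label_distribution(csv_file):
    """Count spectrograms per label in a CSV with rows like `path,label` (assumed layout)."""
    reader = csv.reader(csv_file)
    return Counter(row[1].strip() for row in reader if len(row) >= 2)


def imbalance_ratio(counts):
    """Largest class size divided by smallest; 1.0 means perfectly balanced."""
    return max(counts.values()) / min(counts.values())


# Tiny in-memory stand-in for a real training CSV (hypothetical paths and labels).
sample = io.StringIO("a.png,0\nb.png,0\nc.png,1\nd.png,2\ne.png,2\nf.png,2\n")
counts = label_distribution(sample)
print(dict(counts), imbalance_ratio(counts))
```

For a real file, pass `open("training.csv", newline="")` instead of the in-memory sample (the file name is a placeholder). A ratio well above 1.0 would point to an imbalance worth fixing before training.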

Vadim2S · 2023-01-24T10:26:26Z

The dataset preparation code of this project guarantees an equal number of specs for each language. However, I do not normalize the audio beforehand (neither do the authors).

I just ran wav_to_spectrogram.py, create_csv.py, and train.py with the default config.yaml and the topcoder_crnn_finetune model. Later I also tried the inceptionv3_crnn.py model.

I tried:

Voxforge dataset - 5 languages with 10020 specs each. 100 pixels per second, i.e. 5-second specs with the default input shape
Mozilla_Common_Voice - 4 languages with 16650 specs each. 100 pixels per second, i.e. 5-second specs
MTEDX dataset - 4 languages with 21000 specs each. 50 pixels per second, i.e. 10-second specs
MTEDX dataset - 3 languages with 43000 specs each. 50 pixels per second, i.e. 10-second specs
My own native dataset - 9 languages with 5000 specs each. 100 pixels per second
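The two resolutions in the list above cover the same number of pixels along the time axis, so both fit the same input shape. A small sketch of the arithmetic, assuming time runs along the image width (the function name is made up for illustration):

```python
def segment_width_px(pixels_per_second, seconds):
    """Width in pixels of one spectrogram segment along the time axis."""
    return pixels_per_second * seconds


# 100 px/s over 5 s and 50 px/s over 10 s give the same image width.
width_fine = segment_width_px(100, 5)
width_coarse = segment_width_px(50, 10)
print(width_fine, width_coarse)
```

The trade-off is that the 50 pixels per second setting covers twice as much audio per image at half the time resolution.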

The MTEDX dataset can be found here: https://www.openslr.org/100/

Results:

Voxforge dataset - 5 languages with 10020 specs each - topcoder_crnn_finetune - Overfitting
loss: 0.4348 - accuracy: 0.9660 - val_loss: 17.8252 - val_accuracy: 0.2000
Epoch 00013: val_accuracy did not improve from 0.28015
Epoch 00013: early stopping

MTEDX dataset - 4 languages with 21000 specs each - topcoder_crnn_finetune - Overfitting
loss: 0.3180 - accuracy: 0.9607 - val_loss: 3.1414 - val_accuracy: 0.4399
Epoch 00013: val_accuracy did not improve from 0.67722
Epoch 00013: early stopping

MTEDX dataset - 3 languages with 43000 specs each - topcoder_crnn_finetune - Overfitting
loss: 0.3966 - accuracy: 0.9733 - val_loss: 6.0934 - val_accuracy: 0.3294
Epoch 00014: val_accuracy did not improve from 0.33373
Epoch 00014: early stopping

MTEDX dataset - 3 languages with 43000 specs each - inceptionv3_crnn - Just a very poor result
loss: 0.9407 - accuracy: 0.5551 - val_loss: 0.9851 - val_accuracy: 0.5081
Epoch 00031: val_accuracy did not improve from 0.53137
Epoch 00031: early stopping

Here is an interesting experiment: I commented out the layer.trainable=False line in the model. The result is much better but still bad.

MTEDX dataset - 3 languages with 43000 specs each - topcoder_crnn_finetune - Overfitting
loss: 0.2834 - accuracy: 0.9531 - val_loss: 1.4304 - val_accuracy: 0.6554
Epoch 00016: val_accuracy did not improve from 0.82805
Epoch 00016: early stopping

No luck.
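To put the final val_accuracy values above in context, they can be compared with the chance level for each run. This is a rough sketch that assumes the classes are balanced, so that random guessing scores 1 / number of languages; the run labels are just shorthand for the results listed above.

```python
def chance_accuracy(num_classes):
    """Expected accuracy of random guessing over balanced classes."""
    return 1.0 / num_classes


# (run label, number of languages, final val_accuracy from the logs above)
runs = [
    ("Voxforge, 5 languages", 5, 0.2000),
    ("MTEDX, 4 languages", 4, 0.4399),
    ("MTEDX, 3 languages", 3, 0.3294),
    ("MTEDX, 3 languages, inceptionv3_crnn", 3, 0.5081),
    ("MTEDX, 3 languages, layers trainable", 3, 0.6554),
]

# Margin of each run over its chance level; values near zero mean "no better than guessing".
margins = {name: val_acc - chance_accuracy(n) for name, n, val_acc in runs}

for name, margin in margins.items():
    print(f"{name}: margin over chance = {margin:+.4f}")
```

Under that assumption, the Voxforge run and the first 3-language MTEDX run end at roughly chance level, while the other runs end clearly above it.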

P.S. A sample of the model conversion code is below; it is very simple:

Original:

    model.add(Convolution2D(16, 7, 7, W_regularizer=l2(weight_decay), activation="relu", input_shape=input_shape))

My code for Keras 2 and TF 1.14:

    model.add(Conv2D(16, (7, 7), activation="relu", input_shape=input_shape, kernel_regularizer=l2(weight_decay)))
