Hi, I noticed that you also force the encoder attention to be diagonal for some steps, and I found that after training completes the alignment remains diagonal. My question is: why do we need more encoder layers if all they have to do is stay diagonal? Did you see any issues when not forcing the encoder attention to be diagonal? Any other observations?
Also, I have often seen papers where the mel outputs are well predicted in the higher-frequency region of the mel spectrogram, but in all my trainings the results come out a little blurry toward the top of the mel spectrogram. Does this have anything to do with convergence? Any ideas about what might be going wrong?
Hi,
in my experiments the encoder alignments are rather optional; that's why I set the forcing to a lower number of steps than for the decoder. You can probably safely set it to 0. I didn't experiment extensively, but I didn't notice a drawback. Also, without forcing this diagonality, almost all of the encoder heads in the aligner tend to become diagonal eventually (typically 0-1 per layer stay scattered).
With fewer layers, however, I did see a reduction in the quality of the predicted mels.
If you do perform a more complete analysis, it would be great to hear the results!
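For concreteness, here is a minimal sketch of what "forcing the attention to be diagonal for some steps" can look like: a guided-attention-style penalty that punishes attention mass far from the diagonal, applied to encoder self-attention only below a step threshold and to the decoder cross-attention for longer. The function names, tensor shapes, band width `g`, and step counts below are illustrative assumptions, not the repo's actual code or defaults.

```python
import numpy as np

def diagonality_loss(attention, g=0.2):
    """Guided-attention-style penalty: attention mass far from the diagonal is penalized.

    attention: array of shape (heads, T_out, T_in), rows summing to 1.
    g: width of the tolerated band around the diagonal (hypothetical default).
    """
    heads, t_out, t_in = attention.shape
    n = np.arange(t_out)[:, None] / max(t_out - 1, 1)   # normalized output positions
    m = np.arange(t_in)[None, :] / max(t_in - 1, 1)     # normalized input positions
    # penalty grows with distance from the diagonal n == m
    w = 1.0 - np.exp(-((n - m) ** 2) / (2.0 * g ** 2))  # shape (T_out, T_in)
    return float(np.mean(attention * w[None]))

def attention_loss(enc_attn, dec_attn, step,
                   encoder_diagonal_steps=10_000,
                   decoder_diagonal_steps=100_000):
    """Apply the diagonal penalty to encoder heads only for the first
    encoder_diagonal_steps steps, and to decoder cross-attention heads
    for a longer schedule (step counts are illustrative)."""
    loss = 0.0
    if step < encoder_diagonal_steps:
        loss += diagonality_loss(enc_attn)
    if step < decoder_diagonal_steps:
        loss += diagonality_loss(dec_attn)
    return loss
```

Setting the hypothetical `encoder_diagonal_steps` to 0 corresponds to not forcing encoder diagonality at all, which is what the comment above suggests should be safe.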