fix a bug, which has the little probability of producing nan in the loss #111

yuanlonghui · 2022-05-09T13:56:02Z

This is the simplest way to avoid computing log(0), and it is necessary when the embedding dimension is large. With very high-dimensional feature representations, the maximum inner product in each row is very likely to be the inner product of a feature with itself. Subtracting that maximum leaves large negative values in the off-diagonal positions, so after exp() every entry except the diagonal is very likely to underflow to zero. Since the diagonal position is excluded from the sum inside log(), there is a chance that log(0) is computed, which results in NaN.
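To make the failure mode concrete, here is a minimal pure-Python sketch of a single row of the similarity matrix. It is an illustration under assumptions, not the code changed in this PR: the function name `masked_log_prob_sum`, the `eps` parameter, the example values, and the way the loss is aggregated (a 0/1 mask multiplied into the log-probabilities) are all hypothetical. Adding a small constant inside the log is one common guard and may differ from the actual fix.

```python
import math

def masked_log_prob_sum(row, anchor, positives, eps=0.0):
    """Sum of log-probabilities over the positive positions of one row.

    row:       similarity logits of one anchor against every sample
    anchor:    index of the anchor itself (excluded from the denominator)
    positives: set of indices that share the anchor's label
    eps:       hypothetical guard added inside the log
    """
    row_max = max(row)
    shifted = [x - row_max for x in row]  # subtract the max for stability
    # Denominator skips the anchor's own (diagonal) position.
    denom = sum(math.exp(s) for j, s in enumerate(shifted) if j != anchor) + eps
    # math.log(0.0) raises ValueError; tensor libraries return -inf instead,
    # so -inf is emulated here to mirror that behaviour.
    log_denom = math.log(denom) if denom > 0.0 else float("-inf")
    total = 0.0
    for j, s in enumerate(shifted):
        weight = 1.0 if j in positives else 0.0
        total += weight * (s - log_denom)  # 0.0 * inf evaluates to nan
    return total

# Hypothetical row: the self-similarity at index 0 dominates everything else,
# so every off-diagonal exp() underflows to 0.0.
row = [800.0, 10.0, -5.0, 3.0]
print(masked_log_prob_sum(row, anchor=0, positives={1}))            # nan
print(masked_log_prob_sum(row, anchor=0, positives={1}, eps=1e-6))  # finite, about -776.18
```

In this sketch the NaN does not come from log(0) directly: the log yields -inf, the log-probability becomes +inf, and multiplying it by a mask value of 0 produces NaN. With `eps` the denominator can never be exactly zero, so every term stays finite.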

Dara-to-win · 2023-03-13T03:47:37Z

Thank you very much for fixing the NaN loss. Does your loss keep getting higher as training goes on? I look forward to your reply!

yaoerqin · 2024-03-24T06:43:25Z

Did you solve this?

fix a bug, which has the little probability of producing nan in the loss

4814037
