When I train with KTO, the KL value quickly drops to 0, is this normal?

The KL value is abnormally low here. This can be due to a few reasons:

1. The learning rate is too high. It should be between 5e-7 and 5e-6. If it's higher than that, the rewards of both the chosen and rejected examples become very negative (although rewards/chosen > rewards/rejected). Since the KL term is estimated by taking the average reward of randomly mismatched input-output pairs and then clamping it at 0, if all the rewards are negative, the KL estimate will be zero.

2. Related to the first point, the loss beta is too low. The lower beta is, the lower the learning rate must be to ensure that rewards/chosen stays positive and rewards/rejected stays negative, which in turn allows the unrelated input-output pairs used to estimate the KL term to have weakly positive rewards.

3. The model doesn't have enough capacity to learn why the chosen examples are good, so it just pushes down the probability of all the rejected examples to compensate. This leads to a collapse in rewards (they all become negative), and the KL estimate becomes zero.
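The clamping behavior described above is easy to see in isolation. Below is a minimal sketch (not the actual trl KTOTrainer code; the function name and reward values are hypothetical) of estimating the KL term as the mean reward over mismatched input-output pairs, clamped at 0 from below:

```python
# Illustrative sketch of the KTO KL estimate, NOT the trl implementation.
# "Rewards" here stand in for beta-scaled log-prob differences between
# the policy and the reference model on mismatched input-output pairs.

def kl_estimate(mismatched_rewards):
    """Average reward over mismatched pairs, clamped at 0 from below."""
    mean = sum(mismatched_rewards) / len(mismatched_rewards)
    return max(mean, 0.0)

# Healthy training: mismatched pairs carry weakly positive rewards,
# so the KL estimate is a small positive number.
print(kl_estimate([0.3, 0.1, 0.2, 0.05]))     # small positive value

# Collapsed training (learning rate too high or beta too low): every
# reward is negative, so the clamp pins the KL estimate to exactly 0.
print(kl_estimate([-4.2, -3.8, -5.1, -4.5]))  # prints 0.0
```

This is why a KL of exactly 0, held over many steps, is a symptom of reward collapse rather than a healthy training signal.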