Add `DebiasedMultipleNegativesRankingLoss` to the losses #3148

ilanaliouchouche · 2024-12-30T13:50:41Z

This PR introduces the Debiased Contrastive Loss from the paper "Debiased Contrastive Learning" (Chuang et al., NeurIPS 2020). The loss aims to reduce false-negative bias, which occurs when samples treated as negatives are in fact semantically similar to the anchor. Such bias can harm embedding quality and reduce performance on downstream tasks, as shown in the paper's results.

The integration follows the same structure as the other losses in the losses package, with full documentation and a citation method to reference the original work. This loss extends `MultipleNegativesRankingLoss` with an additional hyperparameter, `tau_plus`, that controls the bias correction. It is therefore compatible with methods like GenQ (see Query Generation Example).

In this implementation, I focus on the case $M = 1$, where $M$ is the number of positive examples associated with each anchor, so each anchor has exactly one positive sample. The approach can be extended to multiple positive samples ($M > 1$), which could be a direction for future development.
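To make the role of `tau_plus` concrete, here is a minimal sketch of the debiased objective for $M = 1$ with in-batch negatives. It follows the estimator described in the paper and is not the code from this PR. The function name `debiased_contrastive_loss`, the `temperature` argument, and the plain nested-list similarity matrix are assumptions made for illustration, and the similarities are assumed to be cosine values in $[-1, 1]$ (which is what justifies the lower bound used for clamping).

```python
import math


def debiased_contrastive_loss(sim, tau_plus=0.1, temperature=0.05):
    """Illustrative debiased contrastive loss for one positive per anchor.

    sim[i][j] is the cosine similarity between anchor i and candidate j.
    The diagonal holds the positives; every other entry in a row is
    treated as an in-batch negative. (Hypothetical helper, not the PR's API.)
    """
    batch = len(sim)
    n_neg = batch - 1
    # Lower bound on the corrected negative term: n_neg * exp(-1 / t),
    # valid when similarities lie in [-1, 1].
    floor = n_neg * math.exp(-1.0 / temperature)

    total = 0.0
    for i, row in enumerate(sim):
        pos = math.exp(row[i] / temperature)
        neg_sum = sum(
            math.exp(s / temperature) for j, s in enumerate(row) if j != i
        )
        # Subtract the expected contribution of false negatives, then
        # rescale by the probability of drawing a true negative.
        ng = (neg_sum - tau_plus * n_neg * pos) / (1.0 - tau_plus)
        ng = max(ng, floor)
        total += -math.log(pos / (pos + ng))
    return total / batch
```

With `tau_plus=0` the correction vanishes and the expression reduces to the usual softmax cross-entropy over in-batch candidates, which is the objective `MultipleNegativesRankingLoss` optimizes. A larger `tau_plus` assumes more of the in-batch negatives are actually positives and discounts the negative term accordingly.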

TODO: Prepare PR

ilanaliouchouche added 2 commits December 29, 2024 02:03

todo: final formula + doc + add scales(temp)

1ec2e03

Loss Class & Doc Done.

32c41db
