Add DebiasedMultipleNegativesRankingLoss to the losses #3148

Status: Open, 2 commits to merge into base: master


ilanaliouchouche

This PR introduces the Debiased Contrastive Loss from the paper "Debiased Contrastive Learning" (Chuang et al., NeurIPS 2020). The loss is meant to reduce false-negative bias, which arises when samples treated as negatives are in fact semantically similar to the anchor. According to the paper's results, this bias can degrade embedding quality and hurt performance on downstream tasks.

The integration follows the same structure as the other losses in the losses package, with full documentation and a citation method that references the original work. The loss is a debiased variant of MultipleNegativesRankingLoss with one additional hyperparameter, tau_plus, which controls the bias correction. It is therefore compatible with methods like GenQ (see Query Generation Example).
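
To make the role of tau_plus concrete, here is a minimal pure-Python sketch of the debiased estimator for a single anchor, following the formulation in Chuang et al. rather than the code in this PR. The function name `debiased_contrastive_loss`, the `temperature` parameter, and the default values are assumptions chosen for illustration; the PR's actual signature may differ (MultipleNegativesRankingLoss, for instance, uses a scale factor instead of a temperature).

```python
import math
from typing import Sequence


def debiased_contrastive_loss(
    pos_sim: float,
    neg_sims: Sequence[float],
    tau_plus: float = 0.1,     # assumed prior probability that a "negative" is a false negative
    temperature: float = 0.5,  # hypothetical name; illustrative default
) -> float:
    """Debiased contrastive loss for one anchor with one positive (M = 1).

    pos_sim and neg_sims are cosine similarities in [-1, 1].
    This is an illustrative sketch, not the implementation from the PR.
    """
    if not neg_sims:
        raise ValueError("at least one negative similarity is required")
    if not 0.0 <= tau_plus < 1.0:
        raise ValueError("tau_plus must lie in [0, 1)")

    n = len(neg_sims)
    pos = math.exp(pos_sim / temperature)
    neg_sum = sum(math.exp(s / temperature) for s in neg_sims)

    # Subtract the expected contribution of false negatives, then rescale.
    ng = (neg_sum - tau_plus * n * pos) / (1.0 - tau_plus)
    # Clamp at the theoretical minimum so the log argument stays valid.
    ng = max(ng, n * math.exp(-1.0 / temperature))

    return -math.log(pos / (pos + ng))


if __name__ == "__main__":
    sims = [0.1, 0.3, -0.2]
    print("standard :", debiased_contrastive_loss(0.8, sims, tau_plus=0.0))
    print("debiased :", debiased_contrastive_loss(0.8, sims, tau_plus=0.1))
```

With tau_plus set to 0 the correction term vanishes and the expression reduces to the usual softmax-style contrastive loss over one positive and the given negatives, which is why the loss can be seen as a generalization of the existing one.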

In this implementation, I focus on the case $M = 1$, where each anchor has exactly one positive sample. The approach can be extended to multiple positive samples per anchor ($M > 1$), which could be a direction for future development. (Here, $M$ is the number of positive examples associated with each anchor.)
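
For reference, the estimator from the paper can be written as follows. The notation is an assumption made here for readability: $s(\cdot,\cdot)$ is the similarity between two embeddings, $t$ a temperature, $N$ the number of negatives $u_i$, $v_j$ the positives, and $\tau^+$ the value of tau_plus.

$$
g\left(x, \{u_i\}_{i=1}^{N}, \{v_j\}_{j=1}^{M}\right) = \max\left\{ \frac{1}{1-\tau^+}\left( \frac{1}{N}\sum_{i=1}^{N} e^{s(x,u_i)/t} - \tau^+ \frac{1}{M}\sum_{j=1}^{M} e^{s(x,v_j)/t} \right),\; e^{-1/t} \right\}
$$

$$
\mathcal{L} = -\log \frac{e^{s(x,x^+)/t}}{e^{s(x,x^+)/t} + N\, g\left(x, \{u_i\}_{i=1}^{N}, \{v_j\}_{j=1}^{M}\right)}
$$

For $M = 1$ the inner average over positives collapses to the single term $e^{s(x,x^+)/t}$, so no extra positives need to be sampled per anchor. Supporting $M > 1$ would mean averaging over several positives in that term.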
