Using Matryoshka loss with Cached losses #3059

Closed
Marcel256 opened this issue Nov 15, 2024 · 8 comments · Fixed by #3068
Labels
good second issue (Looking for help; but may be hard to implement)

Comments

@Marcel256
Contributor

I want to use MatryoshkaLoss along with Cached losses such as CachedMultipleNegativesRankingLoss and CachedGISTEmbedLoss. However, the current implementation doesn't allow for this combination. I've reviewed the existing code and I'm willing to contribute this feature myself. Are there already any plans for how it could be implemented?

One simple idea is to create subclasses of the existing losses and add the Matryoshka logic within the calculate_loss functions. This way, we can keep the current losses unchanged, though it would lead to some code duplication. Other approaches would require more refactoring.

If anyone has ideas or suggestions, I'd love to hear them!
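
For reference, the combination in question looks roughly like this (the model name and dimensions below are only illustrative):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import (
    CachedMultipleNegativesRankingLoss,
    MatryoshkaLoss,
)

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

# Wrap a cached loss in MatryoshkaLoss. At the time of this issue, training
# with this combination fails with a tensor size mismatch.
inner_loss = CachedMultipleNegativesRankingLoss(model, mini_batch_size=32)
loss = MatryoshkaLoss(model, inner_loss, matryoshka_dims=[768, 512, 256, 128, 64])
```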

@tomaarsen added the good second issue label Nov 15, 2024
@tomaarsen
Collaborator

Hello!

Yes, I agree that this would be quite valuable. I use MatryoshkaLoss whenever possible, and I don't see myself often training with non-Cached MNRL anymore. So, the incompatibility is bothersome.
Ideally, I'd like to keep all changes within MatryoshkaLoss itself, to keep things more modular and separate. I think it might be feasible; I just haven't looked into it too much. I know there's a shape issue, presumably because MatryoshkaLoss does some magic to reduce the embedding shapes. I suspect it's possible to modify the behaviour of MatryoshkaLoss when it wraps a Cached loss, for example to always return the full embedding when requires_grad=False, but this might not work for evaluation.
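
Roughly, that reduction amounts to truncating each embedding to the target dimension and re-normalizing before the wrapped loss is called, conceptually something like this (an illustration, not the actual implementation):

```python
import torch
import torch.nn.functional as F

def truncate_embeddings(embeddings: torch.Tensor, dim: int) -> torch.Tensor:
    # Keep only the first `dim` features and re-normalize. MatryoshkaLoss then
    # evaluates the wrapped loss once per configured dimension and sums the
    # (optionally weighted) results.
    return F.normalize(embeddings[..., :dim], p=2, dim=-1)

full = torch.randn(8, 768)
print(truncate_embeddings(full, 128).shape)  # torch.Size([8, 128])
```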

Either way, I'm certainly open to PRs here.

  • Tom Aarsen

@GTimothee
Copy link

GTimothee commented Nov 15, 2024

I think I was able to reproduce the error. Was it something like RuntimeError: inconsistent tensor size, expected tensor [768] and src [128] to have the same number of elements, but got 768 and 128 elements respectively?

Edit: Confirmed. At first I did not see the UserWarning: MatryoshkaLoss is not compatible with CachedMultipleNegativesRankingLoss.

@tomaarsen
Collaborator

That was indeed the error. I added the warning after I noticed that MatryoshkaLoss was not compatible with the Cached... losses.

@Squishedmac

Hi, how can I reproduce this issue? I'd like to try and see if I can fix it :) Thanks in advance!

@GTimothee

Here is a fix proposal, @tomaarsen: #3065

Tell me what you think about it, and if you like it I can finish it. I tested it with CachedMultipleNegativesRankingLoss and it seems to work 👍

@Squishedmac just set up a training script with CachedMultipleNegativesRankingLoss and MatryoshkaLoss and it should break.
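
A minimal sketch of such a script (the model, data, and dimensions are placeholders; any small anchor/positive dataset should do):

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import (
    CachedMultipleNegativesRankingLoss,
    MatryoshkaLoss,
)

# Tiny (anchor, positive) dataset, just enough to trigger a training step.
train_dataset = Dataset.from_dict({
    "anchor": ["What is the capital of France?", "How do planes fly?"],
    "positive": [
        "Paris is the capital of France.",
        "Planes fly because their wings generate lift.",
    ],
})

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
inner_loss = CachedMultipleNegativesRankingLoss(model, mini_batch_size=2)
loss = MatryoshkaLoss(model, inner_loss, matryoshka_dims=[384, 256, 128, 64])

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()  # at the time of this issue, fails with the tensor size mismatch above
```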

@tomaarsen
Collaborator

I think you indeed found the cause with your PR: the backward call doesn't include the embedding reduction. Hmm, I wonder if there's a nice solution that stays fully within MatryoshkaLoss, but I'm not very confident right now.

  • Tom Aarsen

@Marcel256
Contributor Author

I created a draft PR with another potential solution: #3068
The main idea is to reuse the embeddings computed in the first step of the cached losses, with a separate decorator that handles the truncation of the embeddings.
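
For readers without the PR open, the general shape of that idea might look something like the following; the class and its methods are invented here for illustration and do not necessarily match what #3068 actually does:

```python
import torch
import torch.nn.functional as F

class TruncationDecorator:
    # Hypothetical illustration: wrap whatever callable produces the cached
    # embeddings and truncate its output to the current Matryoshka dimension,
    # so the already-computed full-size embeddings are reused for every
    # dimension instead of being re-encoded.
    def __init__(self, fn):
        self.fn = fn
        self.dim = None

    def set_dim(self, dim: int) -> None:
        self.dim = dim

    def __call__(self, *args, **kwargs):
        embeddings = self.fn(*args, **kwargs)
        if self.dim is None:
            return embeddings
        return F.normalize(embeddings[..., : self.dim], p=2, dim=-1)


# Usage sketch: 768-dim embeddings truncated to 256 dims.
decorated = TruncationDecorator(lambda batch: batch)
decorated.set_dim(256)
print(decorated(torch.randn(4, 768)).shape)  # torch.Size([4, 256])
```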

@tomaarsen
Collaborator

Solid work! I'm experimenting with that PR now to see what the performance is like, @Marcel256.
