
FileNotFoundError when using SentenceTransformerTrainingArguments(load_best_model_at_end=True) and Peft #34747

GTimothee opened this issue Nov 15, 2024 · 4 comments


GTimothee commented Nov 15, 2024

System Info

I used the default Google Colab environment with the latest versions of transformers and sentence-transformers.

Who can help?

@muellerzr

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Here is the example code as a gist: gist

Just open the gist in a Colab notebook and run it. A hypothetical minimal sketch of the failing setup is below.
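(Not the contents of the gist; a hypothetical sketch of the kind of setup that triggers the error. The model name, dataset, and hyperparameters are placeholders, and it assumes the add_adapter API that sentence-transformers added with its PEFT integration.)

```python
from datasets import load_dataset
from peft import LoraConfig, TaskType
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("all-MiniLM-L6-v2")
# Attach a PEFT adapter via the transformers.integrations machinery
model.add_adapter(LoraConfig(task_type=TaskType.FEATURE_EXTRACTION))

train_dataset = load_dataset("sentence-transformers/all-nli", "pair", split="train[:1000]")
eval_dataset = load_dataset("sentence-transformers/all-nli", "pair", split="dev[:100]")

args = SentenceTransformerTrainingArguments(
    output_dir="out",
    eval_strategy="steps",
    eval_steps=50,
    save_steps=50,
    load_best_model_at_end=True,  # the option that triggers the crash
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=losses.MultipleNegativesRankingLoss(model),
)
# Fails at the end of training: _load_best_model looks for a full model
# checkpoint in the best checkpoint directory, but only an adapter was saved.
trainer.train()
```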

Expected behavior

This is a follow-up to another bug found in sentence-transformers. The sentence-transformers library recently integrated PEFT using transformers.integrations. The bug: when using SentenceTransformerTrainingArguments(load_best_model_at_end=True), a FileNotFoundError is raised because the trainer tries to load a regular checkpoint file (.pth) while only an adapter was saved. Looking into the load_best_model logic, it simply reuses the implementation from transformers.trainer.Trainer, so the fix has to go into the transformers library itself.

The issue is that transformers has a function that checks whether the model is a PeftMixedModel or not. If it is not, the model is not considered a PEFT model and the trainer tries to load it as usual. The problem is that our model is a PeftAdapterMixin, so it is not recognized as a PEFT model.

See also: UKPLab/sentence-transformers#3056

In my opinion, we need to turn this into a two-step check: 1) is the model a PeftAdapterMixin, and 2) does it have adapters loaded? That may only be one part of the solution, though; we might also need a dedicated loading snippet directly in transformers.trainer.Trainer._load_best_model. A rough sketch of the two-step check is below.
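(A rough sketch only, not a tested patch. It assumes the private _hf_peft_config_loaded flag that transformers' PeftAdapterMixin sets once an adapter has been added or loaded.)

```python
from transformers.integrations import PeftAdapterMixin

def _is_peft_model(model):
    # ... keep the existing isinstance checks for PeftModel / PeftMixedModel ...
    # Step 1: is the model a PeftAdapterMixin (every PreTrainedModel is)?
    # Step 2: does it actually have a PEFT adapter loaded?
    return isinstance(model, PeftAdapterMixin) and getattr(
        model, "_hf_peft_config_loaded", False
    )
```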


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Rocketknight1 (Member) commented

cc @muellerzr @SunMarc - seems like a bad interaction between PEFT and Trainer's assumptions

SunMarc (Member) commented Dec 24, 2024

Thanks for the report @GTimothee; this may be interesting to you too, @BenjaminBossan. Would you like to submit a PR to fix it, @GTimothee?

BenjaminBossan (Member) commented

Thanks @GTimothee for the report and @SunMarc for the ping. Indeed, the analysis is correct. For the fix, it would probably be sufficient to adjust _is_peft_model to additionally check if any(isinstance(module, BaseTunerLayer) for module in model.modules()) (where BaseTunerLayer comes from peft.tuners.tuners_utils). I haven't verified this, though.
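(For illustration, a sketch of that suggestion applied to transformers' _is_peft_model helper; equally unverified, and the existing-check part is simplified relative to the real function, which also handles peft version gating.)

```python
from peft import PeftModel, PeftMixedModel
from peft.tuners.tuners_utils import BaseTunerLayer

def _is_peft_model(model):
    # Current behavior: only direct PeftModel / PeftMixedModel instances qualify.
    if isinstance(model, (PeftModel, PeftMixedModel)):
        return True
    # Suggested addition: also treat models that contain injected tuner layers
    # (e.g. adapters added via PeftAdapterMixin.add_adapter) as PEFT models.
    return any(isinstance(module, BaseTunerLayer) for module in model.modules())
```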
