Setting compute_metrics in Trainer with Idefics2ForConditionalGeneration leads to AttributeError: 'DynamicCache' object has no attribute 'detach' #30631
Comments
I had the same error and fixed it by using …
That fixes this issue as the past_key_values are now full tensors.
Yes, this is due to batches having different lengths of …. One fix is to pad all examples to the same length (i.e. using …).
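Not quoted from the thread, just a minimal sketch of the padding workaround, assuming an Idefics2 processor and a dataset with hypothetical "image" and "text" fields; MAX_LENGTH is a made-up constant you would tune for your data:

```python
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b")  # assumed checkpoint
MAX_LENGTH = 768  # hypothetical fixed length shared by every example

def collate_fn(examples):
    # Pad/truncate every example to the same length so eval batches have
    # identical shapes and can be concatenated across evaluation steps.
    texts = [ex["text"] for ex in examples]       # assumed dataset field
    images = [[ex["image"]] for ex in examples]   # assumed dataset field
    batch = processor(
        text=texts,
        images=images,
        padding="max_length",
        max_length=MAX_LENGTH,
        truncation=True,
        return_tensors="pt",
    )
    labels = batch["input_ids"].clone()
    labels[labels == processor.tokenizer.pad_token_id] = -100  # ignore padding in the loss
    batch["labels"] = labels
    return batch
```

If your transformers version has it, eval_do_concat_batches=False in TrainingArguments is another way to avoid concatenating unevenly sized tensors across eval batches.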
I don't have a better fix!
I think the cache problem should be fixed by converting …. These changes are partially related to the issue of making language models "compile" compatible, and should be available soon 🤗
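Reading between the lines, the conversion presumably means turning the DynamicCache back into the legacy nested-tuple format; a minimal sketch, assuming transformers' DynamicCache.to_legacy_cache() (the helper name is made up):

```python
from transformers.cache_utils import DynamicCache

def make_cache_detachable(past_key_values):
    # Hypothetical helper: a DynamicCache has no .detach(), but the legacy
    # tuple-of-(key, value)-tensors format can be traversed by the Trainer's
    # nested detach/gather utilities.
    if isinstance(past_key_values, DynamicCache):
        return past_key_values.to_legacy_cache()
    return past_key_values
```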
Thanks for the explanation @zucchini-nlp! Does this mean that this fix won't be needed soon, or that it enables something which isn't available yet but will be soon?
@zucchini-nlp OK. The main thing to know is what, if anything, should be updated in idefics2. Is what @gante is doing addressing this?
@amyeroberts I am not sure what the correct format of the cache objects we return for language models should be, since we currently don't have consistency, so I wanted @gante to look at it. There are two options for this: …
Also, I believe we are going to get rid of the tuple-type cache sometime in the future, so cache + Trainer is something to keep in mind for then.
@zucchini-nlp OK, great, thanks for explaining. Let's leave this as-is, and once the cache format is standardized we can propagate this to idefics2 + other models.
Hi @EloiEynard I just uploaded an example notebook for fine-tuning Idefics2 on an image -> JSON dataset here: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/Idefics2/Fine_tune_Idefics2_for_JSON_extraction_use_cases_(PyTorch_Lightning).ipynb
Thanks @NielsRogge, I got it all figured out with the Trainer and am currently fine-tuning with my custom eval. Wish I'd known about Lightning earlier though, it seems more explicit. By the way, if you don't mind me asking, I've noticed in your notebooks you use …
I had the same question; it turns out both are equivalent. The … I'm currently looking into creating a similar notebook that leverages the Trainer API with ….
I see, thanks for the details!
System Info

transformers version: 4.41.0.dev0

Who can help?
Not sure if this is an issue with the Trainer or the model.
Information

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
The following code is from the Idefics2 fine-tuning example colab, with the addition of compute_metrics in the Trainer.
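The snippet itself did not survive this copy; below is a minimal sketch of what the addition amounts to, not the exact colab code. The token-accuracy metric is a stand-in, and model, training_args, the datasets and collate_fn are assumed to come from the fine-tuning example:

```python
import numpy as np
from transformers import Trainer

def compute_metrics(eval_pred):
    # Stand-in metric: token accuracy over non-ignored positions.
    # Assumes predictions arrive as a single logits array.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    mask = labels != -100
    return {"token_accuracy": float((predictions[mask] == labels[mask]).mean())}

trainer = Trainer(
    model=model,                      # Idefics2ForConditionalGeneration from the example
    args=training_args,               # TrainingArguments from the example
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=collate_fn,
    compute_metrics=compute_metrics,  # the only addition relative to the colab
)
trainer.train()
```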
Here is the exception: AttributeError: 'DynamicCache' object has no attribute 'detach'
It seems to happen when the model output's past_key_values is an empty DynamicCache.
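Not stated verbatim in the thread, but two workaround patterns that keep the cache out of what the Trainer tries to detach, sketched on the assumption that past_key_values is the offending output key:

```python
# Option 1 (assumption): don't produce a cache at all, so the model output
# simply contains no past_key_values for the Trainer to detach.
model.config.use_cache = False

# Option 2 (assumption): tell the Trainer to drop the key when it collects
# outputs during evaluation.
metrics = trainer.evaluate(ignore_keys=["past_key_values"])
```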
Expected behavior
Training should properly reach the custom_metrics function and terminate cleanly.