Fix Seq2seqTrainer decoder attention mask #26841

Rocketknight1 · 2023-10-16T16:36:07Z

When labels is present, the Seq2SeqTrainer drops decoder_input_ids during the generation step used for metrics that expect generated text (like ROUGE). However, it doesn't drop decoder_attention_mask at the same time, which means that in some cases we pass decoder_attention_mask with no decoder_input_ids, and the model throws a shape error.

This PR fixes the issue.
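A minimal sketch of the idea, not the actual diff: the helper name `drop_decoder_inputs_for_generation` is hypothetical, plain lists stand in for tensors, and the exact condition used inside the trainer may differ. It only illustrates removing both keys together so the mask is never passed on its own.

```python
def drop_decoder_inputs_for_generation(inputs):
    """Return a copy of `inputs` for free generation.

    Hypothetical helper (not a transformers API): when `labels` is present
    alongside `decoder_input_ids`, drop `decoder_input_ids` together with
    `decoder_attention_mask` instead of dropping only the former.
    """
    if "labels" in inputs and "decoder_input_ids" in inputs:
        dropped = ("decoder_input_ids", "decoder_attention_mask")
        return {k: v for k, v in inputs.items() if k not in dropped}
    # Nothing to drop: return a shallow copy so the caller's dict is untouched.
    return dict(inputs)


# Toy batch; nested lists stand in for tensors.
batch = {
    "input_ids": [[5, 6, 7]],
    "attention_mask": [[1, 1, 1]],
    "labels": [[8, 9, 0]],
    "decoder_input_ids": [[0, 8, 9]],
    "decoder_attention_mask": [[1, 1, 1]],
}
generation_inputs = drop_decoder_inputs_for_generation(batch)
print(sorted(generation_inputs))  # ['attention_mask', 'input_ids', 'labels']
```

Filtering both keys in one pass keeps the two decoder inputs consistent: either both reach the model or neither does.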

Fixes #24567

Rocketknight1 · 2023-10-16T16:38:25Z

cc @gante and @ydshieh because I see you in the git blame near here - let me know if this is okay, or if I'm breaking anything with this fix!

HuggingFaceDocBuilderDev · 2023-10-16T16:53:11Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

ArthurZucker

Ohh super nice thanks for fixing it!

Don't drop decoder_input_ids without also dropping decoder_attention_mask

2d71e3a

…mask

Rocketknight1 requested a review from ArthurZucker October 16, 2023 16:36

ArthurZucker approved these changes Oct 17, 2023

Rocketknight1 merged commit 34678db into main Oct 18, 2023
3 checks passed

Rocketknight1 deleted the seq2seq_trainer_attention_mask_fix branch October 18, 2023 12:28

EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 19, 2023

Fix Seq2seqTrainer decoder attention mask (huggingface#26841)

58f99cf

Don't drop decoder_input_ids without also dropping decoder_attention_mask

Sai-Suraj-27 mentioned this pull request Aug 19, 2024

fix: Fixed CodeGenTokenizationTest::test_truncation failing test #32850
