Remove unnecessary _prepare_decoder_attention_mask patching #1461

Closed
fxmarty opened this issue Oct 17, 2023 · 0 comments
Labels
onnx Related to the ONNX export

Comments


fxmarty commented Oct 17, 2023

Feature request

Some patching of transformers' _prepare_decoder_attention_mask was introduced in #1257; it can be avoided simply by exporting with a dummy sequence length > 1.

See: class LlamaModelPatcher(CausalAttentionMaskModelPatcher):
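
For context, a minimal, simplified sketch of the control flow this relies on (not the verbatim transformers implementation; the helper names below are illustrative): in the affected Llama modeling code, the causal mask is only built when the sequence length seen during tracing is greater than 1, so exporting with a dummy sequence length > 1 records the mask construction without any patching.

```python
import torch

def make_causal_mask(seq_len: int, dtype=torch.float32) -> torch.Tensor:
    # Simplified stand-in for the mask helper: additive mask with a large
    # negative value above the diagonal, zeros on and below it.
    mask = torch.full((seq_len, seq_len), torch.finfo(dtype).min, dtype=dtype)
    return torch.triu(mask, diagonal=1)

def prepare_decoder_attention_mask(seq_len: int):
    # Sketch of the branch that matters for the export: the causal mask is
    # only created when the traced sequence length is greater than 1.
    combined_mask = None
    if seq_len > 1:
        combined_mask = make_causal_mask(seq_len)
    return combined_mask

# Tracing with seq_len == 1 skips the branch, so the exported graph would not
# contain the causal mask, which is the situation #1257 worked around by
# patching. Tracing with a dummy sequence length > 1 takes the branch and
# captures the mask construction, making the patch unnecessary.
print(prepare_decoder_attention_mask(1))  # None: branch not taken
print(prepare_decoder_attention_mask(4))  # 4x4 additive causal mask
```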

Motivation

Code simplification

Your contribution

/

@fxmarty fxmarty added the onnx Related to the ONNX export label Oct 17, 2023
@fxmarty fxmarty closed this as completed Jan 9, 2024