Remove unnecessary _prepare_decoder_attention_mask patching #1461

Closed
fxmarty opened this issue Oct 17, 2023 · 0 comments
Labels
onnx Related to the ONNX export

Comments


fxmarty commented Oct 17, 2023

Feature request

Some patching of transformers' _prepare_decoder_attention_mask was introduced in #1257; it can be avoided simply by exporting with a dummy sequence length > 1.

See: class LlamaModelPatcher(CausalAttentionMaskModelPatcher):
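
For context, a minimal, simplified sketch of the control flow this relies on (not the verbatim transformers implementation; the helper names below are illustrative): in the affected Llama modeling code, the causal mask is only built when the sequence length seen during tracing is greater than 1, so exporting with a dummy sequence length > 1 records the mask construction without any patching.

```python
import torch

def make_causal_mask(seq_len: int, dtype=torch.float32) -> torch.Tensor:
    # Simplified stand-in for the mask helper: additive mask with a large
    # negative value above the diagonal, zeros on and below it.
    mask = torch.full((seq_len, seq_len), torch.finfo(dtype).min, dtype=dtype)
    return torch.triu(mask, diagonal=1)

def prepare_decoder_attention_mask(seq_len: int):
    # Sketch of the branch that matters for the export: the causal mask is
    # only created when the traced sequence length is greater than 1.
    combined_mask = None
    if seq_len > 1:
        combined_mask = make_causal_mask(seq_len)
    return combined_mask

# Tracing with seq_len == 1 skips the branch, so the exported graph would not
# contain the causal mask, which is the situation #1257 worked around by
# patching. Tracing with a dummy sequence length > 1 takes the branch and
# captures the mask construction, making the patch unnecessary.
print(prepare_decoder_attention_mask(1))  # None: branch not taken
print(prepare_decoder_attention_mask(4))  # 4x4 additive causal mask
```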

Motivation

Code simplification

Your contribution

/

@fxmarty fxmarty added the onnx Related to the ONNX export label Oct 17, 2023
@fxmarty fxmarty closed this as completed Jan 9, 2024