
XLMRoberta with Flash Attention 2 #27957

Open · 2 of 4 tasks
IvanPy96 opened this issue Dec 11, 2023 · 5 comments
Labels
Feature request Request for a new feature Good Second Issue Issues that are more difficult to do than "Good First" issues - give it a try if you want!

Comments

@IvanPy96

System Info

  • transformers version: 4.36.0
  • Platform: Linux-4.19.0-22-amd64-x86_64-with-glibc2.31
  • Python version: 3.10.13
  • Huggingface_hub version: 0.19.4
  • Safetensors version: 0.4.0
  • Accelerate version: 0.24.1
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.0.1+cu117 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@ArthurZucker @younesbelkada

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("my_model/", attn_implementation="flash_attention_2")

Expected behavior

Ability to use Flash Attention 2 for inference. Is it possible to add support for Flash Attention 2 to the XLMRoberta model?
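For reference, a minimal sketch of what loading with Flash Attention 2 typically looks like once a model architecture supports it (the model path is the placeholder from the reproduction above; FA2 generally requires a CUDA device and fp16/bf16 weights):

import torch
from transformers import AutoModelForSequenceClassification

# Hypothetical usage once XLMRoberta gains FA2 support:
# FA2 kernels expect half-precision weights and a CUDA device.
model = AutoModelForSequenceClassification.from_pretrained(
    "my_model/",                              # placeholder path from the report above
    attn_implementation="flash_attention_2",
    torch_dtype=torch.float16,
).to("cuda")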

@ArthurZucker ArthurZucker added the Feature request Request for a new feature label Dec 12, 2023
@ArthurZucker
Collaborator

Thanks for opening, will mark as a good second issue 🤗

@ArthurZucker ArthurZucker added the Good Second Issue Issues that are more difficult to do than "Good First" issues - give it a try if you want! label Dec 12, 2023
@mohammedElfatihSalah

Hi @IvanPy96 & @ArthurZucker, I want to work on this issue. Could you please assign it to me?

@ArthurZucker
Collaborator

Hey, we don't assign issues; feel free to open a PR and link it to this issue 😉

@aikangjun

aikangjun commented Aug 30, 2024

Hi, it seems that this issue has not been resolved: XLMRoberta still cannot use Flash Attention 2.
[screenshot attached]

@ArthurZucker
Collaborator

Hey! Yes, as both PRs were closed; see the last comment:

@aikangjun This PR wasn't merged; it was closed because of inactivity, it seems. We've recently merged other PRs to add SDPA to RoBERTa-based models, though: #30510 adds it to this model. This isn't part of 4.42 but will be part of the next release.
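A minimal sketch of what using the SDPA path could look like once a release containing #30510 is installed (the checkpoint name is illustrative, not from this thread):

from transformers import AutoModelForSequenceClassification

# Illustrative only: requires a transformers release that includes the
# SDPA support for RoBERTa-based models added in #30510.
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base",          # any XLM-R checkpoint
    attn_implementation="sdpa",
)
print(model.config._attn_implementation)  # expected to report "sdpa"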
