
Fix: 'Gemma2Attention' object has no attribute '_flash_attn_uses_top_left_mask' #35285

Closed

Conversation

jp1924 (Contributor) commented Dec 16, 2024

What does this PR do?

This PR fixes the "'Gemma2Attention' object has no attribute '_flash_attn_uses_top_left_mask'" error that occurs in every model whose attention classes go through the flash attention code path, including Gemma2.
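For context, here is a minimal sketch of how this flag is commonly initialized in transformers attention classes that support FlashAttention-2. The class name is a hypothetical stand-in, and whether this matches the exact change in this PR is an assumption; the error in the title comes from code reading the flag on an attention instance that never set it.

from transformers.utils import is_flash_attn_greater_or_equal_2_10


class Gemma2FlashAttention2Sketch:
    """Hypothetical stand-in for an attention class that sets the flag."""

    def __init__(self) -> None:
        # flash_attn < 2.1 generates a top-left aligned causal mask, while
        # bottom-right alignment is needed, hence this compatibility flag.
        self._flash_attn_uses_top_left_mask = not is_flash_attn_greater_or_equal_2_10()


if __name__ == "__main__":
    print(Gemma2FlashAttention2Sketch()._flash_attn_uses_top_left_mask)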

Reproduction Code

from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer


def main() -> None:
    repo_id = "google/gemma-2-2b-it"
    # Request FlashAttention-2 through the config rather than via from_pretrained.
    config = AutoConfig.from_pretrained(repo_id, _attn_implementation="flash_attention_2")
    model = AutoModelForCausalLM.from_pretrained(repo_id, config=config, device_map="cpu")
    tokenizer = AutoTokenizer.from_pretrained(repo_id)

    text = tokenizer.apply_chat_template([{"role": "user", "content": "Hello!"}], tokenize=False)
    input_param = tokenizer(text, return_tensors="pt", return_attention_mask=True)
    input_param["labels"] = input_param["input_ids"].clone()
    # The forward pass raises:
    # AttributeError: 'Gemma2Attention' object has no attribute '_flash_attn_uses_top_left_mask'
    output = model(**input_param)


if __name__ == "__main__":
    main()

pip install git+https://github.com/huggingface/transformers.git@5615a393691c81e00251e420c73e4d04c6fe22e5
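As a hypothetical stopgap (not this PR's fix), the missing flag can be backfilled on the model loaded by the reproduction script above before calling forward. Defaulting the flag to False is an assumption for illustration only.

# Assumes `model` from the reproduction script above.
for module in model.modules():
    is_attention = "Attention" in type(module).__name__
    if is_attention and not hasattr(module, "_flash_attn_uses_top_left_mask"):
        module._flash_attn_uses_top_left_mask = False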

Env

- `transformers` version: 4.48.0.dev0
- Platform: Linux-5.15.0-124-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.26.2
- Safetensors version: 0.4.5
- Accelerate version: 1.1.1

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@ArthurZucker

jp1924 (Contributor, Author) commented Dec 16, 2024

@ArthurZucker
the attention implementation setting ends up included in the config. Is this working as you intended?
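A quick way to check this (a sketch assuming the standard PretrainedConfig attributes; the printed values are not asserted here):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("google/gemma-2-2b-it", _attn_implementation="flash_attention_2")
print(config._attn_implementation)                 # expected: "flash_attention_2"
print("_attn_implementation" in config.to_dict())  # whether it gets serialized into the config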

jp1924 (Contributor, Author) commented Dec 18, 2024

This pull request is being closed because the issue has been resolved in pull request #35235.

jp1924 closed this on Dec 18, 2024
jp1924 deleted the fix_config_flash-attn_attr_error branch on Dec 18, 2024
ArthurZucker (Collaborator) commented

Thanks! And sorry for the delay! 🤗
