
Add sdpa for Beit #34941

Merged: 10 commits into huggingface:main from beit_sdpa, Dec 17, 2024

Conversation

OmarManzoor
Contributor

What does this PR do?

Related to #28005

  • Adds sdpa for Beit model
  • Also extends the support of sdpa to Data2VecVision model to ensure consistency
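The PR's actual changes are in `modeling_beit.py` and are not reproduced here, but as a minimal sketch of what "adding sdpa" to a model like Beit amounts to, the explicit eager attention is swapped for PyTorch's fused `scaled_dot_product_attention`. The function names below are illustrative, not the PR's code:

```python
# Illustrative only: eager attention vs. torch.nn.functional.scaled_dot_product_attention.
import math

import torch
import torch.nn.functional as F


def eager_attention(q, k, v):
    # Classic softmax(QK^T / sqrt(d)) V, computed explicitly.
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(q.size(-1))
    probs = scores.softmax(dim=-1)
    return torch.matmul(probs, v)


def sdpa_attention(q, k, v):
    # Fused kernel; PyTorch picks a flash / memory-efficient / math backend.
    return F.scaled_dot_product_attention(q, k, v)


q = torch.randn(2, 4, 16, 8)  # (batch, heads, seq_len, head_dim)
k = torch.randn(2, 4, 16, 8)
v = torch.randn(2, 4, 16, 8)

# The two paths agree numerically (no mask, no dropout).
assert torch.allclose(eager_attention(q, k, v), sdpa_attention(q, k, v), atol=1e-5)
```

The numerical outputs match; the benefit of the SDPA path is speed and memory, which is what the benchmarks below measure.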

Who can review?

@qubvel

Member

@qubvel qubvel left a comment


Glad to see you in another sdpa-PR @OmarManzoor! Thanks for working on the implementation 👍 Overall looking great, just a few notes:

docs/source/en/model_doc/beit.md (resolved)
src/transformers/models/beit/modeling_beit.py (resolved)
tests/models/beit/test_modeling_beit.py (resolved)
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@qubvel qubvel left a comment


Hey, I'm getting the following values for the benchmark:

BeitForImageClassification

| Image batch size | Eager (s/iter) | Eager CI, % | SDPA (s/iter) | SDPA CI, % | SDPA speedup |
|---|---|---|---|---|---|
| 1 | 0.013 | ±0.2% | 0.011 | ±0.5% | 1.122 |
| 4 | 0.013 | ±0.2% | 0.011 | ±0.1% | 1.122 |
| 16 | 0.026 | ±0.1% | 0.021 | ±0.5% | 1.232 |
| 32 | 0.048 | ±0.8% | 0.039 | ±0.3% | 1.252 |

with the following env:

Python version: 3.10.12 (main, Nov  6 2024, 20:22:13) [GCC 11.4.0]
Transformers version: 4.47.0.dev0
Torch version: 2.5.0+cu118
GPU: NVIDIA A10G

Here is the code; can you run it with your env to check? Also, my code might not be the best way to benchmark, so I'd appreciate it if you shared your code!
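Neither benchmark script is reproduced in this thread. As a rough illustration of the kind of s/iter measurement being compared, a timing harness might look like the following; the `benchmark` helper and the `Linear` stand-in model are assumptions for the sketch, not the script actually used:

```python
# Hypothetical timing harness: warm up, then average wall time per iteration.
import time

import torch


def benchmark(fn, n_warmup=3, n_iters=10):
    for _ in range(n_warmup):
        fn()
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # make sure queued CUDA work is done
    start = time.perf_counter()
    for _ in range(n_iters):
        fn()
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_iters


model = torch.nn.Linear(64, 64)  # stand-in for BeitForImageClassification
x = torch.randn(32, 64)

with torch.no_grad():
    s_per_iter = benchmark(lambda: model(x))
print(f"{s_per_iter:.6f} s/iter")
```

On a GPU, the CUDA synchronizations matter: without them, `time.perf_counter()` measures kernel launch time rather than execution time.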

@OmarManzoor
Contributor Author

> Here is the code, can you run it with your env to check? Also my code might be not the best way to benchmark, would appreciate if you share your code!

Your script may well be better than what I was using, but here is the script I used for inference:
inference_image_benchmark
I basically just modified the script we have been using for the text models.

@OmarManzoor
Contributor Author

Here are the results using your script:

BeitForImageClassification

| Image batch size | Eager (s/iter) | Eager CI, % | SDPA (s/iter) | SDPA CI, % | SDPA speedup |
|---|---|---|---|---|---|
| 1 | 0.012 | ±0.5% | 0.011 | ±0.3% | 1.135 |
| 4 | 0.013 | ±0.2% | 0.011 | ±0.2% | 1.181 |
| 16 | 0.045 | ±0.1% | 0.035 | ±0.1% | 1.300 |
| 32 | 0.088 | ±0.1% | 0.067 | ±0.1% | 1.322 |

Environment:

Python version: 3.10.14 (main, Jul 23 2024, 15:53:02) [GCC 9.4.0]
Transformers version: 4.47.0.dev0
Torch version: 2.5.1+cu124
GPU: NVIDIA GeForce RTX 2060 SUPER

@qubvel
Member

qubvel commented Nov 27, 2024

Ok, looks good. Can you update the documentation benchmarks then? Maybe it's worth adding memory stats to my script and rerunning it.

@OmarManzoor
Contributor Author

OmarManzoor commented Nov 27, 2024

> Ok, looks good, can you update documentation benchmarks then? Maybe it's worth adding memory stats to my script and rerunning it

Okay, I think we can use your inference benchmarks, but for the training part can we keep the current benchmarks? Also, did you have a look at the script I shared? Is there something not quite correct in it?
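The suggested "memory stats" addition could be sketched as follows, using PyTorch's peak-allocation counters around the measured call. This is a hypothetical helper, not part of either script in the thread:

```python
# Hypothetical memory-tracking helper: peak CUDA allocation for one call.
import torch


def peak_memory_mb(fn):
    if not torch.cuda.is_available():
        fn()
        return None  # the CUDA allocator stats don't exist on CPU-only runs
    torch.cuda.reset_peak_memory_stats()
    fn()
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / 1024**2


mem = peak_memory_mb(lambda: torch.randn(8, 3, 224, 224).sum())
if mem is None:
    print("CPU run: no CUDA memory stats")
else:
    print(f"peak memory: {mem:.1f} MiB")
```

Resetting the peak counter before each measured call is the important part; otherwise the reported peak carries over from earlier iterations.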

@OmarManzoor
Contributor Author

@qubvel Do the slow tests still need to be run?

Member

@qubvel qubvel left a comment


Thanks for the ping! Looks great to me, just one comment:

src/transformers/models/beit/modeling_beit.py (resolved)
@qubvel qubvel requested a review from ArthurZucker December 2, 2024 11:19
@qubvel
Member

qubvel commented Dec 2, 2024

@ArthurZucker please review when you have bandwidth!

tests/models/beit/test_modeling_beit.py (outdated, resolved)
tests/models/beit/test_modeling_beit.py (outdated, resolved)
tests/models/data2vec/test_modeling_data2vec_vision.py (outdated, resolved)
tests/models/data2vec/test_modeling_data2vec_vision.py (outdated, resolved)
Contributor

@dvrogozh dvrogozh left a comment


@OmarManzoor: thank you for the rebase and for adding the thresholds path for xpu. I verified it on my side against upstream PyTorch xpu; both added tests work for me.

@OmarManzoor
Contributor Author

@dvrogozh Thank you for verifying!

@OmarManzoor
Contributor Author

@ArthurZucker Can this be merged?

Collaborator

@ArthurZucker ArthurZucker left a comment


Hey all! Sorry for the delay, we are in the middle of a huge refactoring in #35235, which is why I wanted to wait a bit, but good work should be rewarded, so let's merge this! Thanks for being patient 🤗

@ArthurZucker ArthurZucker merged commit 747f361 into huggingface:main Dec 17, 2024
13 checks passed
@OmarManzoor OmarManzoor deleted the beit_sdpa branch December 17, 2024 13:47
@OmarManzoor
Contributor Author

> Hey all! Sorry for the delay, we are in the middle of a huge refactoring in #35235, which is why I wanted to wait a bit, but good work should be rewarded, so let's merge this! Thanks for being patient 🤗

Thank you!
