ci: mark model_parallel tests as cuda specific #35269
Merged
+2 −0
`@require_torch_multi_gpu` on the next line was previously made non-CUDA specific by 11c27dd. Right now it sets a requirement for having any multiple GPUs on the system. I wonder, maybe it makes sense to rename `@require_torch_gpu` to `@require_torch_cuda` to avoid the naming collision? I can follow up on that in a separate PR.
cc @ydshieh on this! 🤗
Hi @dvrogozh, I am not sure I follow with "was previously made non-CUDA specific by 11c27dd". Is the commit link the correct one?
Yes, that's the correct commit and link. Unfortunately it's a huge commit with multiple changes inside, and one of them was to make `@require_torch_multi_gpu` non-CUDA specific. See the change in `src/transformers/testing_utils.py`, line 743 in the old code, line 788 in the new code. The commit changed `torch.cuda.device_count()` to `device_count`, and the latter is calculated via a newly added function (which works for CUDA and XPU at the moment):

transformers/src/transformers/testing_utils.py, Line 238 in 18e896b
Currently `@require_torch_multi_gpu` looks like this and is not CUDA specific (after 11c27dd):

transformers/src/transformers/testing_utils.py, Lines 769 to 781 in 18e896b
Note that `@require_torch_gpu` is still CUDA-specific:

transformers/src/transformers/testing_utils.py, Lines 984 to 986 in 18e896b
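To make the divergence concrete, here is a minimal, hypothetical sketch of the two decorator styles. The names mirror the ones discussed above, but the bodies are illustrative rather than the actual `testing_utils` implementation, and `backend_device_count` is an assumed stand-in for the helper added in 11c27dd:

```python
import unittest


def backend_device_count():
    # Hypothetical stand-in for the helper added in 11c27dd: counts
    # devices for whichever torch backend is available (CUDA or XPU),
    # so the check is not CUDA-specific. Returns 0 if torch is absent.
    try:
        import torch
    except ImportError:
        return 0
    if torch.cuda.is_available():
        return torch.cuda.device_count()
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.xpu.device_count()
    return 0


def require_torch_multi_gpu(test_case):
    # Backend-agnostic after 11c27dd: any multi-device setup qualifies,
    # whether the devices are CUDA GPUs or XPUs.
    return unittest.skipUnless(
        backend_device_count() > 1, "test requires multiple GPUs"
    )(test_case)


def require_torch_gpu(test_case):
    # Still CUDA-specific: a machine with only XPUs would skip this test,
    # which is the naming confusion being discussed.
    def cuda_available():
        try:
            import torch
        except ImportError:
            return False
        return torch.cuda.is_available()

    return unittest.skipUnless(cuda_available(), "test requires CUDA")(test_case)
```

On an XPU-only multi-device machine, a test decorated with the first sketch would run while one decorated with the second would be skipped, despite the similar names.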
The point I am trying to make is that `@require_torch_multi_gpu` and `@require_torch_gpu` are named similarly, but diverged (after 11c27dd) in what they actually signify, which leads to confusion. My proposal is to rename `@require_torch_gpu` to `@require_torch_cuda`.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a series of work regarding device-agnostic testing, and `require_torch_accelerator`, `require_torch_multi_accelerator`, etc. were added for that purpose. The `require_torch_xxx_gpu` decorators should not be modified IMO, but I guess @Titus-von-Koelle wasn't aware of the existing methods when #31098 was added.
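For reference, the device-agnostic decorators mentioned above follow the same skip pattern but key off any supported accelerator rather than CUDA alone. The following is a hypothetical sketch under that assumption (names modeled on `require_torch_accelerator`; not the actual implementation):

```python
import unittest


def accelerator_available():
    # Hypothetical check: True if any supported torch accelerator
    # backend (CUDA, XPU, MPS, ...) is usable on this machine.
    # Returns False if torch is not installed.
    try:
        import torch
    except ImportError:
        return False
    if torch.cuda.is_available():
        return True
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return True
    if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
        return True
    return False


def require_torch_accelerator(test_case):
    # Device-agnostic counterpart of require_torch_gpu: the test runs
    # on any accelerator instead of requiring CUDA specifically.
    return unittest.skipUnless(
        accelerator_available(), "test requires an accelerator"
    )(test_case)
```

Under this scheme, CUDA-only tests would use an explicitly CUDA-named decorator, while hardware-portable tests use the accelerator-named one, which is the separation the thread is converging on.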