Fall back to slow image processor in ImageProcessingAuto when no fast processor available #34785
Conversation
Force-pushed from d80d899 to 0a130f1
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@@ -246,6 +246,7 @@ def preprocess(
     image_mean = image_mean if image_mean is not None else self.image_mean
     image_std = image_std if image_std is not None else self.image_std
     size = size if size is not None else self.size
+    return_tensors = "pt" if return_tensors is None else return_tensors
Otherwise, the fast ViT image processor will crash in the default behavior (when `return_tensors` is not specified). This is now a bigger problem with fast image processors used by default.
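For context, a minimal repro sketch (the checkpoint name is illustrative, and torchvision is assumed installed so the fast processor is selected):

from transformers import AutoImageProcessor
from PIL import Image
import numpy as np

processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224", use_fast=True)
image = Image.fromarray(np.zeros((224, 224, 3), dtype=np.uint8))

# Before this change, omitting return_tensors crashed the fast processor;
# with the "pt" default it now returns torch tensors out of the box.
outputs = processor(images=image)
print(outputs["pixel_values"].shape)  # torch.Size([1, 3, 224, 224])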
Is this a kind of breaking change?
(I am also curious: what if a user is using a TF/Flax model while their environment has torch/torchvision installed? Will the fast image processor be used by default and return torch tensors by default?)
Oh I see, yes this might be a problem. In general, I'm not too sure why it was decided to constrain fast image processors to output only torch tensors. I'd be glad to know if there was a reason for that; otherwise it might be something we would want to reconsider.
TBH I am not even sure TF is used for image models! Fine by me!
Hi @yonigozlan! Thanks for opening a PR! Please check this comment regarding enabling fast image processors by default.
Oh, I hadn't seen this, thanks for pointing it out @qubvel! I guess it comes down to prioritizing backward compatibility or inference speed. Maybe we could add a warning when a fast processor is used by default. This way, users are aware of the potential difference but also understand the benefit, and they can still choose to use the slow processor if needed.
Thanks!
Have some comments as a first review.
@@ -863,6 +863,7 @@ def preprocess(
     input_data_format = infer_channel_dimension_format(images[0])
     if input_data_format == ChannelDimension.LAST:
         images = [image.permute(2, 0, 1).contiguous() for image in images]
+        input_data_format = ChannelDimension.FIRST
I guess this is a bug and you are fixing it here, right?
Yes it was, sorry for the lack of explanation. It's actually now fixed by another PR.
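For intuition, a standalone shape sketch of the bug being fixed (illustrative code, not the processor's actual implementation):

import torch

image = torch.rand(224, 224, 3)              # channels-last input: (H, W, C)
image = image.permute(2, 0, 1).contiguous()  # now channels-first: (C, H, W)

# Without the added line, input_data_format would still say ChannelDimension.LAST,
# so downstream resizing/normalization would read the (C, H, W) tensor as if it
# were (H, W, C) and operate on the wrong axes.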
Fast image processing is indeed a compelling feature that could benefit many users. It looks similar to setting the default to SDPA over eager attention. While I haven’t noticed complaints about making SDPA the default, there were issues with CI breaking in different projects after SDPA was added to popular models like CLIP and Siglip.
For users experiencing slow processing speeds, exploring the option to set `use_fast=True` might be worthwhile. However, for those not facing performance challenges, it may not be necessary.
I don’t have strong objections to this feature, but I suggest the following approach to implement it effectively:
- Add tests for thorough comparison between Slow and Fast image processors to ensure nothing breaks, especially for custom processors with non-default parameters. Tests should cover all Slow and Fast image processors (maybe as common tests?).
- Inform users well ahead of time about the upcoming behavior change. For instance, announce that Fast image processors will become the default in 3–5 releases. Users who wish to retain the current behavior should explicitly set `use_fast=False` in their code (a sketch of this follows below).
This gradual rollout can minimize potential issues while allowing users time to adapt. What do you think?
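A minimal sketch of the second point; the warning wording, the FutureWarning category, and the helper name resolve_use_fast are assumptions for illustration, not the PR's actual code:

import warnings

def resolve_use_fast(use_fast):
    # Hypothetical transition logic: warn only when the user did not choose explicitly.
    if use_fast is None:
        warnings.warn(
            "Fast image processors will become the default in an upcoming release. "
            "Pass `use_fast=False` explicitly to keep the current behavior.",
            FutureWarning,
        )
        use_fast = False  # keep the slow processor during the rollout window
    return use_fast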
@qubvel Agreed it might be better to do a progressive rollout and add more base tests in the meantime. I can change this PR to do that and also keep some of the refactoring of `image_processing_auto`.
Considering the usage of TF/Flax, I am not sure it is worth the effort (if it will require more effort than expected). You can talk to core maintainers about this part.
@yonigozlan sounds great!
I think we should start rolling this out (the warnings) and reduce to 2 releases, for example (2 months), giving us time to test! Absolutely no need to work on TF / Jax versions for now either! ➕
Force-pushed from c1283fc to 95a508b
@ArthurZucker Changed the scope of this PR to fix the error when `use_fast` is set to `True` and no fast image processor is available. Also added warnings to start rolling out fast image processors by default.
Force-pushed from de5613b to f107a66
Cool, yep let's make sure 4.48 has fast by default!
if use_fast:
    if not image_processor_type.endswith("Fast"):
        image_processor_type += "Fast"
    image_processor_class = get_image_processor_class_from_name(image_processor_type)
    if image_processor_class is None:
        logger.warning_once(
            "`use_fast` is set to `True` but the image processor class does not have a fast version. "
            " Falling back to the slow version."
        )
        image_processor_class = get_image_processor_class_from_name(image_processor_type[:-4])
else:
    image_processor_type = (
        image_processor_type[:-4] if image_processor_type.endswith("Fast") else image_processor_type
    )
    image_processor_class = get_image_processor_class_from_name(image_processor_type)
This can be simplified a bit: first check whether `image_processor_type + "Fast"` is in the mapping; if yes we take it, if not we don't. That way we only need to call `get_image_processor_class_from_name` once.
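A hedged sketch of that simplification; `known_image_processors` stands in for the real name-to-class mapping, which is an assumption here, and `logger` / `get_image_processor_class_from_name` come from the surrounding module as in the diff above:

# Resolve the fast name first, fall back to slow if it is unknown.
fast_type = image_processor_type if image_processor_type.endswith("Fast") else image_processor_type + "Fast"

if use_fast and fast_type not in known_image_processors:
    logger.warning_once(
        "`use_fast` is set to `True` but the image processor class does not have a fast version. "
        "Falling back to the slow version."
    )
    use_fast = False

# Single lookup, whichever branch we ended up in:
image_processor_type = fast_type if use_fast else fast_type[: -len("Fast")]
image_processor_class = get_image_processor_class_from_name(image_processor_type)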
Force-pushed from 95d039a to 7c066f2
What does this PR do?
Refactor parts of `image_processing_auto` to fall back on the slow processor when `use_fast` is set to `True` and no fast processor is available.
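For illustration, a hedged usage sketch of the new fallback path (the assumption that this particular checkpoint lacks a fast processor variant is mine, not the PR's):

from transformers import AutoImageProcessor

# Previously this raised an error; with this PR it logs the fallback warning
# and returns the slow image processor instead.
processor = AutoImageProcessor.from_pretrained("microsoft/beit-base-patch16-224", use_fast=True)
print(type(processor).__name__)  # e.g. BeitImageProcessor (slow)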
Before, this would throw an error:
Now, the following warning is displayed:
Also add warnings to start rolling out fast image processors by default (goal is v4.48). If `use_fast` is not set and the checkpoint was saved with a slow processor, display:
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.