Fast image processor #28847

Merged
merged 40 commits into from
Jun 11, 2024
Commits
68a1e75
Draft fast image processors
amyeroberts Jan 9, 2024
dbf9959
Draft working fast version
amyeroberts Feb 2, 2024
3632cf7
py3.8 compatible cache
amyeroberts Feb 13, 2024
49196c8
Enable loading fast image processors through auto
amyeroberts May 10, 2024
6fc2901
Tidy up; rescale behaviour based on input type
amyeroberts May 10, 2024
834ae6a
Enable tests for fast image processors
amyeroberts May 10, 2024
6d5c328
Smarter rescaling
amyeroberts May 13, 2024
eb701c1
Don't default to Fast
amyeroberts May 16, 2024
415be88
Safer imports
amyeroberts May 16, 2024
fb89515
Add necessary Pillow requirement
amyeroberts May 16, 2024
d2eb99f
Woops
amyeroberts May 16, 2024
fc1530e
Add AutoImageProcessor test
amyeroberts May 16, 2024
a3f6d02
Fix up
amyeroberts May 16, 2024
e0bd18d
Fix test for imagegpt
amyeroberts May 16, 2024
8c4761a
Fix test
amyeroberts May 17, 2024
687da88
Review comments
amyeroberts May 22, 2024
fc1e121
Add warning for TF and JAX input types
amyeroberts May 22, 2024
1077938
Rearrange
amyeroberts May 22, 2024
5cb11df
Return transforms
amyeroberts May 22, 2024
fff70c3
NumpyToTensor transformation
amyeroberts May 22, 2024
8b09622
Rebase - include changes from upstream in ImageProcessingMixin
amyeroberts May 22, 2024
8d82609
Safe typing
amyeroberts May 22, 2024
849e27b
Fix up
amyeroberts May 22, 2024
fdd4e5d
convert mean/std to tensor to rescale
amyeroberts May 22, 2024
0ad7e71
Don't store transforms in state
amyeroberts May 24, 2024
1b5885b
Fix up
amyeroberts May 24, 2024
e29150c
Update src/transformers/image_processing_utils_fast.py
amyeroberts Jun 5, 2024
a1f718b
Update src/transformers/models/auto/image_processing_auto.py
amyeroberts Jun 5, 2024
af52ee2
Update src/transformers/models/auto/image_processing_auto.py
amyeroberts Jun 5, 2024
34b8859
Update src/transformers/models/auto/image_processing_auto.py
amyeroberts Jun 5, 2024
5e7a30d
Warn if fast image processor available
amyeroberts Jun 5, 2024
2d75607
Update src/transformers/models/vit/image_processing_vit_fast.py
amyeroberts Jun 5, 2024
a43cabc
Transpose incoming numpy images to be in CHW format
amyeroberts Jun 5, 2024
6acf27f
Update mapping names based on packages, auto set fast to None
amyeroberts Jun 5, 2024
a38d3ee
Fix up
amyeroberts Jun 5, 2024
942286f
Fix
amyeroberts Jun 5, 2024
954ee20
Add AutoImageProcessor.from_pretrained(checkpoint, use_fast=True) test
amyeroberts Jun 5, 2024
ee06a6a
Update src/transformers/models/vit/image_processing_vit_fast.py
amyeroberts Jun 6, 2024
1d1d416
Add equivalence and speed tests
amyeroberts Jun 7, 2024
d598b5a
Fix up
amyeroberts Jun 7, 2024
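
Several commits above touch input layout and rescaling ("Transpose incoming numpy images to be in CHW format", "Smarter rescaling", "convert mean/std to tensor to rescale"). The following is only a rough sketch of the kind of arithmetic those messages suggest, not the code from this PR: it assumes that rescaling by a factor `scale` can be folded into normalization by dividing `mean` and `std` by `scale`, so that one pass over the pixels is enough. All function names are hypothetical, and nested lists stand in for arrays to keep the sketch dependency-free.

```python
from typing import List, Sequence

# Nested lists stand in for an image array (an assumption for illustration).
Image = List[List[List[float]]]


def hwc_to_chw(image: Image) -> Image:
    """Reorder a height x width x channel image to channel x height x width."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [
        [[image[y][x][c] for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


def rescale_then_normalize(
    image_chw: Image, scale: float, mean: Sequence[float], std: Sequence[float]
) -> Image:
    """Reference path: rescale every pixel, then normalize per channel."""
    return [
        [[(value * scale - mean[c]) / std[c] for value in row] for row in plane]
        for c, plane in enumerate(image_chw)
    ]


def fused_normalize(
    image_chw: Image, scale: float, mean: Sequence[float], std: Sequence[float]
) -> Image:
    """Single pass: (v * s - m) / d equals (v - m / s) / (d / s)."""
    fused_mean = [m / scale for m in mean]
    fused_std = [d / scale for d in std]
    return [
        [[(value - fused_mean[c]) / fused_std[c] for value in row] for row in plane]
        for c, plane in enumerate(image_chw)
    ]
```

Both paths should agree up to floating-point rounding; the fused form simply avoids a separate multiplication over the whole image.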
5 changes: 5 additions & 0 deletions docs/source/en/main_classes/image_processor.md
@@ -32,3 +32,8 @@ An image processor is in charge of preparing input features for vision models an
 ## BaseImageProcessor

 [[autodoc]] image_processing_utils.BaseImageProcessor
+
+
+## BaseImageProcessorFast
+
+[[autodoc]] image_processing_utils_fast.BaseImageProcessorFast
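
For readers who want to try the fast processor, the commit list mentions `AutoImageProcessor.from_pretrained(checkpoint, use_fast=True)`. A minimal sketch of opting in is shown here; the helper name `load_image_processor` is hypothetical, and it assumes `transformers` and `torchvision` are installed at call time.

```python
from typing import Any


def load_image_processor(checkpoint: str, use_fast: bool = True) -> Any:
    """Load an image processor, opting in to the fast variant when requested."""
    # Imported lazily so the helper can be defined without transformers installed.
    from transformers import AutoImageProcessor

    # `use_fast=True` is the flag named in this PR's commit messages.
    return AutoImageProcessor.from_pretrained(checkpoint, use_fast=use_fast)
```

Calling it with a ViT checkpoint (for example `google/vit-base-patch16-224`, chosen here only as an illustration) should return a `ViTImageProcessorFast` if the fast class is available in the installed version; this needs network access or a local cache.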
7 changes: 6 additions & 1 deletion docs/source/en/model_doc/vit.md
@@ -62,7 +62,7 @@ Following the original Vision Transformer, some follow-up works have been made:
 This model was contributed by [nielsr](https://huggingface.co/nielsr). The original code (written in JAX) can be
 found [here](https://github.com/google-research/vision_transformer).

-Note that we converted the weights from Ross Wightman's [timm library](https://github.com/rwightman/pytorch-image-models),
+Note that we converted the weights from Ross Wightman's [timm library](https://github.com/rwightman/pytorch-image-models),
 who already converted the weights from JAX to PyTorch. Credits go to him!

 ## Usage tips
@@ -158,6 +158,11 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
 [[autodoc]] ViTImageProcessor
     - preprocess

+## ViTImageProcessorFast
+
+[[autodoc]] ViTImageProcessorFast
+    - preprocess
+
 <frameworkcontent>
 <pt>

1 change: 1 addition & 0 deletions examples/pytorch/_tests_requirements.txt
@@ -29,3 +29,4 @@ timm
 albumentations >= 1.4.5
 torchmetrics
 pycocotools
+Pillow>=10.0.1,<=15.0
Member

Just curious, why 15.0 here? :)

Collaborator Author

It's a security thing :) It matches the pin we have in setup.py which was set in #27409

27 changes: 25 additions & 2 deletions src/transformers/__init__.py
@@ -1104,7 +1104,8 @@
         name for name in dir(dummy_vision_objects) if not name.startswith("_")
     ]
 else:
-    _import_structure["image_processing_utils"] = ["ImageProcessingMixin"]
+    _import_structure["image_processing_base"] = ["ImageProcessingMixin"]
+    _import_structure["image_processing_utils"] = ["BaseImageProcessor"]
     _import_structure["image_utils"] = ["ImageFeatureExtractionMixin"]
     _import_structure["models.beit"].extend(["BeitFeatureExtractor", "BeitImageProcessor"])
     _import_structure["models.bit"].extend(["BitImageProcessor"])
@@ -1167,6 +1168,18 @@
     _import_structure["models.vivit"].append("VivitImageProcessor")
     _import_structure["models.yolos"].extend(["YolosFeatureExtractor", "YolosImageProcessor"])

+try:
+    if not is_torchvision_available():
+        raise OptionalDependencyNotAvailable()
+except OptionalDependencyNotAvailable:
+    from .utils import dummy_torchvision_objects
+
+    _import_structure["utils.dummy_torchvision_objects"] = [
+        name for name in dir(dummy_torchvision_objects) if not name.startswith("_")
+    ]
+else:
+    _import_structure["image_processing_utils_fast"] = ["BaseImageProcessorFast"]
+    _import_structure["models.vit"].append("ViTImageProcessorFast")

 # PyTorch-backed objects
 try:
@@ -5703,7 +5716,8 @@
     except OptionalDependencyNotAvailable:
         from .utils.dummy_vision_objects import *
     else:
-        from .image_processing_utils import ImageProcessingMixin
+        from .image_processing_base import ImageProcessingMixin
+        from .image_processing_utils import BaseImageProcessor
         from .image_utils import ImageFeatureExtractionMixin
         from .models.beit import BeitFeatureExtractor, BeitImageProcessor
         from .models.bit import BitImageProcessor
@@ -5793,6 +5807,15 @@
         from .models.vivit import VivitImageProcessor
         from .models.yolos import YolosFeatureExtractor, YolosImageProcessor

+    try:
+        if not is_torchvision_available():
+            raise OptionalDependencyNotAvailable()
+    except OptionalDependencyNotAvailable:
+        from .utils.dummy_torchvision_objects import *
+    else:
+        from .image_processing_utils_fast import BaseImageProcessorFast
+        from .models.vit import ViTImageProcessorFast
+
     # Modeling
     try:
         if not is_torch_available():
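
The hunks above gate the fast image processors behind an optional torchvision dependency: when the backend is missing, dummy objects are registered instead of the real classes. As a rough, self-contained sketch of that general pattern (not the library's actual helpers), the following uses only the standard library. The names `is_package_available`, `MissingBackend`, and `build_import_structure` are hypothetical and exist only for illustration.

```python
import importlib.util
from typing import Any, Dict, List


def is_package_available(name: str) -> bool:
    """Return True if the top-level package `name` is importable, without importing it."""
    return importlib.util.find_spec(name) is not None


class MissingBackend:
    """Hypothetical placeholder that fails on use rather than on import."""

    def __init__(self, object_name: str, backend: str) -> None:
        self._object_name = object_name
        self._backend = backend

    def __call__(self, *args: Any, **kwargs: Any) -> None:
        raise ImportError(
            f"{self._object_name} requires the `{self._backend}` package."
        )


def build_import_structure(backend: str) -> Dict[str, List[str]]:
    """Map submodule names to exported names, depending on backend availability."""
    structure = {"image_processing_utils": ["BaseImageProcessor"]}
    if is_package_available(backend):
        # Backend present: expose the real fast processor.
        structure["image_processing_utils_fast"] = ["BaseImageProcessorFast"]
    else:
        # Backend absent: expose a same-named dummy so the import still succeeds.
        structure["dummy_objects"] = ["BaseImageProcessorFast"]
    return structure
```

The point of the dummy branch is that importing the top-level package never fails; the user only sees an error, naming the missing backend, when they try to use the gated class.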