Fast image processor (#28847)
* Draft fast image processors

* Draft working fast version

* py3.8 compatible cache

* Enable loading fast image processors through auto

* Tidy up; rescale behaviour based on input type

* Enable tests for fast image processors

* Smarter rescaling

* Don't default to Fast

* Safer imports

* Add necessary Pillow requirement

* Woops

* Add AutoImageProcessor test

* Fix up

* Fix test for imagegpt

* Fix test

* Review comments

* Add warning for TF and JAX input types

* Rearrange

* Return transforms

* NumpyToTensor transformation

* Rebase - include changes from upstream in ImageProcessingMixin

* Safe typing

* Fix up

* convert mean/std to tensor to rescale

* Don't store transforms in state

* Fix up

* Update src/transformers/image_processing_utils_fast.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/auto/image_processing_auto.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/auto/image_processing_auto.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/auto/image_processing_auto.py

Co-authored-by: Arthur <[email protected]>

* Warn if fast image processor available

* Update src/transformers/models/vit/image_processing_vit_fast.py

* Transpose incoming numpy images to be in CHW format

* Update mapping names based on packages, auto set fast to None

* Fix up

* Fix

* Add AutoImageProcessor.from_pretrained(checkpoint, use_fast=True) test

* Update src/transformers/models/vit/image_processing_vit_fast.py

Co-authored-by: Pavel Iakubovskii <[email protected]>

* Add equivalence and speed tests

* Fix up

---------

Co-authored-by: Arthur <[email protected]>
Co-authored-by: Pavel Iakubovskii <[email protected]>
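
The "py3.8 compatible cache" and "Don't store transforms in state" items suggest that the composed transforms are memoized rather than kept on the instance. The diff for that file is not shown here, so the following is only a minimal sketch of the idea under stated assumptions: `get_transforms`, its parameters, and the tuple-based pipeline description are hypothetical stand-ins, not the library's API. It relies on `functools.lru_cache`, which works on Python 3.8, whereas `functools.cache` was only added in Python 3.9.

```python
import functools

# Records every real (uncached) build so the caching effect is observable.
BUILD_CALLS = []


@functools.lru_cache(maxsize=1)  # available on Python 3.8; functools.cache needs 3.9+
def get_transforms(size, do_rescale, image_mean, image_std):
    """Hypothetical builder: returns a description of a preprocessing pipeline.

    All arguments must be hashable (tuples, not lists) so that lru_cache
    can use them as a cache key.
    """
    BUILD_CALLS.append((size, do_rescale))
    steps = [("resize", size)]
    if do_rescale:
        steps.append(("rescale", 1 / 255))
    steps.append(("normalize", image_mean, image_std))
    return tuple(steps)


mean = (0.5, 0.5, 0.5)
std = (0.5, 0.5, 0.5)

first = get_transforms(224, True, mean, std)
second = get_transforms(224, True, mean, std)  # same arguments: served from the cache
```

Because the cache is keyed on the arguments, nothing has to be stored on the processor object itself; calling the function again with different settings simply rebuilds the pipeline.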
3 people authored Jun 11, 2024
1 parent edc1dff commit f53fe35
Showing 64 changed files with 1,643 additions and 811 deletions.
5 changes: 5 additions & 0 deletions docs/source/en/main_classes/image_processor.md
@@ -32,3 +32,8 @@ An image processor is in charge of preparing input features for vision models an
 ## BaseImageProcessor

 [[autodoc]] image_processing_utils.BaseImageProcessor
+
+
+## BaseImageProcessorFast
+
+[[autodoc]] image_processing_utils_fast.BaseImageProcessorFast
7 changes: 6 additions & 1 deletion docs/source/en/model_doc/vit.md
@@ -62,7 +62,7 @@ Following the original Vision Transformer, some follow-up works have been made:
 This model was contributed by [nielsr](https://huggingface.co/nielsr). The original code (written in JAX) can be
 found [here](https://github.com/google-research/vision_transformer).

-Note that we converted the weights from Ross Wightman's [timm library](https://github.com/rwightman/pytorch-image-models),
+Note that we converted the weights from Ross Wightman's [timm library](https://github.com/rwightman/pytorch-image-models),
 who already converted the weights from JAX to PyTorch. Credits go to him!

 ## Usage tips
@@ -158,6 +158,11 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
 [[autodoc]] ViTImageProcessor
     - preprocess

+## ViTImageProcessorFast
+
+[[autodoc]] ViTImageProcessorFast
+    - preprocess
+
 <frameworkcontent>
 <pt>

1 change: 1 addition & 0 deletions examples/pytorch/_tests_requirements.txt
@@ -29,3 +29,4 @@ timm
 albumentations >= 1.4.5
 torchmetrics
 pycocotools
+Pillow>=10.0.1,<=15.0
27 changes: 25 additions & 2 deletions src/transformers/__init__.py
@@ -1104,7 +1104,8 @@
         name for name in dir(dummy_vision_objects) if not name.startswith("_")
     ]
 else:
-    _import_structure["image_processing_utils"] = ["ImageProcessingMixin"]
+    _import_structure["image_processing_base"] = ["ImageProcessingMixin"]
+    _import_structure["image_processing_utils"] = ["BaseImageProcessor"]
     _import_structure["image_utils"] = ["ImageFeatureExtractionMixin"]
     _import_structure["models.beit"].extend(["BeitFeatureExtractor", "BeitImageProcessor"])
     _import_structure["models.bit"].extend(["BitImageProcessor"])
@@ -1167,6 +1168,18 @@
     _import_structure["models.vivit"].append("VivitImageProcessor")
     _import_structure["models.yolos"].extend(["YolosFeatureExtractor", "YolosImageProcessor"])

+try:
+    if not is_torchvision_available():
+        raise OptionalDependencyNotAvailable()
+except OptionalDependencyNotAvailable:
+    from .utils import dummy_torchvision_objects
+
+    _import_structure["utils.dummy_torchvision_objects"] = [
+        name for name in dir(dummy_torchvision_objects) if not name.startswith("_")
+    ]
+else:
+    _import_structure["image_processing_utils_fast"] = ["BaseImageProcessorFast"]
+    _import_structure["models.vit"].append("ViTImageProcessorFast")

 # PyTorch-backed objects
 try:
@@ -5703,7 +5716,8 @@
     except OptionalDependencyNotAvailable:
         from .utils.dummy_vision_objects import *
     else:
-        from .image_processing_utils import ImageProcessingMixin
+        from .image_processing_base import ImageProcessingMixin
+        from .image_processing_utils import BaseImageProcessor
         from .image_utils import ImageFeatureExtractionMixin
         from .models.beit import BeitFeatureExtractor, BeitImageProcessor
         from .models.bit import BitImageProcessor
@@ -5793,6 +5807,15 @@
         from .models.vivit import VivitImageProcessor
         from .models.yolos import YolosFeatureExtractor, YolosImageProcessor

+    try:
+        if not is_torchvision_available():
+            raise OptionalDependencyNotAvailable()
+    except OptionalDependencyNotAvailable:
+        from .utils.dummy_torchvision_objects import *
+    else:
+        from .image_processing_utils_fast import BaseImageProcessorFast
+        from .models.vit import ViTImageProcessorFast
+
     # Modeling
     try:
         if not is_torch_available():
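
The guarded blocks in `src/transformers/__init__.py` above follow an optional-dependency pattern: probe for the package, raise a sentinel exception if it is missing, register placeholder names in the `except` branch, and register the real exports in the `else` branch. Below is a minimal standard-library sketch of that control flow. The local `OptionalDependencyNotAvailable` class and the helpers `is_package_available` and `build_import_structure` are hypothetical stand-ins written for illustration; they are not the library's own implementations.

```python
import importlib.util


class OptionalDependencyNotAvailable(Exception):
    """Local stand-in for the sentinel exception used in the diff."""


def is_package_available(name: str) -> bool:
    # find_spec returns None when a top-level package cannot be located.
    return importlib.util.find_spec(name) is not None


def build_import_structure(package, real_exports, dummy_names):
    """Return the export table for one optional dependency (hypothetical helper)."""
    structure = {}
    try:
        if not is_package_available(package):
            raise OptionalDependencyNotAvailable()
    except OptionalDependencyNotAvailable:
        # Dependency missing: expose placeholder objects only.
        structure["utils.dummy_objects"] = list(dummy_names)
    else:
        # Dependency present: expose the real modules and their names.
        for module, names in real_exports.items():
            structure[module] = list(names)
    return structure


real = {"image_processing_utils_fast": ["BaseImageProcessorFast"]}
dummies = ["BaseImageProcessorFast"]

# "json" ships with Python, so it plays the role of an installed dependency.
present = build_import_structure("json", real, dummies)
# An arbitrary name that should not resolve to any installed package.
missing = build_import_structure("surely_not_an_installed_package_xyz", real, dummies)
```

Routing both outcomes through `try`/`except`/`else` keeps the "available" branch free of error handling, which is why the same shape can be repeated once per optional backend.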