Image Feature Extraction pipeline (#28216)
* Draft pipeline
* Fixup
* Fix docstrings
* Update doctest
* Update pipeline_model_mapping
* Update docstring
* Update tests
* Update src/transformers/pipelines/image_feature_extraction.py

  Co-authored-by: Omar Sanseviero <[email protected]>
* Fix docstrings - review comments
* Remove pipeline mapping for composite vision models
* Add to pipeline tests
* Remove for flava (multimodal)
* safe pil import
* Add requirements for pipeline run
* Account for super slow efficientnet
* Review comments
* Fix tests
* Swap order of kwargs
* Use build_pipeline_init_args
* Add back FE pipeline for Vilt
* Include image_processor_kwargs in docstring
* Mark test as flaky
* Update TODO
* Update tests/pipelines/test_pipelines_image_feature_extraction.py

  Co-authored-by: Arthur <[email protected]>
* Add license header

---------

Co-authored-by: Omar Sanseviero <[email protected]>
Co-authored-by: Arthur <[email protected]>
1 parent 7addc93, commit ba3264b
Showing 60 changed files with 387 additions and 53 deletions.
@@ -0,0 +1,92 @@
from typing import Dict

from ..utils import add_end_docstrings, is_vision_available
from .base import GenericTensor, Pipeline, build_pipeline_init_args


if is_vision_available():
    from ..image_utils import load_image


@add_end_docstrings(
    build_pipeline_init_args(has_image_processor=True),
    """
        image_processor_kwargs (`dict`, *optional*):
                Additional dictionary of keyword arguments passed along to the image processor e.g.
                {"size": {"height": 100, "width": 100}}
    """,
)
class ImageFeatureExtractionPipeline(Pipeline):
    """
    Image feature extraction pipeline uses no model head. This pipeline extracts the hidden states from the base
    transformer, which can be used as features in downstream tasks.

    Example:

    ```python
    >>> from transformers import pipeline

    >>> extractor = pipeline(model="google/vit-base-patch16-224", task="image-feature-extraction")
    >>> result = extractor("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png", return_tensors=True)
    >>> result.shape  # This is a tensor of shape [1, sequence_length, hidden_dimension] representing the input image.
    torch.Size([1, 197, 768])
    ```

    Learn more about the basics of using a pipeline in the [pipeline tutorial](../pipeline_tutorial)

    This image feature extraction pipeline can currently be loaded from [`pipeline`] using the task identifier:
    `"image-feature-extraction"`.

    All vision models may be used for this pipeline. See a list of all models, including community-contributed models on
    [huggingface.co/models](https://huggingface.co/models).
    """

    def _sanitize_parameters(self, image_processor_kwargs=None, return_tensors=None, **kwargs):
        preprocess_params = {} if image_processor_kwargs is None else image_processor_kwargs
        postprocess_params = {"return_tensors": return_tensors} if return_tensors is not None else {}

        if "timeout" in kwargs:
            preprocess_params["timeout"] = kwargs["timeout"]

        return preprocess_params, {}, postprocess_params

    def preprocess(self, image, timeout=None, **image_processor_kwargs) -> Dict[str, GenericTensor]:
        image = load_image(image, timeout=timeout)
        model_inputs = self.image_processor(image, return_tensors=self.framework, **image_processor_kwargs)
        return model_inputs

    def _forward(self, model_inputs):
        model_outputs = self.model(**model_inputs)
        return model_outputs

    def postprocess(self, model_outputs, return_tensors=False):
        # [0] is the first available tensor, logits or last_hidden_state.
        if return_tensors:
            return model_outputs[0]
        if self.framework == "pt":
            return model_outputs[0].tolist()
        elif self.framework == "tf":
            return model_outputs[0].numpy().tolist()

    def __call__(self, *args, **kwargs):
        """
        Extract the features of the input(s).

        Args:
            images (`str`, `List[str]`, `PIL.Image` or `List[PIL.Image]`):
                The pipeline handles three types of images:

                - A string containing a http link pointing to an image
                - A string containing a local path to an image
                - An image loaded in PIL directly

                The pipeline accepts either a single image or a batch of images, which must then be passed as a list.
                Images in a batch must all be in the same format: all as http links, all as local paths, or all as PIL
                images.
            timeout (`float`, *optional*, defaults to None):
                The maximum time in seconds to wait for fetching images from the web. If None, no timeout is used and
                the call may block forever.
        Return:
            A nested list of `float`: The features computed by the model.
        """
        return super().__call__(*args, **kwargs)
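
The three-way split returned by `_sanitize_parameters` and the `return_tensors` switch in `postprocess` can be tried out without loading a model. The following is a minimal sketch of that routing, not the library's implementation: `FakeTensor`, `sanitize_parameters`, `postprocess` and `run` are hypothetical stand-ins written for illustration, and the real `preprocess` and `_forward` steps are replaced by placeholders. One deliberate difference is labeled in the code: the sketch copies `image_processor_kwargs`, whereas the listing above reuses the caller's dict when it adds `timeout`.

```python
from typing import Any, Dict, Optional, Tuple


class FakeTensor:
    """Hypothetical stand-in for a framework tensor; only tolist() is modelled."""

    def __init__(self, data):
        self.data = data

    def tolist(self):
        return self.data


def sanitize_parameters(
    image_processor_kwargs: Optional[Dict[str, Any]] = None,
    return_tensors: Optional[bool] = None,
    **kwargs: Any,
) -> Tuple[Dict[str, Any], Dict[str, Any], Dict[str, Any]]:
    # Assumption of this sketch: copy the dict so that adding "timeout"
    # below never mutates the object the caller passed in.
    preprocess_params = dict(image_processor_kwargs or {})
    postprocess_params = {} if return_tensors is None else {"return_tensors": return_tensors}
    if "timeout" in kwargs:
        preprocess_params["timeout"] = kwargs["timeout"]
    # The forward step takes no extra parameters, hence the empty dict.
    return preprocess_params, {}, postprocess_params


def postprocess(model_outputs, return_tensors: bool = False):
    first = model_outputs[0]  # first available tensor in the output tuple
    return first if return_tensors else first.tolist()


def run(image, **call_kwargs):
    pre, _forward_params, post = sanitize_parameters(**call_kwargs)
    model_inputs = {"image": image, **pre}  # placeholder for preprocess()
    model_outputs = (FakeTensor([[0.1, 0.2]]), model_inputs)  # placeholder for _forward()
    return postprocess(model_outputs, **post)
```

Calling `run("img")` yields a nested list, while `run("img", return_tensors=True)` hands back the tensor-like object untouched, which mirrors the two branches of `postprocess` in the diff.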
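
The `__call__` docstring asks that all images in a batch share one format (all links, all local paths, or all PIL images). A caller could check this before invoking the pipeline. The sketch below is one possible pre-check under stated assumptions: `source_kind` and `batch_kind` are hypothetical helpers, any non-string input is treated as an already loaded image object, and nothing here claims the pipeline itself enforces the rule.

```python
from urllib.parse import urlparse


def source_kind(image) -> str:
    """Classify one input as 'url', 'path', or 'object' (anything that is not a string)."""
    if isinstance(image, str):
        scheme = urlparse(image).scheme.lower()
        return "url" if scheme in ("http", "https") else "path"
    return "object"


def batch_kind(images) -> str:
    """Return the single kind shared by every input, or raise if the batch is empty or mixed."""
    if not images:
        raise ValueError("empty batch")
    kinds = {source_kind(image) for image in images}
    if len(kinds) != 1:
        raise ValueError(f"mixed input kinds in batch: {sorted(kinds)}")
    return kinds.pop()
```

For example, `batch_kind(["a.png", "b.png"])` returns `"path"`, and mixing a local path with an `https://` link raises `ValueError`.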
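
The shape `torch.Size([1, 197, 768])` in the class docstring follows from the checkpoint name `google/vit-base-patch16-224`: a 224x224 input cut into 16x16 patches gives 14 x 14 = 196 patch tokens, plus one class token. The arithmetic can be sketched as below, assuming a standard ViT with square, non-overlapping patches and a single prefix token; `vit_sequence_length` is a hypothetical helper, not a library function.

```python
def vit_sequence_length(image_size: int, patch_size: int, num_prefix_tokens: int = 1) -> int:
    """Number of tokens a standard ViT produces for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + num_prefix_tokens


seq_len = vit_sequence_length(224, 16)
# Batch of one image, seq_len tokens, hidden size 768 as reported in the docstring.
feature_shape = (1, seq_len, 768)
print(feature_shape)  # (1, 197, 768)
```

Other vision architectures may pool or downsample differently, so this calculation should not be expected to hold beyond plain ViT-style models.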