Commit 82b76af: fix conflict

zhenglongjiepheonix committed May 20, 2024
2 parents ee8f9fb + 92d1d97

Showing 91 changed files with 2,120 additions and 241 deletions.
2 changes: 1 addition & 1 deletion docker/transformers-quantization-latest-gpu/Dockerfile
@@ -1,4 +1,4 @@
-FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu20.04
+FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04
LABEL maintainer="Hugging Face"

ARG DEBIAN_FRONTEND=noninteractive
5 changes: 5 additions & 0 deletions docs/source/en/model_doc/gemma.md
@@ -60,6 +60,11 @@ This model was contributed by [Arthur Zucker](https://huggingface.co/ArthurZ), [
[[autodoc]] GemmaForSequenceClassification
- forward

## GemmaForTokenClassification

[[autodoc]] GemmaForTokenClassification
- forward
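
The new head follows the same pattern as the other token-classification heads in the library. A minimal usage sketch (the checkpoint name below is a placeholder base model, and the classification head stays randomly initialized until it is fine-tuned):

```python
import torch
from transformers import AutoTokenizer, GemmaForTokenClassification

# Placeholder checkpoint: swap in a fine-tuned model for meaningful labels
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = GemmaForTokenClassification.from_pretrained("google/gemma-2b", num_labels=5)

inputs = tokenizer("Hugging Face is based in New York City", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # (batch_size, sequence_length, num_labels)
predicted_label_ids = logits.argmax(dim=-1)  # one label id per input token
```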

## FlaxGemmaModel

[[autodoc]] FlaxGemmaModel
5 changes: 5 additions & 0 deletions docs/source/en/model_doc/llama.md
@@ -121,6 +121,11 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
[[autodoc]] LlamaForQuestionAnswering
- forward

## LlamaForTokenClassification

[[autodoc]] LlamaForTokenClassification
- forward

## FlaxLlamaModel

[[autodoc]] FlaxLlamaModel
41 changes: 41 additions & 0 deletions docs/source/en/model_doc/llava_next.md
@@ -68,6 +68,8 @@ The original code can be found [here](https://github.com/haotian-liu/LLaVA/tree/

## Usage example

### Single image inference

Here's how to load the model and perform inference in half-precision (`torch.float16`):

```python
@@ -94,6 +96,45 @@
output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```

### Multi image inference

LLaVa-Next can perform inference with multiple images as input, where the images belong either to the same prompt or to different prompts (in batched inference). Here is how you can do it:

```python
import requests
from PIL import Image
import torch
from transformers import AutoProcessor, LlavaNextForConditionalGeneration

# Load the model in half-precision
model = LlavaNextForConditionalGeneration.from_pretrained("llava-hf/llava-v1.6-mistral-7b-hf", torch_dtype=torch.float16, device_map="auto")
processor = AutoProcessor.from_pretrained("llava-hf/llava-v1.6-mistral-7b-hf")

# Get three different images
url = "https://www.ilankelman.org/stopsigns/australia.jpg"
image_stop = Image.open(requests.get(url, stream=True).raw)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image_cats = Image.open(requests.get(url, stream=True).raw)

url = "https://huggingface.co/microsoft/kosmos-2-patch14-224/resolve/main/snowman.jpg"
image_snowman = Image.open(requests.get(url, stream=True).raw)

# Prepare a batch of two prompts, where the first is a multi-turn conversation and the second is not
prompt = [
"[INST] <image>\nWhat is shown in this image? [/INST] There is a red stop sign in the image. [INST] <image>\nWhat about this image? How many cats do you see [/INST]",
"[INST] <image>\nWhat is shown in this image? [/INST]"
]

# Feed the images in the order in which they appear in the text prompt
# Each "<image>" token consumes one image, leaving the rest for the subsequent "<image>" tokens
inputs = processor(text=prompt, images=[image_stop, image_cats, image_snowman], padding=True, return_tensors="pt").to(model.device)

# Generate
generate_ids = model.generate(**inputs, max_new_tokens=30)
processor.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)
```
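
Since `generate` returns the prompt tokens followed by the newly generated tokens for each sequence, you can optionally trim the prompts off before decoding. A small convenience sketch using the variables from the example above:

```python
# Keep only the newly generated tokens for every sequence in the batch;
# generated ids start right after the (padded) prompt length
trimmed_ids = generate_ids[:, inputs["input_ids"].shape[1] :]
print(processor.batch_decode(trimmed_ids, skip_special_tokens=True))
```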

## Model optimization

### Quantization using Bitsandbytes
5 changes: 5 additions & 0 deletions docs/source/en/model_doc/mistral.md
@@ -203,6 +203,11 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h
[[autodoc]] MistralForSequenceClassification
- forward

## MistralForTokenClassification

[[autodoc]] MistralForTokenClassification
- forward

## FlaxMistralModel

[[autodoc]] FlaxMistralModel
5 changes: 5 additions & 0 deletions docs/source/en/model_doc/mixtral.md
@@ -204,3 +204,8 @@ A list of official Hugging Face and community (indicated by 🌎) resources to h

[[autodoc]] MixtralForSequenceClassification
- forward

## MixtralForTokenClassification

[[autodoc]] MixtralForTokenClassification
- forward
5 changes: 5 additions & 0 deletions docs/source/en/model_doc/persimmon.md
@@ -96,3 +96,8 @@ The `LlamaTokenizer` is used as it is a standard wrapper around sentencepiece. T

[[autodoc]] PersimmonForSequenceClassification
- forward

## PersimmonForTokenClassification

[[autodoc]] PersimmonForTokenClassification
- forward
5 changes: 5 additions & 0 deletions docs/source/en/model_doc/qwen2.md
@@ -80,3 +80,8 @@ In the following, we demonstrate how to use `Qwen2-7B-Chat-beta` for the inferen

[[autodoc]] Qwen2ForSequenceClassification
- forward

## Qwen2ForTokenClassification

[[autodoc]] Qwen2ForTokenClassification
- forward
5 changes: 5 additions & 0 deletions docs/source/en/model_doc/qwen2_moe.md
@@ -75,3 +75,8 @@ In the following, we demonstrate how to use `Qwen1.5-MoE-A2.7B-Chat` for the inf

[[autodoc]] Qwen2MoeForSequenceClassification
- forward

## Qwen2MoeForTokenClassification

[[autodoc]] Qwen2MoeForTokenClassification
- forward
5 changes: 5 additions & 0 deletions docs/source/en/model_doc/stablelm.md
@@ -104,3 +104,8 @@ Now, to run the model with Flash Attention 2, refer to the snippet below:

[[autodoc]] StableLmForSequenceClassification
- forward

## StableLmForTokenClassification

[[autodoc]] StableLmForTokenClassification
- forward
5 changes: 5 additions & 0 deletions docs/source/en/model_doc/starcoder2.md
@@ -66,3 +66,8 @@ These ready-to-use checkpoints can be downloaded and used via the HuggingFace Hu

[[autodoc]] Starcoder2ForSequenceClassification
- forward

## Starcoder2ForTokenClassification

[[autodoc]] Starcoder2ForTokenClassification
- forward
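
Once fine-tuned weights exist, each of the `*ForTokenClassification` heads added in this commit (Gemma, Llama, Mistral, Mixtral, Persimmon, Qwen2, Qwen2MoE, StableLM, Starcoder2) should also be reachable through the high-level pipeline API. A hedged sketch; the model id below is a hypothetical placeholder:

```python
from transformers import pipeline

# Hypothetical fine-tuned checkpoint of one of the newly supported architectures
token_classifier = pipeline("token-classification", model="your-org/gemma-2b-ner")
print(token_classifier("Hugging Face is based in New York City"))
```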
25 changes: 24 additions & 1 deletion src/transformers/__init__.py
@@ -2031,6 +2031,7 @@
[
"GemmaForCausalLM",
"GemmaForSequenceClassification",
"GemmaForTokenClassification",
"GemmaModel",
"GemmaPreTrainedModel",
]
@@ -2288,6 +2289,7 @@
"LlamaForCausalLM",
"LlamaForQuestionAnswering",
"LlamaForSequenceClassification",
"LlamaForTokenClassification",
"LlamaModel",
"LlamaPreTrainedModel",
]
@@ -2435,12 +2437,19 @@
[
"MistralForCausalLM",
"MistralForSequenceClassification",
"MistralForTokenClassification",
"MistralModel",
"MistralPreTrainedModel",
]
)
_import_structure["models.mixtral"].extend(
["MixtralForCausalLM", "MixtralForSequenceClassification", "MixtralModel", "MixtralPreTrainedModel"]
[
"MixtralForCausalLM",
"MixtralForSequenceClassification",
"MixtralForTokenClassification",
"MixtralModel",
"MixtralPreTrainedModel",
]
)
_import_structure["models.mobilebert"].extend(
[
@@ -2714,6 +2723,7 @@
[
"PersimmonForCausalLM",
"PersimmonForSequenceClassification",
"PersimmonForTokenClassification",
"PersimmonModel",
"PersimmonPreTrainedModel",
]
@@ -2810,6 +2820,7 @@
[
"Qwen2ForCausalLM",
"Qwen2ForSequenceClassification",
"Qwen2ForTokenClassification",
"Qwen2Model",
"Qwen2PreTrainedModel",
]
@@ -2818,6 +2829,7 @@
[
"Qwen2MoeForCausalLM",
"Qwen2MoeForSequenceClassification",
"Qwen2MoeForTokenClassification",
"Qwen2MoeModel",
"Qwen2MoePreTrainedModel",
]
@@ -3066,6 +3078,7 @@
[
"StableLmForCausalLM",
"StableLmForSequenceClassification",
"StableLmForTokenClassification",
"StableLmModel",
"StableLmPreTrainedModel",
]
@@ -3074,6 +3087,7 @@
[
"Starcoder2ForCausalLM",
"Starcoder2ForSequenceClassification",
"Starcoder2ForTokenClassification",
"Starcoder2Model",
"Starcoder2PreTrainedModel",
]
@@ -6489,6 +6503,7 @@
from .models.gemma import (
GemmaForCausalLM,
GemmaForSequenceClassification,
GemmaForTokenClassification,
GemmaModel,
GemmaPreTrainedModel,
)
@@ -6686,6 +6701,7 @@
LlamaForCausalLM,
LlamaForQuestionAnswering,
LlamaForSequenceClassification,
LlamaForTokenClassification,
LlamaModel,
LlamaPreTrainedModel,
)
@@ -6801,12 +6817,14 @@
from .models.mistral import (
MistralForCausalLM,
MistralForSequenceClassification,
MistralForTokenClassification,
MistralModel,
MistralPreTrainedModel,
)
from .models.mixtral import (
MixtralForCausalLM,
MixtralForSequenceClassification,
MixtralForTokenClassification,
MixtralModel,
MixtralPreTrainedModel,
)
@@ -7025,6 +7043,7 @@
from .models.persimmon import (
PersimmonForCausalLM,
PersimmonForSequenceClassification,
PersimmonForTokenClassification,
PersimmonModel,
PersimmonPreTrainedModel,
)
@@ -7099,12 +7118,14 @@
from .models.qwen2 import (
Qwen2ForCausalLM,
Qwen2ForSequenceClassification,
Qwen2ForTokenClassification,
Qwen2Model,
Qwen2PreTrainedModel,
)
from .models.qwen2_moe import (
Qwen2MoeForCausalLM,
Qwen2MoeForSequenceClassification,
Qwen2MoeForTokenClassification,
Qwen2MoeModel,
Qwen2MoePreTrainedModel,
)
@@ -7306,12 +7327,14 @@
from .models.stablelm import (
StableLmForCausalLM,
StableLmForSequenceClassification,
StableLmForTokenClassification,
StableLmModel,
StableLmPreTrainedModel,
)
from .models.starcoder2 import (
Starcoder2ForCausalLM,
Starcoder2ForSequenceClassification,
Starcoder2ForTokenClassification,
Starcoder2Model,
Starcoder2PreTrainedModel,
)
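
For context on why every new class appears twice in `src/transformers/__init__.py` (once in `_import_structure` and once under the type-checking imports): the top-level namespace is built lazily. A rough sketch of the mechanism, not the verbatim transformers code:

```python
import sys

from transformers.utils import _LazyModule  # helper transformers uses for lazy imports

# Names are registered in a dict up front; a submodule is only imported the
# first time one of its attributes is accessed, keeping `import transformers` fast.
_import_structure = {"models.gemma": ["GemmaForTokenClassification"]}

sys.modules[__name__] = _LazyModule(__name__, globals()["__file__"], _import_structure)
```
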
5 changes: 5 additions & 0 deletions src/transformers/configuration_utils.py
@@ -32,6 +32,7 @@
CONFIG_NAME,
PushToHubMixin,
add_model_info_to_auto_map,
add_model_info_to_custom_pipelines,
cached_file,
copy_func,
download_url,
@@ -736,6 +737,10 @@ def _get_config_dict(
config_dict["auto_map"] = add_model_info_to_auto_map(
config_dict["auto_map"], pretrained_model_name_or_path
)
if "custom_pipelines" in config_dict and not is_local:
config_dict["custom_pipelines"] = add_model_info_to_custom_pipelines(
config_dict["custom_pipelines"], pretrained_model_name_or_path
)
return config_dict, kwargs

@classmethod
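
`add_model_info_to_custom_pipelines` mirrors what `add_model_info_to_auto_map` already does for `auto_map`: when a config is loaded from the Hub, it records which repo the referenced pipeline code lives in. A hypothetical illustration of the idea, not the verbatim implementation:

```python
def add_model_info_to_custom_pipelines(custom_pipelines: dict, repo_id: str) -> dict:
    """Prefix each custom pipeline implementation reference with its source repo."""
    for task in custom_pipelines:
        impl = custom_pipelines[task].get("impl")
        if impl is not None and "--" not in impl:
            # "repo--module" marks where the remote code should be fetched from
            custom_pipelines[task]["impl"] = f"{repo_id}--{impl}"
    return custom_pipelines
```
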
14 changes: 10 additions & 4 deletions src/transformers/feature_extraction_utils.py
@@ -31,6 +31,7 @@
PushToHubMixin,
TensorType,
add_model_info_to_auto_map,
add_model_info_to_custom_pipelines,
cached_file,
copy_func,
download_url,
@@ -539,10 +540,15 @@ def get_feature_extractor_dict(
f"loading configuration file {feature_extractor_file} from cache at {resolved_feature_extractor_file}"
)

if "auto_map" in feature_extractor_dict and not is_local:
feature_extractor_dict["auto_map"] = add_model_info_to_auto_map(
feature_extractor_dict["auto_map"], pretrained_model_name_or_path
)
if not is_local:
if "auto_map" in feature_extractor_dict:
feature_extractor_dict["auto_map"] = add_model_info_to_auto_map(
feature_extractor_dict["auto_map"], pretrained_model_name_or_path
)
if "custom_pipelines" in feature_extractor_dict:
feature_extractor_dict["custom_pipelines"] = add_model_info_to_custom_pipelines(
feature_extractor_dict["custom_pipelines"], pretrained_model_name_or_path
)

return feature_extractor_dict, kwargs

15 changes: 10 additions & 5 deletions src/transformers/image_processing_utils.py
@@ -31,6 +31,7 @@
IMAGE_PROCESSOR_NAME,
PushToHubMixin,
add_model_info_to_auto_map,
add_model_info_to_custom_pipelines,
cached_file,
copy_func,
download_url,
@@ -375,11 +376,15 @@ def get_image_processor_dict(
f"loading configuration file {image_processor_file} from cache at {resolved_image_processor_file}"
)

if "auto_map" in image_processor_dict and not is_local:
image_processor_dict["auto_map"] = add_model_info_to_auto_map(
image_processor_dict["auto_map"], pretrained_model_name_or_path
)

if not is_local:
if "auto_map" in image_processor_dict:
image_processor_dict["auto_map"] = add_model_info_to_auto_map(
image_processor_dict["auto_map"], pretrained_model_name_or_path
)
if "custom_pipelines" in image_processor_dict:
image_processor_dict["custom_pipelines"] = add_model_info_to_custom_pipelines(
image_processor_dict["custom_pipelines"], pretrained_model_name_or_path
)
return image_processor_dict, kwargs

@classmethod