
Fix continue_final_message for image-text-to-text chat templates #34236

Merged 2 commits into huggingface:main on Oct 22, 2024

Conversation

yonigozlan (Member)

What does this PR do?

The `content` field for an image-text-to-text model is a list of dicts, which `tokenization_utils_base` does not currently take into account when `continue_final_message` is set to `True`. Split from the image-text-to-text PR.

To reproduce the error:

from transformers import LlavaProcessor, LlavaForConditionalGeneration
import torch
from PIL import Image
import requests

processor = LlavaProcessor.from_pretrained("llava-hf/llava-interleave-qwen-0.5b-hf")

model = LlavaForConditionalGeneration.from_pretrained(
    "llava-hf/llava-interleave-qwen-0.5b-hf", torch_dtype=torch.float16, low_cpu_mem_usage=True
)
model.to("cuda:0")

# Define a chat history and use `apply_chat_template` to get a correctly formatted prompt.
# Each value in "content" has to be a list of dicts with types ("text", "image").
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
            },
            {"type": "text", "text": "Describe this image."},
        ],
    },
    {
        "role": "assistant",
        "content": [
            {"type": "text", "text": "There is a dog and"},
        ],
    },
]
prompt = processor.apply_chat_template(messages, continue_final_message=True)

# Fetch the image referenced in the chat and pass it alongside the prompt.
image = Image.open(requests.get(messages[0]["content"][0]["image"], stream=True).raw)
inputs = processor(text=prompt, images=image, return_tensors="pt").to("cuda:0").to(torch.float16)

# Autoregressively complete the prompt.
output = model.generate(**inputs, max_new_tokens=100)

print(processor.decode(output[0], skip_special_tokens=True))
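
For context, a minimal sketch of the kind of handling the fix needs, written as a simplified standalone helper (the name trim_to_final_message is illustrative; the real logic lives inline in tokenization_utils_base's apply_chat_template):

def trim_to_final_message(rendered_chat: str, final_message_content) -> str:
    """Trim the rendered prompt so generation continues the final message.

    `final_message_content` is either a plain string (text-only chats) or,
    for image-text-to-text models, a list of dicts such as
    [{"type": "text", "text": "There is a dog and"}].
    """
    if isinstance(final_message_content, (list, tuple)):
        # Take the text of the last text chunk in the content list.
        final_text = final_message_content[-1]["text"]
    else:
        final_text = final_message_content
    final_text = final_text.strip()
    # Keep everything up to (and including) the last occurrence of the
    # final text, dropping any end-of-turn tokens the template appended.
    return rendered_chat[: rendered_chat.rindex(final_text) + len(final_text)]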

cc @zucchini-nlp @ArthurZucker

@zucchini-nlp (Member) left a comment:

LGTM, thanks! Maybe also cc @Rocketknight1

@Rocketknight1 (Member) left a comment:

Yes, LGTM too!

@ArthurZucker (Collaborator) left a comment:

Cool! Can we have a small test please? 🤗

@yonigozlan (Member, Author):

Added one test for the Llava processor :). I could add one for every VLM processor that uses a chat template, but since they all use the same underlying apply_chat_template, I thought it wasn't worth the diffs. Wdyt?
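
For reference, a hypothetical shape for such a test (illustrative only; the actual test added by this PR lives in the Llava processor test suite, and the function name here is an assumption):

from transformers import LlavaProcessor

def test_chat_template_continue_final_message():
    processor = LlavaProcessor.from_pretrained("llava-hf/llava-interleave-qwen-0.5b-hf")
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text", "text": "Describe this image."},
            ],
        },
        {
            "role": "assistant",
            "content": [{"type": "text", "text": "There is a dog and"}],
        },
    ]
    prompt = processor.apply_chat_template(messages, continue_final_message=True)
    # With continue_final_message=True, the rendered prompt must end with
    # the partial assistant text, with no end-of-turn token after it.
    assert prompt.endswith("There is a dog and")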

@yonigozlan force-pushed the fix-tokenization-base branch from 64523c2 to afc298f on October 21, 2024 at 21:18
@ArthurZucker (Collaborator) left a comment:

Thanks 😉

@yonigozlan merged commit e7c3fa7 into huggingface:main on Oct 22, 2024. 23 of 25 checks passed.
BernardZach pushed a commit to BernardZach/transformers that referenced this pull request on Dec 5, 2024:
Fix continue_final_message for image-text-to-text chat templates (huggingface#34236)
* fix continue_final_message for vlms
* Add one test for vlms continue_final_message chat template
BernardZach pushed a commit to innovationcore/transformers that referenced this pull request on Dec 6, 2024:
Fix continue_final_message for image-text-to-text chat templates (huggingface#34236)
* fix continue_final_message for vlms
* Add one test for vlms continue_final_message chat template