Add chat support to text generation pipeline #28945

Merged
Rocketknight1 merged 14 commits into main from support_chat_in_text_gen_pipeline on Feb 16, 2024

Member

@Rocketknight1 Rocketknight1 commented Feb 9, 2024

This PR modifies the text generation pipeline to support chats. It does this by inspecting the inputs - if they look like strings, it uses the original causal LM pipeline, and if they look like lists of message dicts, it applies a chat template instead before proceeding with generation.

Most changes are in the preprocessing/postprocessing - the actual generation itself is largely unchanged.
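
A minimal sketch of the dispatch described above, in plain Python with no dependency on transformers. The names `to_prompts` and `toy_chat_template` are hypothetical, and the template format is invented for illustration; the real pipeline applies the tokenizer's own chat template rather than this one.

```python
from typing import Dict, List, Sequence, Union

Message = Dict[str, str]
Chat = Sequence[Message]


def toy_chat_template(chat: Chat) -> str:
    """Render a chat as one prompt string (invented format, illustration only)."""
    rendered = "".join(f"<|{m['role']}|>{m['content']}\n" for m in chat)
    # Leave an open assistant turn for the model to complete.
    return rendered + "<|assistant|>"


def to_prompts(inputs: Union[str, Sequence[str], Chat, Sequence[Chat]]) -> List[str]:
    """Turn strings or chats into the prompt strings passed on to generation."""
    if isinstance(inputs, str):
        return [inputs]  # single plain prompt: passed through unchanged
    if not inputs:
        return []
    first = inputs[0]
    if isinstance(first, dict):
        return [toy_chat_template(inputs)]  # a single chat
    if isinstance(first, (list, tuple)):
        return [toy_chat_template(chat) for chat in inputs]  # a batch of chats
    return list(inputs)  # a batch of plain prompts: passed through unchanged


chat = [{"role": "user", "content": "Hi"}]
print(to_prompts("Hello"))
print(to_prompts(chat))
```

Only the preprocessing differs between the two paths; whatever consumes the returned prompt strings does not need to know which kind of input produced them.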

TODO:

  • Expand tests to cover other edge cases
  • Confirm the return format we want for this - just the model response, or the entire chat?
  • Add KV cache support, as this is important for performant multi-turn chat
  • Deprecate ConversationalPipeline and update the chat template docs to refer to this instead?

cc @ArthurZucker @gante @LysandreJik

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@julien-c julien-c left a comment

looks neat from a (very) superficial glance

I think this will be quite useful!

@julien-c
Member

julien-c commented Feb 9, 2024

(and yes we should remove the old ConversationalPipeline sooner rather than later given it already doesn't work anymore due to conversational pipeline-type being removed from the Hub, IIUC)

@Rocketknight1
Member Author

@julien-c Done! This PR now adds a DeprecationWarning to ConversationalPipeline. I also updated the chat template docs for the new pipeline.
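
A rough sketch of how a deprecation warning of this kind is typically emitted, using only the standard library. `LegacyConversationalPipeline` is a hypothetical stand-in, and the message text is an assumption rather than the wording used in this PR.

```python
import warnings


class LegacyConversationalPipeline:
    """Hypothetical stand-in for a pipeline class that is being deprecated."""

    def __init__(self) -> None:
        warnings.warn(
            "LegacyConversationalPipeline is deprecated; pass a list of message "
            "dicts to the text-generation pipeline instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not to __init__
        )


with warnings.catch_warnings(record=True) as caught:
    # DeprecationWarning is often filtered out by default, so enable it here.
    warnings.simplefilter("always")
    LegacyConversationalPipeline()

print(caught[0].category.__name__)
```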

@julien-c
Member

very nice!

Member

@gante gante left a comment

Nice! Thanks for adding this 🐈

src/transformers/pipelines/conversational.py (outdated, resolved)
src/transformers/pipelines/text_generation.py (outdated, resolved)
    if isinstance(text_inputs[0], dict):
        return super().__call__(Chat(text_inputs), **kwargs)
    else:
        chats = [Chat(chat) for chat in text_inputs]  # 🐈 🐈 🐈
Member

best comment 😂

@Rocketknight1
Member Author

One question for people, maybe @gante: Are you okay with the return format I'm using? Right now, if you pass a chat like this:

[ 
    {"role": "system", "content": "This is a system message."},
    {"role": "user", "content": "This is a test"},
]

You get a response that's the same chat, continued:

[
    {"role": "system", "content": "This is a system message."},
    {"role": "user", "content": "This is a test"},
    {"role": "assistant", "content": "This is a reply"},
]

I think this is the right thing to do, because it matches the behaviour of the existing text-generation pipeline (it returns the prompt at the start of the generated string). Let me know if you have a different opinion, though!
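
The two return conventions compared above can be sketched in plain Python. `postprocess` is a hypothetical helper written for illustration, not the pipeline's own method, and the reply string is supplied by hand instead of being generated by a model.

```python
from typing import Dict, List, Union

Message = Dict[str, str]


def postprocess(prompt: Union[str, List[Message]], reply: str) -> Union[str, List[Message]]:
    """Attach the generated reply to whatever the caller passed in."""
    if isinstance(prompt, str):
        # String mode: the prompt is returned at the start of the output.
        return prompt + reply
    # Chat mode: return a new list ending with the reply as an assistant message.
    return list(prompt) + [{"role": "assistant", "content": reply}]


chat = [
    {"role": "system", "content": "This is a system message."},
    {"role": "user", "content": "This is a test"},
]
continued = postprocess(chat, "This is a reply")
print(continued[-1])
```

In both modes the caller gets back the input followed by the model's contribution, so a returned chat can be extended with the next user message and passed in again.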

@gante
Member

gante commented Feb 15, 2024

IMO it looks good to me

@Rocketknight1
Member Author

Cool!

@Rocketknight1
Member Author

In that case, I think we're ready for final review (cc @amyeroberts) - I'm leaving the KV cache to another PR.

@Rocketknight1
Member Author

cc @LysandreJik @julien-c as well if there's anything else you want me to add before we merge this!

Collaborator

@amyeroberts amyeroberts left a comment

Beautiful - thanks for adding this support!

src/transformers/tokenization_utils_base.py (outdated, resolved)
src/transformers/pipelines/text_generation.py (outdated, resolved)
@@ -216,7 +230,15 @@ def __call__(self, text_inputs, **kwargs):
            - **generated_token_ids** (`torch.Tensor` or `tf.Tensor`, present when `return_tensors=True`) -- The token
              ids of the generated text.
        """
        return super().__call__(text_inputs, **kwargs)
        if isinstance(text_inputs, (list, tuple)) and isinstance(text_inputs[0], (list, tuple, dict)):
Collaborator

Just to make sure - is it not possible for someone to pass this to the pipeline:

# Pass a list-of-list-of-strings
generator([["this is a dog"], ["this is a code example"], ["banana for scale"]])

Member Author

I tried that on main - it just results in a TypeError: can only concatenate str (not "list") to str. The existing pipeline will only accept either a single string or a non-nested list/tuple of strings, so I don't think this check makes a mistake for any valid inputs!
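
To make the routing concrete, here is a small sketch that wraps the `isinstance` check from the diff in a hypothetical helper, `is_chat_mode`, and applies it to each input shape discussed in this thread. It runs without transformers and says nothing about what happens after routing.

```python
from typing import Any


def is_chat_mode(text_inputs: Any) -> bool:
    """The isinstance check from the diff, wrapped in a hypothetical helper."""
    # Note: an empty list or tuple would raise IndexError at text_inputs[0].
    return isinstance(text_inputs, (list, tuple)) and isinstance(
        text_inputs[0], (list, tuple, dict)
    )


single_chat = [{"role": "user", "content": "This is a test"}]
examples = {
    "single prompt": "this is a dog",
    "batch of prompts": ["this is a dog", "banana for scale"],
    "single chat": single_chat,
    "batch of chats": [single_chat, single_chat],
    "list of lists of strings": [["this is a dog"], ["banana for scale"]],
}

for name, value in examples.items():
    print(f"{name}: chat mode = {is_chat_mode(value)}")
```

Both input shapes the existing pipeline accepts (a single string and a flat list of strings) stay on the string path. The nested list of strings is routed to chat mode, but as noted above it was not a valid input before this change either.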

@Rocketknight1 Rocketknight1 merged commit 2f1003b into main Feb 16, 2024
22 checks passed
@Rocketknight1 Rocketknight1 deleted the support_chat_in_text_gen_pipeline branch February 16, 2024 16:41
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Feb 19, 2024
* Add chat support to text generation pipeline

* Better handling of single elements

* Deprecate ConversationalPipeline

* stash commit

* Add missing add_special_tokens kwarg

* Update chat templating docs to refer to TextGenerationPipeline instead of ConversationalPipeline

* Add ✨TF✨ tests

* @require_tf

* Add type hint

* Add specific deprecation version

* Remove unnecessary do_sample

* Remove todo - the discrepancy has been resolved

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/pipelines/text_generation.py

Co-authored-by: amyeroberts <[email protected]>

---------

Co-authored-by: amyeroberts <[email protected]>
itazap pushed a commit that referenced this pull request May 14, 2024