Add chat support to text generation pipeline #28945
looks neat from a (very) superficial glance
I think this will be quite useful!
(and yes we should remove the old
@julien-c Done! This PR now adds a
very nice!
Nice! Thanks for adding this 🐈
if isinstance(text_inputs[0], dict):
    return super().__call__(Chat(text_inputs), **kwargs)
else:
    chats = [Chat(chat) for chat in text_inputs]  # 🐈 🐈 🐈
best comment 😂
One question for people, maybe @gante: Are you okay with the return format I'm using? Right now, if you pass a chat like this:
[
    {"role": "system", "content": "This is a system message."},
    {"role": "user", "content": "This is a test"},
]
You get a response that's the same chat, continued:
[
    {"role": "system", "content": "This is a system message."},
    {"role": "user", "content": "This is a test"},
    {"role": "assistant", "content": "This is a reply"},
]
I think this is the right thing to do, because it matches the behaviour of the existing
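For illustration, the return format described above can be sketched in plain Python. The helper `continue_chat` is hypothetical and not part of the pipeline; it only shows the shape of the output, where the input messages come back unchanged with one assistant message appended.

```python
from typing import Dict, List

Message = Dict[str, str]


def continue_chat(chat: List[Message], reply_text: str) -> List[Message]:
    """Return a new list: the original messages plus one assistant turn.

    Hypothetical helper for illustration only; the input list is not mutated.
    """
    return list(chat) + [{"role": "assistant", "content": reply_text}]


chat = [
    {"role": "system", "content": "This is a system message."},
    {"role": "user", "content": "This is a test"},
]

# "This is a reply" stands in for whatever text the model would generate.
continued = continue_chat(chat, "This is a reply")
for message in continued:
    print(message["role"], "->", message["content"])
```

Because the result is itself a valid chat, a caller could append the next user message to it and pass it straight back in for another turn.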
IMO it looks good to me
Cool!
In that case, I think we're ready for final review (cc @amyeroberts) - I'm leaving the KV cache to another PR.
cc @LysandreJik @julien-c as well if there's anything else you want me to add before we merge this!
Beautiful - thanks for adding this support!
@@ -216,7 +230,15 @@ def __call__(self, text_inputs, **kwargs):
- **generated_token_ids** (`torch.Tensor` or `tf.Tensor`, present when `return_tensors=True`) -- The token
  ids of the generated text.
"""
return super().__call__(text_inputs, **kwargs)
if isinstance(text_inputs, (list, tuple)) and isinstance(text_inputs[0], (list, tuple, dict)):
Just to make sure - is it not possible for someone to pass this to the pipeline:
# Pass a list-of-list-of-strings
generator([["this is a dog"], ["this is a code example"], ["banana for scale"]])
I tried that on `main` - it just results in a `TypeError: can only concatenate str (not "list") to str`. The existing pipeline will only accept either a single string or a non-nested list/tuple of strings, so I don't think this check makes a mistake for any valid inputs!
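A minimal sketch of how the check in the diff classifies different inputs, assuming the same `isinstance` test. `looks_like_chat` is a hypothetical name, and the emptiness guard is an addition that is not in the diff.

```python
def looks_like_chat(text_inputs) -> bool:
    """Mirror the isinstance check from the diff (hypothetical helper).

    The len() guard is an addition for this sketch so an empty list
    does not raise IndexError; the diff itself has no such guard.
    """
    return (
        isinstance(text_inputs, (list, tuple))
        and len(text_inputs) > 0
        and isinstance(text_inputs[0], (list, tuple, dict))
    )


single_chat = [{"role": "user", "content": "This is a test"}]

examples = {
    "single string": "this is a dog",
    "list of strings": ["this is a dog", "banana for scale"],
    "single chat": single_chat,
    "batch of chats": [single_chat, single_chat],
    # Classified as chat input by this check, but per the comment above
    # this was never a valid input to the existing pipeline.
    "list of lists of strings": [["this is a dog"], ["banana for scale"]],
}

results = {name: looks_like_chat(value) for name, value in examples.items()}
for name, is_chat in results.items():
    print(f"{name}: {'chat' if is_chat else 'plain text'}")
```

The only input the check routes differently from before is the nested list of strings, which already failed on `main`, so no previously valid call changes behaviour.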
Co-authored-by: amyeroberts <[email protected]>
* Add chat support to text generation pipeline
* Better handling of single elements
* Deprecate ConversationalPipeline
* stash commit
* Add missing add_special_tokens kwarg
* Update chat templating docs to refer to TextGenerationPipeline instead of ConversationalPipeline
* Add ✨TF✨ tests
* @require_tf
* Add type hint
* Add specific deprecation version
* Remove unnecessary do_sample
* Remove todo - the discrepancy has been resolved
* Update src/transformers/tokenization_utils_base.py
  Co-authored-by: amyeroberts <[email protected]>
* Update src/transformers/pipelines/text_generation.py
  Co-authored-by: amyeroberts <[email protected]>
---------
Co-authored-by: amyeroberts <[email protected]>
This PR modifies the text generation pipeline to support chats. It does this by inspecting the inputs - if they look like strings, it uses the original causal LM pipeline, and if they look like lists of message dicts, it applies a chat template instead before proceeding with generation.
Most changes are in the preprocessing/postprocessing - the actual generation itself is largely unchanged.
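The preprocessing branch described above can be sketched roughly as follows. This is a toy illustration, not the pipeline's code: `toy_chat_template` and `build_prompt` are hypothetical names, and the `<|role|>` markers are an invented format. In the real pipeline the rendering is done by the tokenizer's chat template, which differs per model.

```python
from typing import Dict, List, Union

Message = Dict[str, str]


def toy_chat_template(messages: List[Message], add_generation_prompt: bool = True) -> str:
    """Render messages into one prompt string using an invented format."""
    parts = [f"<|{m['role']}|>\n{m['content']}" for m in messages]
    if add_generation_prompt:
        # Leave an open assistant turn so the model writes the reply.
        parts.append("<|assistant|>\n")
    return "\n".join(parts)


def build_prompt(text_input: Union[str, List[Message]]) -> str:
    """Strings pass through unchanged; message lists go through the template."""
    if isinstance(text_input, str):
        return text_input
    return toy_chat_template(text_input)


plain_prompt = build_prompt("This is a test")
chat_prompt = build_prompt([
    {"role": "system", "content": "This is a system message."},
    {"role": "user", "content": "This is a test"},
])
print(plain_prompt)
print(chat_prompt)
```

Either way the result is a single prompt string, which is why the generation step itself needs little or no change.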
TODO:
- Add KV cache support, as this is important for performant multi-turn chat
- Deprecate ConversationalPipeline and update the chat template docs to refer to this instead?

cc @ArthurZucker @gante @LysandreJik