Stream ModelOutputs #29545
Conversation
NB: I think I've uncovered an issue in the contrastive decoding implementation. Currently, scores and (raw) logits are the same values, but should not be. I think this is because the scores are warped by logic encapsulated in the …
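(For context, the distinction at stake in the comment above: raw `logits` are the unprocessed model outputs, while `scores` are those values after logits processors/warpers have run, so with any non-trivial warper the two should differ. A stdlib-only toy sketch of that relationship, using an illustrative temperature warp rather than the actual decoding code:)

```python
# Toy illustration, not the transformers implementation: "logits" are the
# raw model outputs; "scores" are the same values after warping.

raw_logits = [2.0, 1.0, 0.5]

def temperature_warp(logits, temperature):
    """Minimal stand-in for a temperature logits warper."""
    return [value / temperature for value in logits]

scores = temperature_warp(raw_logits, temperature=0.5)

# If generation reported `scores` identical to `raw_logits`, the warping
# step would be invisible to the caller.
print(scores)                # [4.0, 2.0, 1.0]
print(scores == raw_logits)  # False
```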
Hi @dmarx 👋 Thank you for opening this PR! We're not accepting complex updates to streaming at the moment, as we have an ongoing design not far from being added (and it touches other non-streaming goals for …). In a nutshell, we are going to add the option to …
Nice! Looking forward to the change, fingers crossed users will be able to yield output dicts with scores and logits in addition to token_ids (i.e. the motivation for this PR) :) Any chance you could estimate an ETA for the …
Actually, better yet: @gante could you maybe point me to the issue or PR I should follow to keep tabs on the "yield from generate" updates?
@dmarx Here's the roadmap:
No trackers for 2. and 3. yet. I'd estimate 3 months super optimistically, 6 months if a flurry of new generation techniques and model modifications come out in the near future :)
@gante refactoring messy research code is sorta my jam (just ask poli). Let me know if I could help accelerate some of this roadmap; I'd be happy to help with (2) if you can elaborate on your design plan a bit more. Also, let me know if it would be helpful to re-articulate this PR's motivations as an issue for your user story tracking. I can just leave this PR open if you prefer.
@dmarx extra hands would indeed be handy (pun intended)! I'm going to chat with @zucchini-nlp soon (tomorrow?), so we can gather a set of requirements and then share publicly a concrete plan for the factorization of …. If I don't reply within a week, please don't hesitate to ping me -- I do want to take you up on your offer!
@gante @zucchini-nlp bump re: the `GenerationMixin.generate` refactoring roadmap and design plans
@dmarx not forgotten, we are finalizing gathering requirements internally across the different teams/repos :)
@gante just checking in.
Had kept this open for communication purposes; closing in favor of #30810
What does this PR do?
(NB: PR is still work in progress, but close to ready)
The "streaming" feature for text generation is currently constrained to streaming token ids. If a user requests an output dict, the stream still yields only token ids, without other attributes that may have been requested, such as scores, raw logits, or activations. Only after the stream has been consumed does the function return its final output, which will be the originally requested output dict.
This PR aligns the return type of the streamer with the requested return type by encapsulating the logic that determines how the return value is constructed. In so doing, this change also permits users to stream richer representations than just token ids. EDIT: Additionally, this exposes a useful probe for testing, as demonstrated with the discovery of #29551
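The proposed contract can be sketched with a framework-free analogue. The class below mirrors the idea behind the PR's `OutputIteratorStreamer` but is purely illustrative (plain Python, no transformers types):

```python
import queue


class DictIteratorStreamer:
    """Illustrative queue-backed streamer: yields whatever structure the
    generation loop puts in (e.g. dicts with scores), not just token ids."""

    _DONE = object()  # sentinel marking the end of generation

    def __init__(self):
        self._queue = queue.Queue()

    def put(self, value):
        # `value` may be a bare token id or a richer output dict; the
        # streamer no longer assumes a token-ids tensor.
        self._queue.put(value)

    def end(self):
        self._queue.put(self._DONE)

    def __iter__(self):
        while True:
            value = self._queue.get()
            if value is self._DONE:
                return
            yield value


# A stand-in generation loop streaming output dicts instead of bare ids.
streamer = DictIteratorStreamer()
for token_id, score in [(5, -0.1), (7, -0.3)]:
    streamer.put({"token_id": token_id, "score": score})
streamer.end()

chunks = list(streamer)
print(chunks[0])  # {'token_id': 5, 'score': -0.1}
```

In real use the producer (`generate`) would run in a separate thread while the consumer iterates, as with the existing `TextIteratorStreamer`.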
Summary of changes:
- Adds a `._prepare_output()` function to `GenerationMixin`, which encapsulates the output formatting logic
- `._prepare_output()` replaces the nested conditionals that previously built return values (beam samplers excluded)
- `Streamer.put()` receives its input from the `._prepare_output()` invocation rather than being restricted to a token IDs tensor
- Inputs are no longer sent to the CPU prior to being passed to `Streamer.put()`
- Adds `OutputStreamer` and `OutputIteratorStreamer` classes

Before submitting
- [ ] Did you read the contributor guideline, Pull Request section?
- [ ] `OutputIteratorStreamerTester` to be modularized with fixtures, DRYed
- [ ] `_prepare_output()` (? maybe it's fine as is?)
- [ ] revisit ECHO behavior? `output.score is None`
- [ ] `TextIteratorStreamer` using the new `OutputIteratorStreamer`
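To make the `_prepare_output()` encapsulation described above concrete, here is a minimal, self-contained sketch of the pattern: one helper decides the return shape for both the final return value and each streamed chunk. The dataclass and the helper's signature here are illustrative guesses, not the actual `GenerationMixin` code:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerateOutput:
    """Trimmed-down stand-in for the real generate output dataclasses."""
    sequences: list
    scores: Optional[list] = None
    logits: Optional[list] = None


def _prepare_output(sequences, scores=None, logits=None, return_dict=False):
    # Single place that decides the return shape, replacing the nested
    # conditionals previously duplicated across the sampling loops.
    if not return_dict:
        return sequences
    return GenerateOutput(sequences=sequences, scores=scores, logits=logits)


# Both the final return value and each Streamer.put() call can route
# through the same helper, so streamed chunks match the requested type.
plain = _prepare_output([1, 2, 3])
rich = _prepare_output([1, 2, 3], scores=[0.1], return_dict=True)
print(plain)        # [1, 2, 3]
print(rich.scores)  # [0.1]
```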
Who can review?
@gante