Enrich TTS pipeline parameters naming #26473
The documentation is not available anymore as the PR was closed or merged.
```python
...     "max_new_tokens": 35,
... }

>>> outputs = music_generator("This is a test", forward_params=forward_params)
```
Do you think it would help for consistency to allow the pipeline to accept either `forward_params` or `generate_kwargs`? If `forward_params`, then it gets passed to `.forward` or `.generate` (whichever one the model does). If `generate_kwargs`, it only gets passed if the model generates with `.generate`. IMO this would help with consistency across pipelines.
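A minimal sketch of the routing rule proposed above, using toy stand-in models and a hypothetical `can_generate` attribute (names are illustrative, not the real pipeline code):

```python
# Route forward_params to whichever call the model supports, but only pass
# generate_kwargs when the model actually generates; error out otherwise.
def dispatch(model, model_inputs, forward_params=None, generate_kwargs=None):
    forward_params = forward_params or {}
    generate_kwargs = generate_kwargs or {}
    if getattr(model, "can_generate", False):
        # forward_params reach .generate too; generate_kwargs only apply here
        return model.generate(**model_inputs, **forward_params, **generate_kwargs)
    if generate_kwargs:
        raise ValueError(
            "`generate_kwargs` were passed but the model does not generate; "
            "use `forward_params` for forward-only models."
        )
    return model.forward(**model_inputs, **forward_params)
```

Under this rule, `forward_params` behaves the same for both model families, while `generate_kwargs` is strictly generation-only.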
I suppose this would make sense in terms of consistency, but not in terms of simplicity, as it might be confusing for users to have two dictionaries doing the same thing. Anyway, I'm happy with your solution, since users can still refer to the documentation in case of confusion.
WDYT @ArthurZucker and @Narsil ?
I'll open a PR once we agree on that!
More aligned with sanchit here: generation params should be passed to `generate_kwargs` for sure, and not as `forward_params`; it's confusing and not consistent with our API.
Gentle ping here @ylacombe @sanchit-gandhi?
Hey @Vaibhavs10, @sanchit-gandhi and @ArthurZucker, I've followed your advice and made it compatible. Let me know your opinion on it!
Nice! LGTM, thanks for iterating @ylacombe! Once we get the final review from @ArthurZucker we can merge
```diff
 else:
-    output = self.model(**model_inputs, **kwargs)[0]
+    output = self.model(**model_inputs, **forward_params)[0]
```
Should we error out if `generate_kwargs` are passed but the model doesn't generate? We could then use the error message to nudge the user towards `forward_params` when using a forward-only model. WDYT?
```python
# for reproducibility
set_seed(555)
# make sure nothing is done if generate_kwargs passed since not related
```
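The role of `set_seed(555)` in the test above can be illustrated in miniature: sampling-based generation is stochastic, so tests fix the seed to make outputs reproducible. This sketch uses Python's `random` module as a stand-in (the real `transformers.set_seed` also seeds NumPy and PyTorch):

```python
import random

def set_seed(seed: int) -> None:
    # stand-in for transformers.set_seed; seeds only Python's `random` here
    random.seed(seed)

set_seed(555)
first = [random.randint(0, 9) for _ in range(5)]

set_seed(555)
second = [random.randint(0, 9) for _ in range(5)]

assert first == second  # identical draws after re-seeding
```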
Think it would be nice to raise an error here?
The MusicGen TTA doctest is currently timing out (> 120s). Given we already set a low generation max length (35 tokens), I don't think we can really reduce the time for this test much further. Do you think it makes sense to switch to using the VITS model in the doctest, since it'll run in <10s?
Co-authored-by: Sanchit Gandhi <[email protected]>
Thanks for the review here @sanchit-gandhi! I don't mind throwing an error here.
I don't really have a strong opinion, so good for me here
```python
with self.assertRaises(TypeError):
    # assert error if generate parameter
    outputs = speech_generator("This is a test", forward_params={"speaker_id": 5, "do_sample": True})
```
Is there any better example of forward params? `speaker_id` seems to me like it could be part of a generation config (but it's also not a tensor, and the PR seems to say that we could have tensors here).
Unfortunately I can only think of VITS, and `speaker_id` is basically the only optional parameter of its forward pass. BTW, tensor params are tested by `test_conversion_additional_tensor`.
Co-authored-by: Arthur <[email protected]> Co-authored-by: Sanchit Gandhi <[email protected]>
Thanks for the reviews here @sanchit-gandhi and @ArthurZucker, I have updated according to your comments, and will merge once the checks are done!
Looks like the doc tests might still be too slow: https://app.circleci.com/pipelines/github/huggingface/transformers/75458/workflows/bbc191e3-5c49-410e-a720-307229af6d37/jobs/957271 Should we use a different checkpoint? #26473 (comment)
Hey @sanchit-gandhi, this slipped my mind, looking at this in about an hour ;)
No timing out anymore, just by syncing the branch with main! Is it okay to merge in that case @amyeroberts and @sanchit-gandhi?
If tests are passing and you have core maintainer approval (from @ArthurZucker here) you're good to go! |
* enrich TTS pipeline docstring for clearer forward_params use
* change token lengths
* update Pipeline parameters
* correct docstring and make style
* fix tests
* make style
* change music prompt

Co-authored-by: Sanchit Gandhi <[email protected]>

* Apply suggestions from code review

Co-authored-by: Arthur <[email protected]>
Co-authored-by: Sanchit Gandhi <[email protected]>

* raise errors if generate_kwargs with forward-only models
* make style

Co-authored-by: Sanchit Gandhi <[email protected]>
Co-authored-by: Arthur <[email protected]>
What does this PR do?
#26369 highlighted that the use of `forward_params` in the TTS pipeline was not clear enough. This PR enriches the docstrings a bit to correct this oversight. Note that following a discussion with @Vaibhavs10, @sanchit-gandhi and @Narsil in the same issue, I came to the conclusion that:

* exposing `max_new_tokens` would add confusion to the pipeline usage, since generative and non-generative models coexist
* renaming `forward_params` to `generate_params` would be equally as confusing

As @Narsil noted, a user who would like to use advanced parameter controls would probably use the classic transformers usage (tokenizer + model + etc.).
Of course, I'm open to discussion and modification of current behavior if this problem recurs in the future.
Fixes #26369
Before submitting
Who can review?
Hey @osanseviero, @Narsil and @ArthurZucker! Do you feel like this resolves the issue for now?