Enrich TTS pipeline parameters naming #26473

ylacombe · 2023-09-28T15:34:31Z

What does this PR do?

#26369 highlighted that the use of forward_params in the TTS pipeline was not clear enough. This PR enriches the docstrings a bit to correct this oversight.

Note that following a discussion with @Vaibhavs10, @sanchit-gandhi and @Narsil in the same issue, I came to the conclusion that:

adding max_new_tokens would add confusion to the pipeline usage, since generative and non-generative models coexist
in the same manner, I believe that renaming forward_params to generate_params would be equally confusing

As @Narsil noted, a user who would like to use advanced parameter controls would probably use the classic transformers usage (tokenizer + model + etc).

Of course, I'm open to discussion and modification of current behavior if this problem recurs in the future.

Fixes #26369

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).

Who can review?

Hey @osanseviero, @Narsil and @ArthurZucker! Do you feel like this resolves the issue for now?

HuggingFaceDocBuilderDev · 2023-09-28T15:52:26Z

The documentation is not available anymore as the PR was closed or merged.

sanchit-gandhi · 2023-09-29T17:10:47Z

src/transformers/pipelines/text_to_audio.py

+    ...     "max_new_tokens": 35,
+    ... }
+
+    >>> outputs = music_generator("This is a test", forward_params=forward_params)


Do you think it would help consistency to allow the pipeline to accept either forward_params or generate_kwargs? If forward_params, then it gets passed to .forward or .generate (whichever one the model uses). If generate_kwargs, it only gets passed if the model generates with .generate. IMO this would help with consistency across pipelines.

ylacombe · Oct 2, 2023

I suppose this would make sense in terms of consistency, but not in terms of simplicity, as it might be confusing for users to have two dictionaries doing the same thing. Anyway, I'm happy with your solution, since users can still refer to the documentation in case of confusion.

WDYT @ArthurZucker and @Narsil ?

I'll open a PR once we agree on that!

ArthurZucker · Oct 2, 2023

I'm more aligned with Sanchit here: generation params should be passed to generate_kwargs for sure, and not as forward_params. It's confusing and not consistent with our API.

Vaibhavs10 · 2023-10-11T07:32:40Z

Gentle ping here @ylacombe @sanchit-gandhi ?
I'd like to promote more complex operations and the ability to play around with pipelines for TTA/S. Would be cool to be able to do it via generate_kwargs to showcase similarity across various pipeline usage.

ylacombe · 2023-10-11T15:06:45Z

Hey @Vaibhavs10, @sanchit-gandhi and @ArthurZucker, I've followed your advice and made it compatible with generate_kwargs if the model is generative. forward_params is still usable for both types of models, but generate_kwargs has priority over forward_params if usable.

Let me know your opinion on it!
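
As a rough illustration of the precedence described in this comment, here is a minimal sketch. The function name `resolve_call_kwargs` and the `can_generate` flag are hypothetical stand-ins, not the pipeline's actual code; the only behavior taken from the comment is that generate_kwargs wins over forward_params when the model generates.

```python
from typing import Any, Dict, Optional


def resolve_call_kwargs(
    can_generate: bool,
    forward_params: Optional[Dict[str, Any]] = None,
    generate_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: merge the two dictionaries into one set of call kwargs."""
    # Copy so the caller's dictionaries are never mutated.
    merged = dict(forward_params or {})
    if can_generate:
        # On key collisions, generate_kwargs overrides forward_params.
        merged.update(generate_kwargs or {})
    # For forward-only models, generate_kwargs is simply left out in this sketch.
    return merged


generative = resolve_call_kwargs(
    can_generate=True,
    forward_params={"max_new_tokens": 35, "do_sample": False},
    generate_kwargs={"max_new_tokens": 10},
)
forward_only = resolve_call_kwargs(
    can_generate=False,
    forward_params={"speaker_id": 5},
    generate_kwargs={"max_new_tokens": 10},
)
print(generative)
print(forward_only)
```

In this sketch the colliding key `max_new_tokens` takes the value from generate_kwargs, while keys that appear only in forward_params are passed through unchanged.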

sanchit-gandhi

Nice! LGTM, thanks for iterating @ylacombe! Once we get the final review from @ArthurZucker we can merge

sanchit-gandhi · 2023-10-12T12:07:37Z

src/transformers/pipelines/text_to_audio.py

        else:
-            output = self.model(**model_inputs, **kwargs)[0]
+            output = self.model(**model_inputs, **forward_params)[0]


Should we error out if generate_kwargs are passed but the model doesn't generate? We could then nudge the user to use forward_params if using a forward-only model in the error message. WDYT?
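
A minimal sketch of what such a check could look like. `check_generate_kwargs` is a hypothetical helper, and both the exception type (`ValueError`) and the message wording are assumptions made for illustration, not taken from the PR.

```python
from typing import Any, Dict, Optional


def check_generate_kwargs(can_generate: bool, generate_kwargs: Optional[Dict[str, Any]]) -> None:
    """Hypothetical guard: reject generate_kwargs for models that only run a forward pass."""
    if generate_kwargs and not can_generate:
        raise ValueError(
            "generate_kwargs was passed, but this model does not generate. "
            "Use forward_params instead. "
            f"Unexpected keys: {sorted(generate_kwargs)}"
        )


# A generative model accepts generate_kwargs without complaint.
check_generate_kwargs(can_generate=True, generate_kwargs={"max_new_tokens": 35})

# A forward-only model triggers the error, with a nudge toward forward_params.
try:
    check_generate_kwargs(can_generate=False, generate_kwargs={"max_new_tokens": 35})
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)
```

Failing loudly here, rather than silently ignoring the dictionary, tells the user right away that their generation settings had no effect.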

sanchit-gandhi · 2023-10-12T12:08:32Z

tests/pipelines/test_pipelines_text_to_audio.py

+
+        # for reproducibility
+        set_seed(555)
+        # make sure nothing is done if generate_kwargs passed since not related


Think it would be nice to raise an error here?

sanchit-gandhi · 2023-10-12T12:12:28Z

The MusicGen TTA doctest is currently timing out (> 120s). Given we already set a low generation max length (35 tokens), I don't think we can really reduce the time for this test much further. Do you think it makes sense to switch to using the VITS model on the doctest, since it'll run in <10s?

ylacombe

Thanks for the review here @sanchit-gandhi ! I don't mind throwing an error here

ArthurZucker

I don't really have a strong opinion, so good for me here

ArthurZucker · 2023-10-16T06:53:01Z

tests/pipelines/test_pipelines_text_to_audio.py

+
+        with self.assertRaises(TypeError):
+            # assert error if generate parameter
+            outputs = speech_generator("This is a test", forward_params={"speaker_id": 5, "do_sample": True})


Is there any better example of forward params?
It seems to me that speaker_id could be part of a generation config (but it is also not a tensor, and the PR seems to say that we could have tensors here).

ylacombe · Oct 16, 2023

Unfortunately, I can only think of VITS, and speaker_id is basically the only optional parameter of its forward pass. BTW, tensor params are tested by test_conversion_additional_tensor.
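
One plausible reading of why the test in this thread expects a `TypeError`: when extra keys in forward_params are unpacked into a forward method that does not declare them, Python itself rejects the call. The sketch below uses a toy `toy_forward` function (hypothetical, not the VITS forward signature) to show that mechanism.

```python
from typing import List, Optional


def toy_forward(input_ids: List[int], speaker_id: Optional[int] = None) -> dict:
    """Toy stand-in for a forward pass whose only optional argument is speaker_id."""
    return {"input_ids": input_ids, "speaker_id": speaker_id}


# A declared optional argument is accepted.
ok = toy_forward([1, 2, 3], **{"speaker_id": 5})

# An undeclared, generation-style argument is rejected with TypeError.
try:
    toy_forward([1, 2, 3], **{"speaker_id": 5, "do_sample": True})
    raised = False
except TypeError:
    raised = True
print(ok, raised)
```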

ylacombe · 2023-10-16T08:41:38Z

Thanks for the reviews here @sanchit-gandhi and @ArthurZucker, I have updated according to your comments, and will merge once the checks are done!

sanchit-gandhi · 2023-10-23T17:53:19Z

Looks like the doc tests might be too slow still: https://app.circleci.com/pipelines/github/huggingface/transformers/75458/workflows/bbc191e3-5c49-410e-a720-307229af6d37/jobs/957271

Should we use a different checkpoint? #26473 (comment)

ylacombe · 2023-11-02T14:53:24Z

Hey @sanchit-gandhi, this slipped my mind. I'll look at it in about an hour ;)

ylacombe · 2023-11-02T16:05:03Z

It no longer times out, just from syncing the branch with main! Is it okay to merge in that case, @amyeroberts and @sanchit-gandhi?

amyeroberts · 2023-11-02T16:57:15Z

If tests are passing and you have core maintainer approval (from @ArthurZucker here) you're good to go!

* enrich TTS pipeline docstring for clearer forward_params use
* change token leghts
* update Pipeline parameters
* correct docstring and make style
* fix tests
* make style
* change music prompt
Co-authored-by: Sanchit Gandhi <[email protected]>
* Apply suggestions from code review
Co-authored-by: Arthur <[email protected]>
Co-authored-by: Sanchit Gandhi <[email protected]>
* raise errors if generate_kwargs with forward-only models
* make style
---------
Co-authored-by: Sanchit Gandhi <[email protected]>
Co-authored-by: Arthur <[email protected]>

enrich TTS pipeline docstring for clearer forward_params use

d9f4091

change token leghts

042212f

sanchit-gandhi reviewed Sep 29, 2023

ylacombe and others added 4 commits October 11, 2023 15:48

Merge branch 'huggingface:main' into enrich-TTS-pipeline-docstrings

56a4cff

update Pipeline parameters

ea1201d

correct docstring and make style

b070be3

fix tests

69fcb71

ylacombe changed the title ~~Enrich TTS pipeline docstring for clearer forward_params usage~~ Enrich TTS pipeline parameters naming Oct 11, 2023

make style

afb105c

sanchit-gandhi approved these changes Oct 12, 2023

sanchit-gandhi reviewed Oct 12, 2023

change music prompt

8bc939c

Co-authored-by: Sanchit Gandhi <[email protected]>

ylacombe commented Oct 13, 2023

ArthurZucker approved these changes Oct 16, 2023

ylacombe and others added 3 commits October 16, 2023 10:23

Apply suggestions from code review

a6311f9

Co-authored-by: Arthur <[email protected]> Co-authored-by: Sanchit Gandhi <[email protected]>

raise errors if generate_kwargs with forward-only models

6f6de5f

make style

4ef5ff4

Merge branch 'huggingface:main' into enrich-TTS-pipeline-docstrings

3986132

ylacombe merged commit 0ed6729 into huggingface:main Nov 2, 2023
3 checks passed

ylacombe deleted the enrich-TTS-pipeline-docstrings branch November 2, 2023 17:07
