[Whisper] 🚨 Fix whisper decoding 🚨 #34135

Merged on Dec 18, 2024 (75 commits)
Commits
4c7be8c
do not remove decoder_input_ids for the first segment
eustlb Oct 13, 2024
805c688
do not remove eos token in generate_with_fallback
eustlb Oct 13, 2024
271014f
when removing padding tokens, do not remove eos token
eustlb Oct 13, 2024
d9a91b3
remove eos token in generate (and not in generate_with_fallback!)
eustlb Oct 13, 2024
b9d0ec1
reconcile short-form/long-form behavior
eustlb Oct 13, 2024
f9d5fdd
correct avg_logprobs calculation
eustlb Oct 13, 2024
f111aa3
handle eos token in segments
eustlb Oct 13, 2024
34d690a
handle decoder_input_ids and eos token in _prepare_decoder_input_ids
eustlb Oct 13, 2024
16f768b
fix incorrect time precision
eustlb Oct 13, 2024
cf18e0a
always remove eos token
eustlb Oct 18, 2024
ec7cd58
always remove decoder_input_ids
eustlb Oct 18, 2024
1530930
no need to handle decoder_inputs_ids and eos token
eustlb Oct 18, 2024
67865dd
no need to remove decoder_input_ids
eustlb Oct 18, 2024
eb107d9
no need to handle eos token
eustlb Oct 18, 2024
7881928
fix num_beams in _retrieve_logit_processors
eustlb Oct 24, 2024
031ace6
remove todo inconsistency
eustlb Nov 4, 2024
eaaeec6
no need to add eos token
eustlb Nov 4, 2024
cdd5144
last_timestamp_pos should indeed be timestamp token pos
eustlb Nov 19, 2024
544f21b
patch generate to enable compatibility with GenerationTesterMixin tests
eustlb Nov 20, 2024
70ffdb3
adapt test_generate_continue_from_past_key_values
eustlb Nov 20, 2024
79347e9
adapt test_prompt_lookup_decoding_matches_greedy_search
eustlb Nov 20, 2024
c01ef12
Merge branch 'main' into fix-whisper-decoding
eustlb Nov 22, 2024
d03f87c
Merge branch 'main' into fix-whisper-decoding
eustlb Nov 22, 2024
1387919
adapt generic GenerationMixin tests to whisper's generate
eustlb Nov 27, 2024
ad262c6
fix speculative decoding
eustlb Nov 27, 2024
a4c2b44
fix
eustlb Nov 27, 2024
17877f3
Merge branch 'main' into fix-whisper-decoding
eustlb Nov 27, 2024
7b2046a
Merge branch 'main' into fix-whisper-decoding
eustlb Nov 27, 2024
c9a209f
Merge branch 'main' into fix-whisper-decoding
eustlb Dec 5, 2024
34d5257
[run-slow] whisper
Dec 5, 2024
c760d09
change HF_HUB_TOKEN for require_read_token
eustlb Dec 5, 2024
e7c4b85
[run-slow] whisper
eustlb Dec 5, 2024
40bcdcc
Merge branch 'main' into fix-whisper-decoding
eustlb Dec 5, 2024
783c2f1
prioritize kwargs over generation_config
eustlb Dec 5, 2024
88d2d38
remove unnecessary args
eustlb Dec 5, 2024
3a8f739
[run-slow] whisper
eustlb Dec 5, 2024
93cf1b7
update tests
eustlb Dec 5, 2024
2afb05c
Merge branch 'main' into fix-whisper-decoding
eustlb Dec 5, 2024
ecbe56b
[run-slow] whisper
eustlb Dec 5, 2024
c9b36c3
add comment
eustlb Dec 6, 2024
14c280a
update test
eustlb Dec 6, 2024
78ed67e
Merge branch 'main' into fix-whisper-decoding
eustlb Dec 6, 2024
34adc99
[run-slow] whisper
eustlb Dec 6, 2024
58f0526
update test + revert require_read_token
eustlb Dec 6, 2024
367006d
docstring updates
eustlb Dec 6, 2024
2f19263
revert tokenizer decode args change
eustlb Dec 8, 2024
9c6fe55
Merge branch 'main' into fix-whisper-decoding
eustlb Dec 8, 2024
3ebd12f
do not use a patch + docstring updates
eustlb Dec 10, 2024
b9ba08a
Merge branch 'main' into fix-whisper-decoding
eustlb Dec 10, 2024
24d9399
[run-slow] whisper
eustlb Dec 10, 2024
9fe3011
make
eustlb Dec 10, 2024
3aae6b8
[run-slow] whisper
eustlb Dec 10, 2024
0ea240f
add a flag to force unique call to generate
eustlb Dec 11, 2024
0960a52
test update
eustlb Dec 11, 2024
0870ac7
[run-slow] whisper
eustlb Dec 11, 2024
00d37e8
add force_unique_generate_call arg
eustlb Dec 11, 2024
cf75ea3
do not use a patch
eustlb Dec 11, 2024
2cb638d
correct the timestamps for the pad tokens
eustlb Dec 11, 2024
66290cb
docstring update
eustlb Dec 11, 2024
b41e2c1
docstring update
eustlb Dec 11, 2024
221bae1
docstring update
eustlb Dec 11, 2024
98dc7f1
Merge branch 'main' into fix-whisper-decoding
eustlb Dec 11, 2024
0b6687c
update TF tests
eustlb Dec 11, 2024
ab910f7
add require_read_token
eustlb Dec 11, 2024
692bf14
[run-slow] whisper
eustlb Dec 11, 2024
a15aa4a
test reset dynamo
eustlb Dec 12, 2024
0011cec
Merge branch 'main' into fix-whisper-decoding
eustlb Dec 12, 2024
41df6ca
[run-slow] whisper
ydshieh Dec 12, 2024
aaecea4
fix
ydshieh Dec 12, 2024
58d0b90
[run-slow] whisper
ydshieh Dec 12, 2024
a1f4e43
avoid iterating twice on current_segments
eustlb Dec 18, 2024
671b079
[run-slow] whisper
eustlb Dec 18, 2024
3d21ed8
Merge branch 'main' into fix-whisper-decoding
eustlb Dec 18, 2024
dc6fbd1
[run-slow] whisper
eustlb Dec 18, 2024
8b7a2e8
Merge branch 'main' into fix-whisper-decoding
eustlb Dec 18, 2024