[Whisper] TypeError: '<=' not supported between instances of 'NoneType' and 'float' #33552

felipehertzer · 2024-09-18T04:49:37Z

System Info

transformers version: 4.44.2
Platform: macOS-15.0-arm64-arm-64bit
Python version: 3.12.6
Huggingface_hub version: 0.24.7
Safetensors version: 0.4.5
Accelerate version: 0.34.2
Accelerate config: not found
PyTorch version (GPU?): 2.6.0.dev20240916 (False)
Tensorflow version (GPU?): not installed (NA)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using distributed or parallel set-up in script?: No

Who can help?

@kamilakesbi @ArthurZucker @itazap

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

Hi, I am attempting to transcribe several audio files; however, the process intermittently encounters an exception with some of the files. The transcription works successfully in approximately 90% of the cases, but certain files trigger this exception unexpectedly. I am attaching one of the audio files that generates this exception for your review. Thank you.

I was able to replicate it on macOS (CPU) and on Linux (CUDA).

1. Install stable-ts:
pip install stable-ts

2. Run the code:

import stable_whisper

model = stable_whisper.load_hf_whisper('medium')
result = model.transcribe(
    audio = 'radio_18596_1726554951_1726554981.mp3',
)
print(result.text)

Audio sample: https://filebin.net/hivqswoer298m65m

Then I receive the following exception:

Traceback (most recent call last):
  File "/tests/test.py", line 4, in <module>
    result = model.transcribe(
             ^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.12/site-packages/stable_whisper/whisper_word_level/hf_whisper.py", line 236, in transcribe
    return transcribe_any(
           ^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.12/site-packages/stable_whisper/non_whisper.py", line 342, in transcribe_any
    result = inference_func(**inference_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.12/site-packages/stable_whisper/whisper_word_level/hf_whisper.py", line 116, in _inner_transcribe
    output = self._pipe(audio, **pipe_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.12/site-packages/transformers/pipelines/automatic_speech_recognition.py", line 284, in __call__
    return super().__call__(inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.12/site-packages/transformers/pipelines/base.py", line 1255, in __call__
    return next(
           ^^^^^
  File "/.venv/lib/python3.12/site-packages/transformers/pipelines/pt_utils.py", line 125, in __next__
    processed = self.infer(item, **self.params)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.12/site-packages/transformers/pipelines/automatic_speech_recognition.py", line 587, in postprocess
    text, optional = self.tokenizer._decode_asr(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.12/site-packages/transformers/models/whisper/tokenization_whisper.py", line 835, in _decode_asr
    return _decode_asr(
           ^^^^^^^^^^^^
  File "/.venv/lib/python3.12/site-packages/transformers/models/whisper/tokenization_whisper.py", line 1086, in _decode_asr
    resolved_tokens, resolved_token_timestamps = _find_longest_common_sequence(
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.12/site-packages/transformers/models/whisper/tokenization_whisper.py", line 1193, in _find_longest_common_sequence
    matches = sum(
              ^^^^
  File "/.venv/lib/python3.12/site-packages/transformers/models/whisper/tokenization_whisper.py", line 1198, in <genexpr>
    and left_token_timestamp_sequence[left_start + idx]
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '<=' not supported between instances of 'NoneType' and 'float'

Expected behavior

To be able to transcribe the audio files without this exception.
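Since only some files trigger the exception, it can help to separate the failing files from the rest before reporting them. The following is a minimal sketch, not part of stable-ts or transformers: `transcribe_all` and `fake_transcribe` are hypothetical names, and the stand-in transcriber only imitates the failure so the example runs without a model.

```python
def transcribe_all(paths, transcribe):
    """Run `transcribe` on every path; collect TypeErrors instead of aborting."""
    results, failures = {}, {}
    for path in paths:
        try:
            results[path] = transcribe(path)
        except TypeError as exc:
            # Record the message so failing files can be attached to a bug report.
            failures[path] = str(exc)
    return results, failures


def fake_transcribe(path):
    """Stand-in for a real transcriber; fails on one file to imitate the bug."""
    if path == "bad.mp3":
        raise TypeError("'<=' not supported between instances of 'NoneType' and 'float'")
    return f"text of {path}"


results, failures = transcribe_all(["a.mp3", "bad.mp3", "b.mp3"], fake_transcribe)
print(sorted(results))   # ['a.mp3', 'b.mp3']
print(sorted(failures))  # ['bad.mp3']
```

With the model from the reproduction above, the callable could be something like `lambda p: model.transcribe(audio=p).text`.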


itazap · 2024-09-18T12:48:16Z

Thanks for raising, looks like the below indeed does happen:

transformers/src/transformers/models/whisper/tokenization_whisper.py

Lines 1058 to 1061 in db72894

            if i + 1 < len(token_timestamps):
                end_time = round(token_timestamps[i + 1] + time_offset, 2)
            else:
                end_time = None  # should never happen

since this loops over tokens and the last index + 1 will be out of range:

transformers/src/transformers/models/whisper/tokenization_whisper.py

Line 971 in db72894

for i, token in enumerate(token_ids):

cc @eustlb @ylacombe wdyt about how the last timestamp should be handled ?
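The mechanism described in this comment can be illustrated in isolation. The sketch below is a simplified stand-in for the quoted branch, not the library code: `token_end_times` is a hypothetical helper and the timestamps are made up. It shows that the last token always takes the `else` path, and that comparing the resulting `None` with a float raises the error from the title.

```python
def token_end_times(token_timestamps, time_offset=0.0):
    """Each token's end time is the next token's timestamp, as in the quoted branch."""
    end_times = []
    for i in range(len(token_timestamps)):
        if i + 1 < len(token_timestamps):
            end_time = round(token_timestamps[i + 1] + time_offset, 2)
        else:
            end_time = None  # last token: there is no entry at i + 1
        end_times.append(end_time)
    return end_times


end_times = token_end_times([0.0, 0.42, 0.98])
print(end_times)  # [0.42, 0.98, None]

try:
    end_times[-1] <= 1.5  # None compared with a float
    message = ""
except TypeError as exc:
    message = str(exc)
print(message)  # '<=' not supported between instances of 'NoneType' and 'float'
```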

moodpanda · 2024-09-24T05:18:45Z

I'm experiencing this as well on stable whisper.

aklacar1 · 2024-09-27T08:17:51Z

Any news here? Will it be fixed anytime soon? Or is there a version where this is not a problem?

itazap · 2024-09-30T14:28:08Z

Hey, working on a fix but having trouble consistently reproducing this (sometimes breaks sometimes not) 🤔 Are you experiencing the same? @aklacar1 @felipehertzer ?

aklacar1 · 2024-09-30T15:15:35Z

@itazap Yes, I am. It happens on one of my videos but not on the others, and I have no idea why one works and another does not. The one that fails is almost 20 minutes long.

NOTE: To make sure I am giving you correct info, I am rerunning it now. Just waiting for Docker build to finish.

aklacar1 · 2024-09-30T15:42:14Z

This time I used:

torch==2.0.0
torchvision==0.15
torchaudio==2.0.1
transformers==4.45.1
stable-ts[hf]==2.17.4

Last time I believe I had a 2.4.x version of Torch; this time it's 2.0.0. However, I get the same result:

  File "/function/stable_whisper/whisper_word_level/hf_whisper.py", line 236, in transcribe
    return transcribe_any(
  File "/function/stable_whisper/non_whisper.py", line 342, in transcribe_any
    result = inference_func(**inference_kwargs)
  File "/function/stable_whisper/whisper_word_level/hf_whisper.py", line 116, in _inner_transcribe
    output = self._pipe(audio, **pipe_kwargs)
  File "/function/transformers/pipelines/automatic_speech_recognition.py", line 284, in __call__
    return super().__call__(inputs, **kwargs)
  File "/function/transformers/pipelines/base.py", line 1260, in __call__
    return next(
  File "/function/transformers/pipelines/pt_utils.py", line 125, in __next__
    processed = self.infer(item, **self.params)
  File "/function/transformers/pipelines/automatic_speech_recognition.py", line 598, in postprocess
    text, optional = self.tokenizer._decode_asr(
  File "/function/transformers/models/whisper/tokenization_whisper.py", line 835, in _decode_asr
    return _decode_asr(
  File "/function/transformers/models/whisper/tokenization_whisper.py", line 1034, in _decode_asr
    resolved_tokens, resolved_token_timestamps = _find_longest_common_sequence(
  File "/function/transformers/models/whisper/tokenization_whisper.py", line 1193, in _find_longest_common_sequence
    matches = sum(
  File "/function/transformers/models/whisper/tokenization_whisper.py", line 1198, in <genexpr>
    and left_token_timestamp_sequence[left_start + idx]
TypeError: '<=' not supported between instances of 'NoneType' and 'float'

felipehertzer · 2024-10-01T06:35:00Z

Hi @itazap,

Thank you for looking into the issue. I am able to reproduce it using the following audio file and stable-ts code:

Audio: https://file.io/wi9gbaf1GMvt

ArthurZucker · 2024-10-03T13:20:08Z

Does #33625 fix your issue? 🤗

itazap · 2024-10-05T10:53:11Z

@felipehertzer Apologies for the delay, but I believe the link has expired. Can you please reshare the file? 🙏

felipehertzer · 2024-10-05T11:12:14Z

Hi @itazap, I reuploaded the file: https://drive.google.com/file/d/1oqqnVygU7-8pviRS1CmZ6R86wqv68Uf5/view?usp=sharing

Thanks.

itazap · 2024-10-09T11:11:22Z

@felipehertzer Thanks! I am not very familiar with the Whisper model, but I think this is related to stable-ts not adding a special token to end the text. You can try this branch: #33625, which should address the issue for now! 😊
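For illustration only, one way to make a timestamp comparison tolerate a missing value is to treat `None` as "no match". This is a sketch under that assumption and is not necessarily what #33625 does. `timestamps_ordered` and `count_matches` are hypothetical names, and the counting loop is a simplified version of the `matches = sum(...)` expression seen in the traceback.

```python
def timestamps_ordered(left_ts, right_ts):
    """Return True only when both timestamps are known and left <= right."""
    if left_ts is None or right_ts is None:
        return False  # assumption: a missing timestamp never counts as a match
    return left_ts <= right_ts


def count_matches(left_tokens, right_tokens, left_times, right_times):
    """Count positions where tokens agree and timestamps are in order."""
    return sum(
        1
        for l_tok, r_tok, l_ts, r_ts in zip(left_tokens, right_tokens, left_times, right_times)
        if l_tok == r_tok and timestamps_ordered(l_ts, r_ts)
    )


# The last left timestamp is None, as in the failing case; no exception is raised.
n = count_matches([7, 8, 9], [7, 8, 9], [0.1, 0.2, None], [0.1, 0.3, 0.4])
print(n)  # 2
```

Whether a missing timestamp should count as a match, be skipped, or be replaced with a fallback value is a design decision for the maintainers; the sketch only shows that the comparison itself can be guarded.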

felipehertzer · 2024-10-14T04:36:39Z

Hi @itazap, sorry for the delay. I just tested it and can confirm that it has fixed the issue. Thank you.

github-actions · 2024-11-07T08:05:32Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

eustlb · 2024-12-18T15:59:03Z

Reopening until a solution is merged into main

eustlb · 2024-12-18T16:04:14Z

@felipehertzer Do you still have the audio snippet? The link has expired.

felipehertzer · 2024-12-18T22:11:50Z

Hey @eustlb, here is the updated link. Thanks.

https://drive.google.com/file/d/1BNUV7K8XMYCRC-YE_6QJ4PpmktgTIuNS/view?usp=sharing

felipehertzer added the bug label Sep 18, 2024

LysandreJik added the Core: Tokenization and Audio labels Sep 21, 2024

itazap mentioned this issue Sep 27, 2024

#33512 handle last element out of range error #33625

Open

github-actions bot closed this as completed Nov 15, 2024

eustlb reopened this Dec 18, 2024
