[Whisper] TypeError: '<=' not supported between instances of 'NoneType' and 'float' #33552
Comments
Thanks for raising, looks like the below indeed does happen: transformers/src/transformers/models/whisper/tokenization_whisper.py Lines 1058 to 1061 in db72894
since this loops over tokens and the last index + 1 will be out of range:
cc @eustlb @ylacombe wdyt about how the last timestamp should be handled?
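The pattern described above, a loop over tokens that peeks at index + 1, can be sketched as follows. This is only an illustration of the off-by-one and one possible guard, not the code from `tokenization_whisper.py`; the function name and the `TIMESTAMP_BEGIN` value are hypothetical.

```python
from typing import List

# Hypothetical threshold: ids at or above this value stand for timestamp tokens.
TIMESTAMP_BEGIN = 100


def count_consecutive_timestamp_pairs(token_ids: List[int]) -> int:
    """Count positions where a timestamp token is directly followed by another one."""
    count = 0
    for i, token in enumerate(token_ids):
        # Guard the lookahead: on the last token, i + 1 is past the end of the list.
        has_next = i + 1 < len(token_ids)
        if token >= TIMESTAMP_BEGIN and has_next and token_ids[i + 1] >= TIMESTAMP_BEGIN:
            count += 1
    return count
```

Without the `has_next` check, a sequence that ends on a timestamp token would raise an `IndexError` on the final iteration. How the last timestamp should actually be treated is the open question in this thread.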
Any news here? Will it be fixed anytime soon? Or is there a version where this is not a problem?
Hey, working on a fix but having trouble consistently reproducing this (sometimes it breaks, sometimes not) 🤔 Are you experiencing the same? @aklacar1 @felipehertzer
@itazap Yes I am, however it happens on one of my videos but not on the others. I have no idea why one works and another does not. The one that does not work is almost 20 minutes long. NOTE: To make sure I am giving you correct info, I am rerunning it now. Just waiting for the Docker build to finish.
This time I used: torch==2.0.0. Last time I believe I had a 2.4.x version of Torch. However, I get the same results: File "/function/stable_whisper/whisper_word_level/hf_whisper.py", line 236, in transcribe
Hi @itazap, thank you for looking into the issue. I am able to reproduce it using the following audio file and stable-ts code: Audio: https://file.io/wi9gbaf1GMvt
Does #33625 fix your issue? 🤗
@felipehertzer apologies for the delay, but I believe the link has expired. Can you please reshare the file? 🙏
Hi @itazap, I reuploaded the file: https://drive.google.com/file/d/1oqqnVygU7-8pviRS1CmZ6R86wqv68Uf5/view?usp=sharing Thanks.
@felipehertzer Thanks! Okay, I am not super familiar with the Whisper model, but I think it has to do with stable-ts not adding a special token to end the text. You can try this branch: #33625, and it should address the issue for now! 😊
Hi @itazap, sorry for the delay. I just tested it and I can confirm that it has fixed the issue. Thank you.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Reopening until a solution is merged into main |
@felipehertzer do you still have the audio snippet? The link has expired.
Hey @eustlb, here is the updated link. Thanks. https://drive.google.com/file/d/1BNUV7K8XMYCRC-YE_6QJ4PpmktgTIuNS/view?usp=sharing
System Info
transformers version: 4.44.2
Who can help?
@kamilakesbi @ArthurZucker @itazap
Reproduction
Hi, I am attempting to transcribe several audio files; however, the process intermittently encounters an exception with some of the files. The transcription works successfully in approximately 90% of the cases, but certain files trigger this exception unexpectedly. I am attaching one of the audio files that generates this exception for your review. Thank you.
1. Install stable-ts:
pip install stable-ts
2. Run the code:
Audio sample: https://filebin.net/hivqswoer298m65m
Then I receive the following exception:
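The message in the issue title is what Python raises when a value that is `None` is compared with a float. A minimal sketch in plain Python, assuming the `None` is a missing end timestamp; the helper `ends_before` is hypothetical and is not part of transformers or stable-ts:

```python
from typing import Optional


def ends_before(end: Optional[float], limit: float) -> bool:
    """Return True if a segment ends at or before `limit`.

    A missing end timestamp (None) is treated as "unknown", so the answer is False.
    """
    if end is None:
        return False
    return end <= limit


# Reproduce the comparison failure itself, without any Whisper code involved.
missing_end = None
try:
    missing_end <= 1.5
except TypeError as exc:
    message = str(exc)
    print(message)
```

Checking for `None` before comparing avoids the exception, but it only hides the symptom; whether the final segment should receive a real end time is a separate decision.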
Expected behavior
To be able to transcribe the audio files without this exception.