Excessive Hypothesis Words in Non-English Audios #206

Open
rk-helper opened this issue Sep 26, 2024 · 1 comment
Labels
triaged This issue has been looked at and prioritized by a maintainer

Comments

@rk-helper

Hello,

I’m seeing a problem when transcribing non-English streams, particularly Russian: the number of hypothesis words generated is disproportionately large compared to the number of confirmed words. The hypothesis word count can reach up to half of the confirmed word count, which significantly hurts the accuracy and readability of the transcription. English streams do not show this problem; there, hypothesis words stay in reasonable proportion to confirmed words.
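To make the proportion above concrete, here is a minimal sketch of how the ratio could be measured. The function name `hypothesis_ratio` and the sample word lists are made up for illustration; they are not taken from WhisperKit or from the video.

```python
def hypothesis_ratio(confirmed: list[str], hypothesis: list[str]) -> float:
    """Return the hypothesis word count divided by the confirmed word count.

    Hypothetical helper for illustration only; not part of any WhisperKit API.
    """
    if not confirmed:
        # Nothing confirmed yet: the ratio is unbounded if any hypothesis exists.
        return float("inf") if hypothesis else 0.0
    return len(hypothesis) / len(confirmed)


# Illustrative sample text, not real transcription output.
confirmed = "привет это тестовая запись для проверки".split()  # 6 words
hypothesis = "работы потоковой транскрипции".split()            # 3 words

ratio = hypothesis_ratio(confirmed, hypothesis)
print(f"hypothesis/confirmed ratio: {ratio:.2f}")
```

A ratio near 0.5, as in this sample, corresponds to the "up to half" case described above.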

Environment:

• Model: whisper-large-v3 turbo (958 MB)
• Device: MacBook Pro M3 Max (36 GB RAM)

Video of the issue in WhisperKit: https://youtu.be/JWEHgKwogG8

atiorh self-assigned this on Oct 16, 2024
atiorh added the "triaged" label (This issue has been looked at and prioritized by a maintainer) on Oct 16, 2024
@atiorh
Contributor

atiorh commented Oct 16, 2024

I have reproduced this in other languages as well. Fixing it requires an algorithmic improvement to the Eager Streaming Mode so that it can break out of diverging hypotheses. We are investigating a fix for this!
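For readers unfamiliar with the failure mode, the following is a rough sketch of one way a streaming transcriber could confirm words and break out of diverging hypotheses. It assumes confirmation works by agreement between consecutive decoding passes (a word is confirmed once two passes in a row agree on it); this thread does not state that the Eager Streaming Mode works this way. The class `StreamingConfirmer`, the parameter `max_stalled_passes`, and the force-commit fallback are all hypothetical and are not the fix under investigation.

```python
def common_prefix_len(a: list[str], b: list[str]) -> int:
    """Number of leading words on which two word lists agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class StreamingConfirmer:
    """Hypothetical sketch; not WhisperKit's implementation."""

    def __init__(self, max_stalled_passes: int = 3):
        self.confirmed: list[str] = []
        self.prev_tail: list[str] = []  # unconfirmed words from the previous pass
        self.stalled = 0                # consecutive passes with no agreement
        self.max_stalled_passes = max_stalled_passes

    def update(self, tail: list[str]) -> tuple[list[str], list[str]]:
        """Feed the words decoded after the confirmed text in the current pass.

        Returns (confirmed words so far, remaining hypothesis words).
        """
        agreed = common_prefix_len(self.prev_tail, tail)
        if agreed > 0:
            # Two consecutive passes agree on a prefix: confirm it.
            self.confirmed.extend(tail[:agreed])
            tail = tail[agreed:]
            self.stalled = 0
        elif self.prev_tail and tail:
            # The passes disagree from the first word: hypotheses are diverging.
            self.stalled += 1

        if self.stalled >= self.max_stalled_passes and tail:
            # Assumed fallback: commit one word from the latest pass so the
            # hypothesis cannot keep growing without any confirmation.
            self.confirmed.append(tail[0])
            tail = tail[1:]
            self.stalled = 0

        self.prev_tail = tail
        return list(self.confirmed), tail


confirmer = StreamingConfirmer(max_stalled_passes=2)
for words in (["a", "b"], ["a", "b", "c"], ["x", "y"], ["q", "r"]):
    confirmed_words, hypothesis_words = confirmer.update(words)
    print(confirmed_words, hypothesis_words)
```

Without the stall counter, the last two passes in this example would never confirm anything, and the hypothesis would keep being replaced while the confirmed text stands still. Forcing a commit trades some accuracy for progress; other fallbacks, such as discarding the stale hypothesis and re-decoding from the last confirmed word, are equally plausible.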
