Add flush_flag to listener to flush the recorded audio immediately without waiting for the phrase to complete. #761

Open
wants to merge 1 commit into
base: master


sreekanthputta

I am working on a real-time speech-to-text application and have run into a latency problem.
When the user finishes talking, the recognizer keeps listening until pause_threshold seconds of silence have elapsed. This gets even worse in noisy environments with dynamic_energy_threshold turned off, because background noise above the fixed energy_threshold keeps being treated as speech.
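
To put a rough number on that wait, here is a minimal sketch. It assumes the listener ends a phrase only after a run of consecutive quiet buffers that covers pause_threshold. The sample rate, chunk size, and threshold below are illustrative assumptions, not values taken from this PR.

```python
import math

# Assumed values for illustration; real ones depend on the microphone and settings.
SAMPLE_RATE = 16000    # samples per second (assumption)
CHUNK = 1024           # samples per audio buffer (assumption)
PAUSE_THRESHOLD = 0.8  # seconds of silence that end a phrase (assumption)

seconds_per_buffer = CHUNK / SAMPLE_RATE

# Consecutive quiet buffers needed before the phrase counts as complete.
pause_buffer_count = math.ceil(PAUSE_THRESHOLD / seconds_per_buffer)

# Minimum extra wait after the user stops talking, in seconds.
min_wait = pause_buffer_count * seconds_per_buffer

print(pause_buffer_count, round(min_wait, 3))
```

Under these assumptions the user waits at least 13 buffers (about 0.83 s) after the last word, and any buffer above the energy threshold restarts that count.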

My users don't want to wait as they know that they are done talking. They want to be able to hit enter and reduce the time taken to show them the transcription.

This is just one example of where this could be helpful. I'm sure this feature can be useful in many ways.

I have tried the stopper function returned by listen_in_background, but it can take up to a second to stop and does not flush the audio recorded so far.
Also, the stopper won't stop the recorder while audio is actively being recorded, i.e. while energy > energy_threshold.

Hence this change.
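
The idea can be sketched with a simulated recording loop. This is not the code in the commit: collect_phrase and noisy_room are hypothetical helpers, and each audio buffer is represented only by its energy value. It shows how a flush check inside the loop ends the phrase even when noise never drops below the threshold.

```python
from typing import Iterable, Iterator, List, Tuple


def collect_phrase(
    energies: Iterable[float],
    energy_threshold: float,
    pause_buffer_count: int,
    flush_flag: List[bool],
) -> Tuple[List[float], str]:
    """Collect buffers until a long enough pause or a flush request.

    Hypothetical stand-in for the recording loop; each item in `energies`
    represents one audio buffer by its energy level.
    """
    collected: List[float] = []
    quiet_run = 0
    for energy in energies:
        collected.append(energy)
        if flush_flag[0]:
            flush_flag[0] = False  # reset so the next phrase records normally
            return collected, "flushed"
        if energy > energy_threshold:
            quiet_run = 0  # loud buffer: speech or background noise
        else:
            quiet_run += 1
            if quiet_run >= pause_buffer_count:
                return collected, "pause"
    return collected, "exhausted"


def noisy_room(flag: List[bool], flush_at: int) -> Iterator[float]:
    """Yield constant above-threshold noise; raise the flag at buffer `flush_at`."""
    for i in range(100):
        if i == flush_at:
            flag[0] = True  # simulates the user pressing Enter
        yield 500.0


flag = [False]
buffers, reason = collect_phrase(noisy_room(flag, flush_at=5), 300.0, 13, flag)
print(len(buffers), reason, flag)
```

Without the flush request, the same noisy input would run through all 100 simulated buffers, which is the situation phrase_time_limit is otherwise left to handle.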

How to use?

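self.flush_flag = [False]  # one-element list shared with the listener thread
self.recorder.listen_in_background(self.source, self.record_callback, phrase_time_limit=self.record_timeout, flush_flag=self.flush_flag)

def onEnter(self):
    # Mutate the shared list in place; rebinding self.flush_flag would leave the listener holding the old list.
    self.flush_flag[0] = True  # the listener resets this to False once the audio is flushed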
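
A note on the one-element list: the listener can only see a flush request if both sides share the same list object. The sketch below uses a hypothetical Listener class (not part of the library or this commit) to show the difference between rebinding the name and mutating the list in place. It also shows threading.Event as one possible cleaner alternative, offered as a suggestion rather than what the PR implements.

```python
import threading


class Listener:
    """Hypothetical stand-in for the background listener; it keeps the flag object it was given."""

    def __init__(self, flush_flag):
        self.flush_flag = flush_flag

    def should_flush(self):
        return self.flush_flag[0]


flag = [False]
listener = Listener(flag)
original = flag  # keep a handle on the shared list for the second half of the demo

# Rebinding: `flag` now names a brand-new list; the listener still holds the old one.
flag = [True]
seen_after_rebind = listener.should_flush()

# In-place mutation of the shared list is visible to the listener.
flag = original
flag[0] = True
seen_after_mutation = listener.should_flush()

print(seen_after_rebind, seen_after_mutation)  # False True

# Possible alternative: threading.Event gives the same shared state with explicit calls.
flush_event = threading.Event()
flush_event.set()                 # caller requests a flush
requested = flush_event.is_set()  # listener checks the request
flush_event.clear()               # listener acknowledges after flushing
```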

Please feel free to modify the logic to make it cleaner and more robust.
TIA.

@ftnext
Collaborator

ftnext commented Jul 30, 2024

Thanks.
Is this the same feature request as #757?

@sreekanthputta
Author

Not really.
#757 is about streaming buffers as they are recorded.
My change lets the speaker stop the recording immediately once they are done speaking, either by clicking the transcribe button in my UI or by releasing the mic button they have held since they started speaking.

2 participants