Whisper Playground

Instantly build speech2text apps in 99 languages using OpenAI's Whisper

Contribution ideas

Stream audio over WebSockets instead of the current approach of incrementally sending audio chunks
Implement diarization (speaker identification) using pyannote-audio (example)
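
For contributors comparing the two transport approaches, here is a minimal sketch of what "incrementally sending audio chunks" can look like on the capture side. It is not taken from this repository: the function name `iter_chunks`, the one-second default, and the 16-bit mono PCM layout are assumptions made for illustration. The 16 kHz sample rate matches what Whisper models expect.

```python
from typing import Iterator

SAMPLE_RATE = 16_000   # Hz; Whisper models operate on 16 kHz audio
BYTES_PER_SAMPLE = 2   # assumption: 16-bit mono PCM


def iter_chunks(pcm: bytes, chunk_seconds: float = 1.0) -> Iterator[bytes]:
    """Yield consecutive fixed-duration slices of a raw PCM buffer.

    The final slice may be shorter than the others if the buffer length
    is not an exact multiple of the chunk size.
    """
    chunk_bytes = int(SAMPLE_RATE * chunk_seconds) * BYTES_PER_SAMPLE
    if chunk_bytes <= 0:
        raise ValueError("chunk_seconds must be positive")
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]
```

Each yielded slice could be posted to the backend as its own request, or written to a WebSocket connection if the streaming idea above is implemented.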

Setup

Whisper requires the command-line tool ffmpeg and the portaudio library to be installed on your system. Both are available from most package managers:

# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg
sudo apt install portaudio19-dev

# on Arch Linux
sudo pacman -S ffmpeg
sudo pacman -S portaudio

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg
brew install portaudio

# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg

# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg

Clone or fork this repository
Install the backend and frontend environment: sh install_playground.sh
Run the backend: cd backend && source venv/bin/activate && flask run --port 8000
In a different terminal, run the React frontend: cd interface && yarn start
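
If the backend fails to start, it can help to confirm that the system prerequisites listed above are visible to Python. The following is a minimal sketch using only the standard library; `check_prerequisites` is a hypothetical helper, not part of this repository. The portaudio lookup relies on `ctypes.util.find_library`, which is best-effort and may report the library as missing on some systems even when it is installed.

```python
import ctypes.util
import shutil


def check_prerequisites(tools=("ffmpeg",), libraries=("portaudio",)):
    """Return the names from `tools` and `libraries` that could not be found."""
    # Command-line tools are looked up on PATH.
    missing = [tool for tool in tools if shutil.which(tool) is None]
    # Shared libraries are looked up with the platform's usual search rules.
    missing += [lib for lib in libraries if ctypes.util.find_library(lib) is None]
    return missing


if __name__ == "__main__":
    missing = check_prerequisites()
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("ffmpeg and portaudio were both found.")
```

Run it with the same Python interpreter that the backend's virtual environment uses, so that the PATH it sees matches what `flask run` will see.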

License

This repository and the code and model weights of Whisper are released under the MIT License.
