Skip to content

zifeng-radxa/whisper-TPU_py

 
 

Repository files navigation

whisper-TPU python

Remark!!!

Trying this project, please use code on release branch and ask owner to get released bmodel.

Environment

The codebase is expected to be compatible with Python 3.8-3.11 and recent PyTorch versions. The codebase also depends on a few Python packages, most notably OpenAI's tiktoken for their fast tokenizer implementation. You can setup the environment with the following command:

pip install requirements.txt

It also requires the command-line tool ffmpeg to be installed on your system, which is available from most package managers:

# if you use a conda environment
conda install ffmpeg
 
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg 

You can install bmwhisper as follow:

python setup.py install

Command-line usage

CPU mode

The following command will transcribe speech in audio files using cpu, using the 'base' model:

bmwhisper demo.wav --model base

TPU mode

To use TPU, firstly, you need to generate the onnx model:

./gen_onnx.sh --model base

# if you want to use kvcache
./gen_onnx.sh --model base --use_kvcache

Then, transform onnx model to bmodel:

./gen_bmodel.sh --model base

# if you want to use kvcache
./gen_bmodel.sh --model base --use_kvcache

# if you want to compare the data when transforming and deploying
./gen_bmodel.sh --model base --compare

Use --inference button to allow TPU inference mode:

bmwhisper demo.wav --model base --inference

About

A whisper repo for TPU

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 96.7%
  • Shell 3.3%