This project uses the Automatic Speech Recognition (ASR) model OpenAI Whisper to create subtitles for talks and similar videos. Whisper correctly transcribes most words and sentences with the base model, but the Word Error Rate (WER) can be decreased with the larger (and more resource-hungry) models.
This tool can potentially take much of the workload out of transcribing subtitles; however, manual correction MUST still be performed afterwards to ensure accuracy.
An example of wrong word recognition with the base model is the word 'batch', which can be recognized as 'patch' in some cases. While this happens with the base and tiny models, it is not necessarily an issue with the larger ones. Read the OpenAI Whisper model card and the paper Robust Speech Recognition via Large-Scale Weak Supervision by Radford et al. for more information on transcription accuracy.
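A quick way to see this effect is to transcribe the same short clip with several model sizes and compare the output. A minimal sketch using the openai-whisper Python API (the clip path is a placeholder):

```python
import whisper

# Transcribe the same clip with increasingly large models and compare
# how easily confused words such as 'batch'/'patch' come out.
for name in ("tiny", "base", "medium"):
    model = whisper.load_model(name)
    result = model.transcribe("clip.wav")  # placeholder path
    print(f"{name}: {result['text']}")
```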
Fetch a talk from media.ccc.de to test the program out.
Performance has been tested on the 18-minute talk "This years badge" by Thomas Flummer from Bornhack 2022.
| Processor | Model | Transcription duration |
|---|---|---|
| 3 GHz CPU | base | 15 min 12 sec |
| Nvidia Tesla M60, 1 core | base | 1 min 36 sec |
| Nvidia Tesla M60, 1 core | medium | 7 min 11 sec |
| Nvidia RTX 3090 | tiny | 21 sec |
| Nvidia RTX 3090 | base | 35 sec |
| Nvidia RTX 3090 | small | 1 min 4 sec |
| Nvidia RTX 3090 | medium | 2 min 3 sec |
| Nvidia RTX 3090 | large | 2 min 53 sec |
| Nvidia RTX A4000 | tiny | 47 sec |
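The timings above can be reproduced by wrapping the transcription call in a timer. A minimal sketch, assuming the openai-whisper API and a placeholder input path (this is not code from the repository):

```python
import time
import whisper

model = whisper.load_model("base")  # swap in tiny/small/medium/large
start = time.perf_counter()
model.transcribe("talk.mp4")        # ffmpeg extracts the audio track
print(f"Transcription duration: {time.perf_counter() - start:.0f} sec")
```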
As noted in the OpenAI Whisper repository, the library should work with Python 3.7 and later.
Required dependencies are ffmpeg, a Python 3 version with the virtual environment package, the Python dependencies listed in the requirements.txt file, as well as Nvidia drivers for your GPU.
```
sudo apt update
sudo apt upgrade -y
sudo apt install ffmpeg python3.9 python3.9-venv
```
Nvidia drivers
Install GPU drivers. If OpenAI Whisper cannot find the drivers, it will fall back to the CPU for transcription, which takes significantly longer.
```
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-common ubuntu-drivers-common -y
sudo ubuntu-drivers devices
sudo ubuntu-drivers autoinstall
```
Install the following packages
```
sudo apt update
sudo apt install linux-headers-amd64 ffmpeg python3.11 python3.11-venv
```
See the following wiki article for Nvidia driver installation instructions.
Install the following packages
```
sudo pacman -Sy ffmpeg python python-virtualenv
```
More information about Nvidia drivers can be found on the Arch wiki.
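Whichever distribution you use, you can confirm that the drivers are visible to PyTorch (which Whisper uses for device selection) once the Python dependencies from the next step are installed:

```python
import torch

# Whisper transcribes on the GPU only if PyTorch can see one.
if torch.cuda.is_available():
    print("GPU found:", torch.cuda.get_device_name(0))
else:
    print("No GPU found; Whisper will fall back to the CPU.")
```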
Create a virtual environment and install dependencies. Look into the OpenAI Whisper setup if you encounter dependency errors.
```
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
pip install git+https://github.com/openai/whisper.git
```
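To verify the installation, you can list the model checkpoints the library knows about:

```python
import whisper

# Prints the available checkpoints, e.g. tiny, base, small, medium, large.
print(whisper.available_models())
```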
Enter the virtual environment and run:
```
source venv/bin/activate
python app.py --video <video_file> --model <whisper model>
```
Parameters:
```
usage: app.py [-h] [-v VIDEO] [-l LANGUAGE] [-m WHISPER_MODEL]

Create subtitle file from video.

options:
  -h, --help            show this help message and exit
  -v VIDEO, --video VIDEO
                        Video file to be processed
  -l LANGUAGE, --language LANGUAGE
                        Manually set transcription language
  -m WHISPER_MODEL, --model WHISPER_MODEL
                        Set OpenAI Whisper model
```
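For reference, an argparse setup along the following lines would produce the help text above; this is a sketch of what app.py plausibly does, not a copy of its code:

```python
import argparse

parser = argparse.ArgumentParser(description="Create subtitle file from video.")
parser.add_argument("-v", "--video", help="Video file to be processed")
parser.add_argument("-l", "--language", help="Manually set transcription language")
parser.add_argument("-m", "--model", dest="whisper_model", help="Set OpenAI Whisper model")
args = parser.parse_args()
```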
The sample below generates subtitles for a whole directory of videos with the large OpenAI Whisper model, and times the run:
```
time python app.py --video videos/ --model large
```
The program outputs an SRT file named <video_file>.srt in the same directory as the video file. You can use VLC or another media player to play the video with the subtitles.
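Whisper's transcribe() returns timed segments that map directly onto the SRT format, so writing the file mostly amounts to formatting timestamps. A minimal sketch of the idea (illustrative only, not necessarily how app.py does it):

```python
import whisper

def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

model = whisper.load_model("base")
result = model.transcribe("talk.mp4")  # placeholder path
with open("talk.mp4.srt", "w", encoding="utf-8") as f:
    for i, seg in enumerate(result["segments"], start=1):
        f.write(f"{i}\n{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n")
        f.write(f"{seg['text'].strip()}\n\n")
```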
Exit the virtual environment:
```
deactivate
```
Update the Whisper library:
```
pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
```
- OpenAI Whisper for their wonderful models
- Much inspiration has been drawn from Whisper-ASR-youtube-subtitles