Transcribe an audio file to Premiere Pro layers
A GUI tool that uses OpenAI's Whisper to transcribe an audio or video file into a Premiere Pro sequence, automating the creation of subtitles. It is mainly intended for adding quick subtitles to action-packed videos by splitting the transcript into segments with a small word count.
Outputs a .xml file, a sequence containing text layers (Essential Graphics), that can be imported into your Premiere Pro project.
Uses stable-ts regrouping functions to split the result into small, configurable segments.
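The regrouping idea can be illustrated without Whisper at all. The following is a minimal sketch, not the stable-ts implementation: it assumes word-level timestamps are already available, and the names `Word`, `Segment`, and `split_by_word_count` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Segment:
    text: str
    start: float
    end: float


def split_by_word_count(words: List[Word], max_words: int = 3) -> List[Segment]:
    """Group consecutive words into segments of at most max_words words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    segments = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        segments.append(Segment(
            text=" ".join(w.text for w in chunk),
            start=chunk[0].start,   # segment starts with its first word
            end=chunk[-1].end,      # and ends with its last word
        ))
    return segments


# Made-up timestamps, for illustration only.
words = [
    Word("watch", 0.0, 0.3), Word("out", 0.3, 0.5), Word("behind", 0.5, 0.9),
    Word("you", 0.9, 1.1), Word("right", 1.4, 1.7),
]
for seg in split_by_word_count(words, max_words=2):
    print(f"{seg.start:.1f}-{seg.end:.1f}: {seg.text}")
```

stable-ts offers richer regrouping, such as splitting on gaps or punctuation; this sketch only caps the word count per segment.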
git clone https://github.com/JorianWoltjer/AutoCaptions.git && cd AutoCaptions
python -m pip install -r requirements.txt
Make sure to install the GPU-enabled version of torch to make Whisper a lot faster:
python -m pip uninstall torch
python -m pip cache purge
python -m pip install torch -f https://download.pytorch.org/whl/torch_stable.html
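To confirm that the reinstall picked up a GPU build, a quick check along these lines may help. This is a sketch rather than part of the tool; the helper name `describe_torch_device` is hypothetical, and it only relies on the standard `torch.cuda` calls.

```python
def describe_torch_device() -> str:
    """Report whether the installed torch build can use a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        # Name of the first visible CUDA device.
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return f"CPU only (torch {torch.__version__})"


print(describe_torch_device())
```

If this reports "CPU only" on a machine with an NVIDIA GPU, the CPU wheel is probably still installed.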
FFmpeg is an external dependency of Whisper that needs to be installed separately:
On Windows, install Chocolatey, then run the following command:
choco install ffmpeg
On Debian or Ubuntu:
sudo apt update && sudo apt install ffmpeg
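Whichever platform you are on, you can check that ffmpeg ended up on your PATH before launching the tool. A small sketch using only the standard library (the function name `find_ffmpeg` is hypothetical):

```python
import shutil


def find_ffmpeg():
    """Return the full path of the ffmpeg executable, or None if not on PATH."""
    return shutil.which("ffmpeg")


path = find_ffmpeg()
if path is None:
    print("ffmpeg not found on PATH")
else:
    print(f"ffmpeg found at {path}")
```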
On Windows, simply create a shortcut to start.bat and use it to launch the tool. Alternatively, start it from a terminal:
$ python main.py
Start the batch script and select a file as input. Adjust the configuration options if needed, then transcribe the audio.
The resulting XML file can then be imported into a Premiere project, where you can use and edit the text layers it created.
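Before importing, it can be useful to confirm that the file is well-formed and to see roughly what it contains. The sketch below simply counts elements by tag name; the sample document and its tag names are made-up placeholders, not real AutoCaptions output, and `count_tags` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(root: ET.Element) -> Counter:
    """Count how often each tag occurs in the tree, including the root."""
    return Counter(el.tag for el in root.iter())


# Placeholder XML, for illustration only.
sample = "<sequence><name>Captions</name><track><clip/><clip/></track></sequence>"
counts = count_tags(ET.fromstring(sample))
for tag, n in sorted(counts.items()):
    print(f"{tag}: {n}")
```

For a real file, pass `ET.parse(path).getroot()` instead, where `path` points at the exported .xml; a parse error at that point means the file is not well-formed XML.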
Tip: To apply a style to all the text layers, you can create an Essential Graphics preset. Apply your settings to one of the layers, then save it as a preset. You can then drag the preset from your Project window onto all the layers you select.
For animation keyframes, save an Animation Preset instead: right-click the effect that holds your keyframes and save it as a preset. You can then drag it from Presets in your Effects window onto all the layers you select.