VideoLingo: Connecting the World, Frame by Frame

🌟 Overview

VideoLingo is an all-in-one video translation, localization, and dubbing tool aimed at generating Netflix-quality subtitles. It replaces stiff machine translation and multi-line subtitles with natural single-line subtitles and adds high-quality dubbing, so knowledge can be shared across language barriers. With an intuitive Streamlit interface, you can turn a video link into a localized video with high-quality bilingual subtitles and dubbing in just a few clicks.

Key features:

🎥 YouTube video download via yt-dlp
🎙️ Word-level subtitle recognition with WhisperX
📝 NLP and GPT-based subtitle segmentation
📚 GPT-generated terminology for coherent translation
🔄 2-step translation process rivaling professional quality
✅ Netflix-standard single-line subtitles only
🗣️ Dubbing alignment (e.g., GPT-SoVITS)
🚀 One-click startup and output in Streamlit
📝 Detailed logging with progress resumption
🌐 Comprehensive multi-language support

Difference from similar projects: Single-line subtitles only, superior translation quality
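To make the single-line constraint concrete, here is a minimal sketch of packing word-level timestamps into single-line cues. It is not VideoLingo's actual segmentation, which relies on NLP and GPT. The `Word` and `Cue` types, the `split_into_single_lines` function, and the 42-character default are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Cue:
    text: str
    start: float
    end: float


def _to_cue(words: List[Word]) -> Cue:
    # A cue spans from the first word's start to the last word's end.
    return Cue(" ".join(w.text for w in words), words[0].start, words[-1].end)


def split_into_single_lines(words: List[Word], max_chars: int = 42) -> List[Cue]:
    """Greedily pack words into cues of at most max_chars characters.

    max_chars=42 is an assumed per-line limit, not a value taken from
    VideoLingo. A single word longer than max_chars becomes its own cue.
    """
    cues: List[Cue] = []
    current: List[Word] = []
    for word in words:
        candidate = " ".join(w.text for w in current + [word])
        if current and len(candidate) > max_chars:
            # Adding this word would overflow the line: close the cue.
            cues.append(_to_cue(current))
            current = [word]
        else:
            current.append(word)
    if current:
        cues.append(_to_cue(current))
    return cues
```

A greedy split like this ignores sentence boundaries and meaning, which is the gap that NLP- and GPT-based segmentation is meant to close.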

🎥 Demo

Russian Translation

ru_demo.mp4

GPT-SoVITS

sovits.mp4

OAITTS

OAITTS.mp4

Language Support:

Current input language support and examples:

Input Language	Support Level	Translation Demo
English	🤩	English to Chinese
Russian	😊	Russian to Chinese
French	🤩	French to Japanese
German	🤩	German to Chinese
Italian	🤩	Italian to Chinese
Spanish	🤩	Spanish to Chinese
Japanese	😐	Japanese to Chinese
Chinese*	🤩	Chinese to English

*Chinese requires a separately configured whisperX model and is only available with the local source code installation. See the installation documentation for the configuration process, and be sure to set the transcription language to zh in the webpage sidebar.

Translation language support depends on the capabilities of the large language model used, while dubbing language depends on the chosen TTS method.

🚀 Quick Start

Online Experience

Experience VideoLingo quickly in Colab in just 5 minutes:

Local Installation

VideoLingo offers two local installation methods: One-click Simple Package and Source Code Installation. Please refer to the installation documentation: English | 简体中文

Docker Installation

VideoLingo provides a Dockerfile for Docker installation. Please refer to the installation documentation: English | 简体中文

🏭 Batch Mode

Usage instructions: English | 简体中文

⚠️ Current Limitations

UVR5 voice separation is resource-intensive and slow. It is recommended only on devices with more than 16GB of RAM and 8GB of VRAM. Note: for videos with loud background music, skipping voice separation before Whisper transcription may cause word-level subtitles to stick together, which leads to errors in the final alignment step.
Dubbing quality may be imperfect because source and target languages differ in structure and in morpheme information density. For best results, choose a TTS whose speech rate matches the pace and content of the original video. The best practice is to train GPT-SoVITS on the original video's voice, then use "Mode 3: Use every reference audio" for dubbing. This gives the most consistent voice, speech rate, and tone. See the demo for the result.
Transcription of multilingual videos retains only the main language. whisperX uses a single-language model for forced word-level alignment, so speech in other languages is dropped.
Separate dubbing for multiple speakers is currently unavailable. whisperX has VAD potential, but the feature needs dedicated development and is not yet implemented.
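The speech-rate mismatch described in the dubbing limitation above can be illustrated with a small calculation. This is a hedged sketch, not VideoLingo's alignment code: the function name `fit_speed_factor`, its parameters, and the 1.3x cap are hypothetical and chosen only for illustration.

```python
def fit_speed_factor(tts_duration: float, slot_duration: float,
                     max_speedup: float = 1.3) -> float:
    """Return the playback speed needed to fit a dubbed clip into its slot.

    tts_duration  -- length of the synthesized audio, in seconds
    slot_duration -- time available for the subtitle line, in seconds
    max_speedup   -- assumed upper bound beyond which speech sounds rushed

    The result is 1.0 when the clip already fits, and is capped at
    max_speedup when the clip is too long to fit naturally.
    """
    if tts_duration <= 0 or slot_duration <= 0:
        raise ValueError("durations must be positive")
    factor = tts_duration / slot_duration
    # Never slow the audio down (floor at 1.0) and never exceed the cap.
    return min(max(factor, 1.0), max_speedup)
```

When the required factor hits the cap, the dubbed line will still overrun its slot, which is one reason a TTS voice with a speech rate close to the original matters.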

🚗 Roadmap

VAD to distinguish speakers, multi-character dubbing
Customizable translation styles
User terminology glossary
Provide commercial services
Lip sync for dubbed videos

📄 License

This project is licensed under the Apache 2.0 License. When using this project, please follow these rules:

When publishing works, it is recommended (not mandatory) to credit VideoLingo for subtitle generation.
Follow the terms of the large language models and TTS used for proper attribution.
If you copy the code, please include a full copy of the Apache 2.0 License.

We sincerely thank the following open-source projects for their contributions, which provided important support for the development of VideoLingo:

📬 Contact Us

Join our Discord: https://discord.gg/JrBKhSDb
Submit Issues or Pull Requests on GitHub
Follow me on Twitter: @Huanshere
Visit the official website: videolingo.io
Email me at: [email protected]

⭐ Star History

If you find VideoLingo helpful, please give us a ⭐️!
