Releases: NotYuSheng/Transcribe-Translate
v1.3.2
Transcribe & Translate v1.1.0 - Feature Update
Release Date: 11 September 2024
Overview
This release, v1.1.0, introduces two major improvements: the ability to handle concurrent transcription and translation requests, and the removal of local file storage during the processing of uploaded media. These changes provide better performance and flexibility while maintaining the app’s core features for transcribing and translating media files.
Key Features
Concurrency
- The backend now supports concurrent processing of multiple requests, allowing the system to handle multiple transcriptions or translations simultaneously. This ensures faster response times and better scalability for users uploading multiple files or for high-traffic environments.
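One common way a FastAPI backend serves concurrent requests around a blocking model call is to offload that call to a thread pool so the event loop stays free. A minimal sketch under that assumption (the function names and the placeholder in place of the real Whisper call are illustrative, not the app's actual code):

```python
# Sketch: handling several transcription requests concurrently by running
# the CPU-bound model call in worker threads (hypothetical helper names).
import asyncio
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)  # caps simultaneous transcriptions

def transcribe_blocking(data: bytes) -> str:
    # Placeholder for the blocking Whisper call.
    return f"transcript of {len(data)} bytes"

async def transcribe_endpoint(data: bytes) -> str:
    loop = asyncio.get_running_loop()
    # The event loop keeps accepting requests while the model runs in a thread.
    return await loop.run_in_executor(executor, transcribe_blocking, data)

async def main():
    # Two uploads processed concurrently rather than one after the other.
    results = await asyncio.gather(
        transcribe_endpoint(b"x" * 10),
        transcribe_endpoint(b"x" * 20),
    )
    print(results)

asyncio.run(main())
```

The same pattern falls out for free when FastAPI route handlers are declared `def` (it runs them in its own thread pool); the explicit executor above just makes the mechanism visible.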
In-Memory File Processing
- Uploaded media files are now processed directly in memory, without being saved to the local filesystem. This improves processing speed and reduces disk usage, making the application more efficient and suitable for environments with limited storage.
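The idea can be sketched with an in-memory buffer standing in for a saved file; FastAPI's `UploadFile.read()` already yields the raw bytes, so nothing ever needs to touch the filesystem (`process_upload` is a hypothetical helper, not the app's code):

```python
# Sketch: processing an upload entirely in memory, with no temp file on disk.
import io

def process_upload(raw: bytes) -> dict:
    buffer = io.BytesIO(raw)           # in-memory stand-in for a saved file
    size = buffer.getbuffer().nbytes   # inspect the media without disk I/O
    # ...hand `buffer` to the transcription pipeline here...
    return {"bytes": size}

print(process_upload(b"fake mp3 data"))
```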
Download
This release improves the overall performance and efficiency of the Transcribe & Translate application, making it faster and more scalable.
What's Changed
- Dev by @NotYuSheng in #10
Full Changelog: v1.0.0...v1.10
v1.3.1
Full Changelog: v1.3.0...v1.3.1
v1.3.0
What's Changed
- Nginx by @NotYuSheng in #17
- Bump send and express in /frontend by @dependabot in #16
- Bump axios from 0.21.4 to 0.28.0 in /frontend by @dependabot in #15
- Bump rollup from 2.79.1 to 2.79.2 in /frontend by @dependabot in #14
Full Changelog: v1.2.0...v1.3.0
v1.2.0
What's Changed
- Bump body-parser and express in /frontend by @dependabot in #11
- Client fix2 by @NotYuSheng in #13
New Contributors
- @dependabot made their first contribution in #11
Full Changelog: v1.10...v1.2.0
Transcribe & Translate v1.0.0 - Initial Release
Release Date: 7 September 2024
Overview
This is the initial release of Transcribe & Translate, an open-source project that allows users to:
- Transcribe and translate audio and video files.
- Detect the language of uploaded media automatically.
- Export transcriptions and translations in multiple formats (TXT, JSON, SRT, VTT).
- View transcriptions and translations with timestamps.
This release marks v1.0.0, which includes support for Whisper models to provide high-quality transcriptions and translations, with an easy-to-use React frontend and a FastAPI backend. The application is fully containerized with Docker, making it easy to deploy.
Key Features
Transcription
- Supported Media Types: Audio (MP3, WAV) and Video (MP4, MKV, AVI).
- Model Selection: Choose from multiple preloaded Whisper models (`base`, `base.en`, `large`) to handle various transcription and translation needs.
- Automatic Language Detection: The app automatically detects the language of the media file if not specified by the user.
- Timestamps: Transcriptions are displayed with precise start and end timestamps.
Translation
- Multilingual Support: Translate media into multiple languages with Whisper's powerful translation capabilities.
- Automatic Source Language Detection: If no input language is provided, the app detects the source language automatically.
- Side-by-Side View: When translating, view both the original transcription and its translation side by side.
Export Options
- Export your transcription or translation into the following formats:
- TXT: Simple plain text format.
- JSON: Structured data with timestamps.
- SRT: Subtitle format with time codes.
- VTT: Web Video Text Tracks format for video captioning.
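For illustration, the SRT export could be derived from Whisper-style segments roughly as follows. The segment shape (`start`, `end`, `text`) matches what Whisper returns; `to_srt` itself is a hypothetical helper, not the app's actual exporter:

```python
# Sketch: converting timestamped segments to the SRT subtitle format.
def to_srt(segments):
    def ts(t):
        # SRT time codes look like HH:MM:SS,mmm
        h, rem = divmod(t, 3600)
        m, s = divmod(rem, 60)
        ms = round((s - int(s)) * 1000)
        return f"{int(h):02}:{int(m):02}:{int(s):02},{ms:03}"

    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(f"{i}\n{ts(seg['start'])} --> {ts(seg['end'])}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)

print(to_srt([{"start": 0.0, "end": 1.5, "text": "Hello world"}]))
```

The TXT, JSON, and VTT exports follow the same pattern with different framing (VTT uses `.` instead of `,` in the millisecond separator and a `WEBVTT` header).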
Loading Indicator
- Real-time feedback with loading animations during transcription or translation, along with an elapsed time display once the process completes.
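The elapsed-time display can be driven by a simple timer wrapped around the processing call; a minimal sketch (the `timed` helper is hypothetical, not part of the app):

```python
# Sketch: timing a processing call so the frontend can show elapsed time.
import time

def timed(fn, *args):
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start  # seconds, for the UI display
    return result, elapsed

result, seconds = timed(sum, [1, 2, 3])
print(result, round(seconds, 6))
```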
Dynamic Frontend
- The frontend dynamically loads available Whisper models from the backend.
- Provides media preview (video/audio) directly in the browser.
- User-friendly layout with responsive design for different screen sizes.
Dockerized for Easy Deployment
- The project is containerized with Docker, allowing for straightforward setup and deployment.
- Nginx is used to serve the frontend, and FastAPI for the backend.
Installation & Setup
Prerequisites
- Docker and Docker Compose installed.
Steps to Run the Project Locally
1. Clone the repository:

   ```shell
   git clone https://github.com/your-repo/transcribe-translate-app.git
   cd transcribe-translate-app
   ```

2. Build and start the Docker containers:

   ```shell
   docker-compose up --build
   ```

3. Access the app in your browser: http://localhost:3000

4. The backend API will run on: http://localhost:8000
Whisper Models
The app downloads and uses pre-trained Whisper models (such as `base`, `base.en`, and `large`) for transcription and translation. These models are stored in a Docker volume for persistent storage and efficient use.
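Reusing downloaded models efficiently comes down to a load-once cache: download and load a model the first time it is requested, then hand back the same instance afterwards. A minimal sketch of that pattern (the `get_model` helper and its stub loader are hypothetical; the real app would call `whisper.load_model`):

```python
# Sketch: load-once model cache so each Whisper model is loaded a single time.
_loaded = {}

def get_model(name, loader=lambda n: f"model:{n}"):
    if name not in _loaded:          # download/load only on the first request
        _loaded[name] = loader(name)
    return _loaded[name]             # subsequent calls reuse the same object

print(get_model("base"))
print(get_model("base"))  # served from the cache, no second load
```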
Known Issues
- Performance on Large Files: The application may take a while to process large media files, especially with the larger Whisper models.
- Model Download Time: On the first run, downloading the Whisper models can take a while depending on your internet connection.
Future Enhancements
- Additional Language Models: Adding more language models for extended support.
- Batch Processing: Implementing the ability to transcribe or translate multiple files at once.
- UI Improvements: Further improving the responsiveness and design of the frontend.
- More Export Formats: Adding support for additional export formats like CSV and PDF.
Contributors
- Ong Yu Sheng - Full Stack Developer
Acknowledgments
This project uses OpenAI's Whisper for transcription and translation services. We extend our gratitude to the open-source community for contributing to these fantastic tools.