Transcribe
Get started with Amazon Transcribe on LocalStack
Introduction
Transcribe is a service provided by Amazon Web Services (AWS) that offers automatic speech recognition (ASR) capabilities. It enables developers to convert spoken language into written text, making it valuable for a wide range of applications, from transcription services to voice analytics.
LocalStack supports Transcribe via the Community offering, allowing you to use the Transcribe APIs for offline speech-to-text jobs in your local environment. The supported APIs are available on our API Coverage Page, which provides information on the extent of Transcribe integration with LocalStack.
Note
LocalStack’s Transcribe relies on Vosk, an offline speech-to-text toolkit. LocalStack therefore requires an internet connection the first time a transcription job is created for a given language, so that it can download and cache the language model.
Once the language model is cached, subsequent transcriptions for the same language can be performed offline. These language models typically have a size of around 50 MiB, and they are saved to the cache directory (for more details, refer to the Filesystem Layout section).
Getting Started
This guide is designed for users new to Transcribe and assumes basic knowledge of the AWS CLI and our awslocal wrapper script.
Start your LocalStack container using your preferred method. We will demonstrate how to create a transcription job and view the transcript in an S3 bucket using the AWS CLI.
Note
This service offers limited support for aarch64/Apple Silicon platforms.
If you encounter errors like cannot load library *.so, we recommend trying the AMD64 build of LocalStack as an alternative solution. Run the following command to pull the AMD64 build of LocalStack:
$ docker pull localstack/localstack:2.0.0 --platform amd64
Create an S3 bucket
You can create an S3 bucket using the mb command. Run the following commands to create a bucket named foo and upload a sample audio file named example.wav to it:
$ awslocal s3 mb s3://foo
$ awslocal s3 cp ~/example.wav s3://foo/example.wav
Create a transcription job
You can create a transcription job using the StartTranscriptionJob API. Run the following command to create a transcription job named example for the audio file example.wav:
$ awslocal transcribe start-transcription-job \
--transcription-job-name example \
--media MediaFileUri=s3://foo/example.wav \
[...]
$ jq '.results.transcripts[0].transcript' 7844aaa5.json
"it is just a question of getting rid of the illusion that we are separate from nature"
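The same lookup can also be scripted. The following is a minimal Python sketch, assuming the transcript file uses the results.transcripts[0].transcript layout queried with jq above. The function name extract_transcript and the inline sample document are illustrative only and are not part of LocalStack.

```python
import json


def extract_transcript(document: dict) -> str:
    """Return the first transcript string from a Transcribe output document."""
    transcripts = document.get("results", {}).get("transcripts", [])
    if not transcripts:
        raise ValueError("no transcripts found in document")
    return transcripts[0]["transcript"]


# Illustrative sample that mirrors the structure queried with jq;
# a real transcript file contains additional fields.
sample = json.loads(
    '{"results": {"transcripts": [{"transcript": "hello world"}]}}'
)
print(extract_transcript(sample))
```

To read a downloaded transcript instead of the sample, load the file with json.load and pass the resulting dictionary to extract_transcript.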
Examples
The following code snippets and sample applications provide practical examples of how to use Transcribe in LocalStack for various use cases:
Limitations
Currently, our Transcribe emulation supports only the media formats and languages listed below.
The following input media formats are supported:
- Adaptive Multi-Rate (AMR)
- Free Lossless Audio Codec (FLAC)
- MPEG-1 Audio Layer-3 (MP3)
- MPEG-4 Part 14 (MP4)
- OGG
- Matroska files (MKV)
- Waveform Audio File Format (WAV)
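A quick pre-flight check can catch unsupported files before a job is started. The sketch below is only a heuristic: it assumes each format in the list above maps to its usual file extension, and it inspects the file name rather than the audio content. The names SUPPORTED_EXTENSIONS and has_supported_extension are hypothetical helpers, not part of LocalStack or the AWS SDK.

```python
from pathlib import PurePosixPath

# Assumed mapping from the formats listed above to their usual extensions.
SUPPORTED_EXTENSIONS = frozenset(
    {".amr", ".flac", ".mp3", ".mp4", ".ogg", ".mkv", ".wav"}
)


def has_supported_extension(media_uri: str) -> bool:
    """Return True if the URI ends in one of the assumed extensions."""
    suffix = PurePosixPath(media_uri).suffix.lower()
    return suffix in SUPPORTED_EXTENSIONS


print(has_supported_extension("s3://foo/example.wav"))
print(has_supported_extension("s3://foo/notes.txt"))
```

A file with a matching extension can still fail if its contents are not valid audio, so treat this as an early sanity check only.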
Supported Languages
The following languages and dialects are supported:
Language | Language Code |
---|---|
German | de-DE |
English, British | en-GB |
English, Indian | en-IN |
English, US | en-US |
Spanish | es-ES |
Farsi | fa-IR |
French | fr-FR |
Hindi | hi-IN |
Italian | it-IT |
Japanese | ja-JP |
Dutch | nl-NL |
Portuguese | pt-BR |
Russian | ru-RU |
Turkish | tr-TR |
Vietnamese | vi-VN |
Chinese | zh-CN |
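Because only the language codes in the table above are available, it can help to validate a code before submitting a job. The following is a minimal Python sketch that copies the codes from the table. SUPPORTED_LANGUAGE_CODES and check_language_code are hypothetical names chosen for illustration, and the set must be updated by hand if the table changes.

```python
# Language codes copied from the table above.
SUPPORTED_LANGUAGE_CODES = frozenset({
    "de-DE", "en-GB", "en-IN", "en-US", "es-ES", "fa-IR", "fr-FR", "hi-IN",
    "it-IT", "ja-JP", "nl-NL", "pt-BR", "ru-RU", "tr-TR", "vi-VN", "zh-CN",
})


def check_language_code(code: str) -> str:
    """Return the code unchanged if it is listed, otherwise raise ValueError."""
    if code not in SUPPORTED_LANGUAGE_CODES:
        supported = ", ".join(sorted(SUPPORTED_LANGUAGE_CODES))
        raise ValueError(
            f"unsupported language code {code!r}; expected one of: {supported}"
        )
    return code


print(check_language_code("en-US"))
```

The check is case-sensitive on purpose, since the table lists the codes in the exact form region-tagged codes are normally written.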