improve transcribe docs
HarshCasper committed Sep 27, 2023
1 parent 9f24f80 commit f2f465f
Showing 1 changed file with 58 additions and 38 deletions.
96 changes: 58 additions & 38 deletions content/en/user-guide/aws/transcribe/index.md
@@ -1,39 +1,49 @@
---
title: "Transcribe"
linkTitle: "Transcribe"
description: >
Get started with Amazon Transcribe on LocalStack
description: Get started with Amazon Transcribe on LocalStack
---

## Introduction

LocalStack supports Transcribe via the Community offering, allowing you to use the Transcribe APIs in your local environment. The supported APIs are available on our [API Coverage Page](https://docs.localstack.cloud/references/coverage/coverage_transcribe/), which provides information on the extent of Transcribe integration with LocalStack.
Transcribe is a service provided by Amazon Web Services (AWS) that offers automatic speech recognition (ASR) capabilities. It enables developers to convert spoken language into written text, making it valuable for a wide range of applications, from transcription services to voice analytics.

LocalStack's Transcribe builds on the offline speech-to-text service [Vosk](https://alphacephei.com/vosk/). Therefore, LocalStack requires an internet connection the first time a transcription job is created for a given language to download and cache the model.
Subsequent transcriptions for the same language can be done offline.
Language models are around 50 MiB each and saved to the cache directory (see [Filesystem Layout]({{< ref "filesystem" >}})).
LocalStack supports Transcribe via the Community offering, allowing you to use the Transcribe APIs for offline speech-to-text jobs in your local environment. The supported APIs are available on our [API Coverage Page](https://docs.localstack.cloud/references/coverage/coverage_transcribe/), which provides information on the extent of Transcribe integration with LocalStack.

{{< alert title="Note">}}
LocalStack's Transcribe relies on the offline speech-to-text service called [Vosk](https://alphacephei.com/vosk/). Therefore, LocalStack requires an internet connection during the initial creation of a transcription job for a specific language. This initial connection is required to download and cache the language model.

Once the language model is cached, subsequent transcriptions for the same language can be performed offline. These language models typically have a size of around 50 MiB, and they are saved to the cache directory (for more details, refer to the [Filesystem Layout]({{< ref "filesystem" >}}) section).
{{< /alert >}}

## Getting Started

This guide is designed for users new to Transcribe and assumes basic knowledge of the AWS CLI and our [`awslocal`](https://github.com/localstack/awscli-local) wrapper script.

Start your LocalStack container using your preferred method. We will demonstrate how to create a transcription job and view the transcript in an S3 bucket using the AWS CLI.

{{< alert title="Note" >}}
This service has limited support for aarch64/Apple Silicon.
This service offers limited support for aarch64/Apple Silicon platforms.

If you encounter errors like `cannot load library *.so`, we recommend trying the AMD64 build of LocalStack as an alternative solution. Run the following command to pull the AMD64 build of LocalStack:

If you encounter `cannot load library *.so` errors, please try the AMD64 build of LocalStack:
{{< command >}}
$ docker pull localstack/localstack:2.0.0 --platform amd64
{{< /command >}}
{{< /alert >}}

### Create an S3 bucket

## Getting Started

Create an S3 bucket and upload the audio file:
You can create an S3 bucket using the [`mb`](https://docs.aws.amazon.com/cli/latest/reference/s3/mb.html) command. Run the following commands to create a bucket named `foo` and upload a sample audio file named `example.wav` to it:

{{< command >}}
$ awslocal s3 mb s3://foo

$ awslocal s3 cp ~/example.wav s3://foo/example.wav
{{< / command >}}

Create the transcription job:
### Create a transcription job

You can create a transcription job using the [`StartTranscriptionJob`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_StartTranscriptionJob.html) API. Run the following command to create a transcription job named `example` for the audio file `example.wav`:

{{< command >}}
$ awslocal transcribe start-transcription-job \
@@ -42,10 +52,11 @@ $ awslocal transcribe start-transcription-job \
--language-code en-IN
{{< / command >}}

Jobs can be listed like so:
You can list the transcription jobs using the [`ListTranscriptionJobs`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_ListTranscriptionJobs.html) API. Run the following command to list the transcription jobs:

{{< command >}}
$ awslocal transcribe list-transcription-jobs
<disable-copy>
{
"TranscriptionJobSummaries": [
{
@@ -57,12 +68,16 @@ $ awslocal transcribe list-transcription-jobs
}
]
}
</disable-copy>
{{< / command >}}
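
Since the transcript only becomes available once the job has finished, it can help to poll the job status before fetching it. The following is a minimal sketch, not part of LocalStack or the AWS CLI: `wait_for_transcription_job` is a hypothetical helper, and the default of 30 polls spaced 2 seconds apart is an assumption. It relies on the standard `--query` and `--output` options of the AWS CLI and on the documented `TranscriptionJobStatus` values `COMPLETED` and `FAILED`.

```shell
# wait_for_transcription_job: hypothetical helper (not shipped with awslocal).
# Usage: wait_for_transcription_job <job-name> [max-attempts] [delay-seconds]
# Prints the final status; returns 0 on COMPLETED, 1 on FAILED,
# 2 if the CLI call itself fails, 3 if the job never reached a final state.
wait_for_transcription_job() {
  local job_name="$1"
  local max_attempts="${2:-30}"   # assumed default
  local delay_seconds="${3:-2}"   # assumed default
  local attempt=1
  local status

  while [ "$attempt" -le "$max_attempts" ]; do
    # Ask only for the status field, as plain text.
    status="$(awslocal transcribe get-transcription-job \
      --transcription-job-name "$job_name" \
      --query 'TranscriptionJob.TranscriptionJobStatus' \
      --output text)" || return 2

    case "$status" in
      COMPLETED) echo "$status"; return 0 ;;
      FAILED)    echo "$status"; return 1 ;;
    esac

    sleep "$delay_seconds"
    attempt=$((attempt + 1))
  done

  echo "TIMEOUT"
  return 3
}
```

With the job from this guide, a call would look like `wait_for_transcription_job example`.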

Once job is complete, the transcript can be retrieved from the S3 bucket:
### View the transcript

After the job is complete, you can inspect it using the [`GetTranscriptionJob`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_GetTranscriptionJob.html) API and then download the transcript from the S3 bucket. Run the following commands to get the transcript:

{{< command >}}
$ awslocal transcribe get-transcription-job --transcription-job-name example
<disable-copy>
{
"TranscriptionJob": {
"TranscriptionJobName": "example",
@@ -80,18 +95,24 @@ $ awslocal transcribe get-transcription-job --transcription-job example
"CompletionTime": "2022-08-17T14:04:57.400000+05:30",
}
}

</disable-copy>
$ awslocal s3 cp s3://foo/7844aaa5.json .

$ jq .results.transcripts[0].transcript 7844aaa5.json
<disable-copy>
"it is just a question of getting rid of the illusion that we are separate from nature"
</disable-copy>
{{< / command >}}
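
The job output above is abbreviated, so the location of the transcript file is not visible in it. As a rough sketch, the location could be read straight from the job description. `transcript_uri` is a hypothetical helper, and it assumes the response exposes `Transcript.TranscriptFileUri` as the AWS `GetTranscriptionJob` response does:

```shell
# transcript_uri: hypothetical helper (not part of awslocal or the AWS CLI).
# Prints the transcript location recorded on the job, assuming the
# response contains TranscriptionJob.Transcript.TranscriptFileUri.
transcript_uri() {
  awslocal transcribe get-transcription-job \
    --transcription-job-name "$1" \
    --query 'TranscriptionJob.Transcript.TranscriptFileUri' \
    --output text
}
```

The exact form of the returned value may differ between AWS and LocalStack, so inspect it before passing it to `awslocal s3 cp`.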


## Examples
Serverless Transcription App using Transcribe, S3, Lambda, SQS, SES: [Link](https://github.com/localstack-samples/sample-serverless-transcribe).

The following code snippets and sample applications provide practical examples of how to use Transcribe in LocalStack for various use cases:

- [Serverless Transcription App using Transcribe, S3, Lambda, SQS, SES](https://github.com/localstack-samples/sample-serverless-transcribe)

## Limitations

Currently, our Transcribe emulation supports only the media formats and languages listed below.

### Supported Formats

The following input media formats are supported:
@@ -108,22 +129,21 @@ The following input media formats are supported:

The following languages and dialects are supported:

| Language | Language Code |
|----------|---------------|
| German | `de-DE` |
| English, British | `en-GB` |
| English, Indian | `en-IN` |
| English, US | `en-US` |
| Spanish | `es-ES` |
| Farsi | `fa-IR` |
| French | `fr-FR` |
| Hindi | `hi-IN` |
| Italian | `it-IT` |
| Japanese | `ja-JP` |
| Dutch | `nl-NL` |
| Portuguese | `pt-BR` |
| Russian | `ru-RU` |
| Turkish | `tr-TR` |
| Vietnamese | `vi-VN` |
| Chinese | `zh-CN` |

| Language | Language Code |
| ---------------- | ------------- |
| German | `de-DE` |
| English, British | `en-GB` |
| English, Indian | `en-IN` |
| English, US | `en-US` |
| Spanish | `es-ES` |
| Farsi | `fa-IR` |
| French | `fr-FR` |
| Hindi | `hi-IN` |
| Italian | `it-IT` |
| Japanese | `ja-JP` |
| Dutch | `nl-NL` |
| Portuguese | `pt-BR` |
| Russian | `ru-RU` |
| Turkish | `tr-TR` |
| Vietnamese | `vi-VN` |
| Chinese | `zh-CN` |
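
Since only these language codes are supported, a script might check the code before starting a job. The sketch below is illustrative only: `is_supported_language` is a hypothetical helper, and its list is copied from the table above, so it would need updating if the table changes.

```shell
# is_supported_language: hypothetical helper; the codes mirror the table above.
# Returns 0 if the given language code is listed as supported, 1 otherwise.
is_supported_language() {
  case "$1" in
    de-DE|en-GB|en-IN|en-US|es-ES|fa-IR|fr-FR|hi-IN|it-IT|ja-JP|nl-NL|pt-BR|ru-RU|tr-TR|vi-VN|zh-CN)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

For example, `is_supported_language en-IN && echo "ok"` prints `ok`, while an unlisted code such as `sv-SE` makes the function return a non-zero status.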
