diff --git a/content/en/user-guide/aws/transcribe/index.md b/content/en/user-guide/aws/transcribe/index.md
index aa04d2627e..fc36ec8cfe 100644
--- a/content/en/user-guide/aws/transcribe/index.md
+++ b/content/en/user-guide/aws/transcribe/index.md
@@ -1,39 +1,49 @@
---
title: "Transcribe"
linkTitle: "Transcribe"
-description: >
- Get started with Amazon Transcribe on LocalStack
+description: Get started with Amazon Transcribe on LocalStack
---
## Introduction
-LocalStack supports Transcribe via the Community offering, allowing you to use the Transcribe APIs in your local environment. The supported APIs are available on our [API Coverage Page](https://docs.localstack.cloud/references/coverage/coverage_transcribe/), which provides information on the extent of Transcribe integration with LocalStack.
+Transcribe is a service provided by Amazon Web Services (AWS) that offers automatic speech recognition (ASR) capabilities. It enables developers to convert spoken language into written text, making it valuable for a wide range of applications, from transcription services to voice analytics.
-LocalStack's Transcribe builds on the offline speech-to-text service [Vosk](https://alphacephei.com/vosk/). Therefore, LocalStack requires an internet connection the first time a transcription job is created for a given language to download and cache the model.
-Subsequent transcriptions for the same language can be done offline.
-Language models are around 50 MiB each and saved to the cache directory (see [Filesystem Layout]({{< ref "filesystem" >}})).
+LocalStack supports Transcribe via the Community offering, allowing you to use the Transcribe APIs for offline speech-to-text jobs in your local environment. The supported APIs are available on our [API Coverage Page](https://docs.localstack.cloud/references/coverage/coverage_transcribe/), which provides information on the extent of Transcribe integration with LocalStack.
+
+{{< alert title="Note">}}
+LocalStack's Transcribe relies on the offline speech-to-text toolkit [Vosk](https://alphacephei.com/vosk/). LocalStack therefore requires an internet connection the first time a transcription job is created for a given language, so that it can download and cache the language model.
+
+Once the language model is cached, subsequent transcriptions for the same language can be performed offline. Each language model is around 50 MiB and is saved to the cache directory (see [Filesystem Layout]({{< ref "filesystem" >}}) for details).
+{{< /alert >}}
+
+## Getting Started
+
+This guide is designed for users new to Transcribe and assumes basic knowledge of the AWS CLI and our [`awslocal`](https://github.com/localstack/awscli-local) wrapper script.
+
+Start your LocalStack container using your preferred method. We will demonstrate how to create a transcription job and view the transcript in an S3 bucket using the AWS CLI.
{{< alert title="Note" >}}
-This service has limited support for aarch64/Apple Silicon.
+This service offers limited support for aarch64/Apple Silicon platforms.
+
+If you encounter errors like `cannot load library *.so`, try the AMD64 build of LocalStack instead. Run the following command to pull it:
-If you encounter `cannot load library *.so` errors, please try the AMD64 build of LocalStack:
{{< command >}}
$ docker pull localstack/localstack:2.0.0 --platform amd64
{{< /command >}}
{{< /alert >}}
+### Create an S3 bucket
-## Getting Started
-
-Create an S3 bucket and upload the audio file:
+You can create an S3 bucket using the [`mb`](https://docs.aws.amazon.com/cli/latest/reference/s3/mb.html) command. Run the following commands to create a bucket named `foo` and upload a sample audio file named `example.wav` to it:
{{< command >}}
$ awslocal s3 mb s3://foo
-
$ awslocal s3 cp ~/example.wav s3://foo/example.wav
{{< / command >}}
-Create the transcription job:
+### Create a transcription job
+
+You can create a transcription job using the [`StartTranscriptionJob`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_StartTranscriptionJob.html) API. Run the following command to create a transcription job named `example` for the audio file `example.wav`:
{{< command >}}
$ awslocal transcribe start-transcription-job \
@@ -42,10 +52,11 @@ $ awslocal transcribe start-transcription-job \
--language-code en-IN
{{< / command >}}
-Jobs can be listed like so:
+You can list the transcription jobs using the [`ListTranscriptionJobs`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_ListTranscriptionJobs.html) API. Run the following command:
{{< command >}}
$ awslocal transcribe list-transcription-jobs
+
{
"TranscriptionJobSummaries": [
{
@@ -57,12 +68,16 @@ $ awslocal transcribe list-transcription-jobs
}
]
}
+
{{< / command >}}
-Once job is complete, the transcript can be retrieved from the S3 bucket:
+### View the transcript
+
+After the job is complete, use the [`GetTranscriptionJob`](https://docs.aws.amazon.com/transcribe/latest/APIReference/API_GetTranscriptionJob.html) API to look up the job details, then download the transcript from the S3 bucket. Run the following commands to get the transcript:
{{< command >}}
$ awslocal transcribe get-transcription-job --transcription-job example
+
{
"TranscriptionJob": {
"TranscriptionJobName": "example",
@@ -80,18 +95,24 @@ $ awslocal transcribe get-transcription-job --transcription-job example
"CompletionTime": "2022-08-17T14:04:57.400000+05:30",
}
}
-
+
$ awslocal s3 cp s3://foo/7844aaa5.json .
-
$ jq .results.transcripts[0].transcript 7844aaa5.json
+
"it is just a question of getting rid of the illusion that we are separate from nature"
+
{{< / command >}}
-
## Examples
-Serverless Transcription App using Transcribe, S3, Lambda, SQS, SES: [Link](https://github.com/localstack-samples/sample-serverless-transcribe).
+
+The following code snippets and sample applications provide practical examples of how to use Transcribe in LocalStack for various use cases:
+
+- [Serverless Transcription App using Transcribe, S3, Lambda, SQS, SES](https://github.com/localstack-samples/sample-serverless-transcribe)
## Limitations
+
+LocalStack's Transcribe emulation currently supports only the input media formats and languages listed below.
+
### Supported Formats
The following input media formats are supported:
@@ -108,22 +129,21 @@ The following input media formats are supported:
The following languages and dialects are supported:
-| Language | Language Code |
-|----------|---------------|
-| German | `de-DE` |
-| English, British | `en-GB` |
-| English, Indian | `en-IN` |
-| English, US | `en-US` |
-| Spanish | `es-ES` |
-| Farsi | `fa-IR` |
-| French | `fr-FR` |
-| Hindi | `hi-IN` |
-| Italian | `it-IT` |
-| Japan | `ja-JP` |
-| Dutch | `nl-NL` |
-| Portuguese | `pt-BR` |
-| Russian | `ru-RU` |
-| Turkish | `tr-TR` |
-| Vietnamese | `vi-VN` |
-| Chinese | `zh-CN` |
-
+| Language | Language Code |
+| ---------------- | ------------- |
+| German | `de-DE` |
+| English, British | `en-GB` |
+| English, Indian | `en-IN` |
+| English, US | `en-US` |
+| Spanish | `es-ES` |
+| Farsi | `fa-IR` |
+| French | `fr-FR` |
+| Hindi | `hi-IN` |
+| Italian | `it-IT` |
+| Japanese | `ja-JP` |
+| Dutch | `nl-NL` |
+| Portuguese | `pt-BR` |
+| Russian | `ru-RU` |
+| Turkish | `tr-TR` |
+| Vietnamese | `vi-VN` |
+| Chinese | `zh-CN` |