Introduction to Amazon Transcribe

Amazon Transcribe automatically converts speech to text. You can use it to generate meeting notes and subtitles, which makes workplace meetings more useful and more inclusive. It can also be used to monitor conversations for inappropriate content and to generate clinical documentation.


The Solution

In this tutorial, we will provide a simple example where we convert a short audio clip of speech to text using Amazon Transcribe with .NET.

Remember, for any example solution from AWS with .NET, we focus on the code that exemplifies the problem we are trying to solve. We don’t include logging, input validation, exception handling, etc., and we embed the configuration data within classes instead of using environment variables, configuration files, key/value stores and the like. These items should not be skipped for proper solutions.

Prerequisites

To complete this solution, you will need the .NET CLI, which is included in the .NET SDK. You will also need an AWS IAM user with programmatic access and permissions to interact with Amazon Transcribe and Amazon S3. Finally, you will need to install the AWS CLI and configure your environment.

Warning: some AWS services may have fees associated with them.

Our Dev Environment

This tutorial was developed/updated using Ubuntu 24.10, .NET 8 SDK and Visual Studio Code 1.95.3. Some commands/constructs may vary across systems.

Creating a .NET Amazon Transcribe App

The first thing we will do is create the .NET “Transcriber” App using the .NET CLI.

$ dotnet new console -n Transcriber --use-program-main

Add App Dependencies

Let’s now add the AWS .NET SDK dependencies.

$ dotnet add package AWSSDK.Core
$ dotnet add package AWSSDK.TranscribeService

Create an AudioTranscriptionService class

This class will hold the logic for the interactions with Amazon Transcribe.

using System.Text.Json;
using Amazon.TranscribeService;
using Amazon.TranscribeService.Model;

public class AudioTranscriptionService
{

}

First, let’s set up our fields. We’ll use these values throughout the service.

AmazonTranscribeServiceClient _amazonTranscribeServiceClient = new AmazonTranscribeServiceClient();  
MediaFormat _mediaFormat = MediaFormat.Mp3;
LanguageCode _languageCode = LanguageCode.EnUS;

For this service, we’ll create three methods that will do the heavy lifting when interacting with Amazon Transcribe: StartTranscriptionAsync, GetTranscriptionTextAsync and DeleteTranscriptionAsync.

StartTranscriptionAsync

First, let’s take a look at the StartTranscriptionAsync method. This method will accept string parameters named transcriptionName and audioFileUri and will return a string typed Task object.

public async Task<string> StartTranscriptionAsync(string transcriptionName, string audioFileUri)  
{  
      
}

The first thing we’ll do in this method is create a StartTranscriptionJobRequest object and set its TranscriptionJobName property to the value of the transcriptionName parameter that was passed into the method. We’ll also use the value of the audioFileUri parameter and the value of the _languageCode field to set up the StartTranscriptionJobRequest object.

var startTranscriptionJobRequest = new StartTranscriptionJobRequest()  
{  
    TranscriptionJobName = transcriptionName,  
    Media = new Media()  
    {  
        MediaFileUri = audioFileUri  
    },  
    LanguageCode = _languageCode,  
    Settings = null  
};
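The request above does not reference the media format field declared earlier. StartTranscriptionJobRequest also has a MediaFormat property, which is optional in the API. As a minimal sketch, assuming you want to state the format explicitly, the request could be built as shown here. The TranscriptionRequestFactory class and its Create method are hypothetical names used only for illustration; they are not part of the tutorial's service class.

```csharp
using Amazon.TranscribeService;
using Amazon.TranscribeService.Model;

// Hypothetical helper for illustration only; not part of the tutorial's service class.
public static class TranscriptionRequestFactory
{
    // Builds a request that states the media format explicitly.
    public static StartTranscriptionJobRequest Create(string transcriptionName, string audioFileUri)
    {
        return new StartTranscriptionJobRequest
        {
            TranscriptionJobName = transcriptionName,
            Media = new Media { MediaFileUri = audioFileUri },
            // Assumption: the input file is an MP3, as in this tutorial.
            MediaFormat = MediaFormat.Mp3,
            LanguageCode = LanguageCode.EnUS
        };
    }
}
```

If the audio is in another format, the MediaFormat value would need to change to match it.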

We’ll then send the request using the AmazonTranscribeServiceClient’s StartTranscriptionJobAsync method.

await _amazonTranscribeServiceClient.StartTranscriptionJobAsync(startTranscriptionJobRequest);

Here’s a look at the completed StartTranscriptionAsync method.

public async Task<string> StartTranscriptionAsync(string transcriptionName, string audioFileUri)  
{  
    var startTranscriptionJobRequest = new StartTranscriptionJobRequest()  
    {  
        TranscriptionJobName = transcriptionName,  
        Media = new Media()  
        {  
            MediaFileUri = audioFileUri  
        },  
        LanguageCode = _languageCode,  
        Settings = null  
    };  
  
    await _amazonTranscribeServiceClient.StartTranscriptionJobAsync(startTranscriptionJobRequest);  
  
    return "Transcription, " + transcriptionName + ", started.";  
  
}
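Transcription job names must be unique within an AWS account, so starting a second job with a name that already exists is rejected by the service. The following is a minimal sketch, assuming you want a fresh name on every run instead of a fixed one. The TranscriptionJobNames class and its CreateUnique method are hypothetical helpers, and the length and character checks reflect the documented job-name constraints (up to 200 characters; letters, digits, periods, underscores and hyphens).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the tutorial's code.
public static class TranscriptionJobNames
{
    // Characters accepted in a transcription job name.
    private static readonly Regex AllowedChars = new Regex(@"^[0-9a-zA-Z._-]+\z");

    // Returns "<prefix>-<UTC timestamp>-<guid>", which differs on every call.
    public static string CreateUnique(string prefix)
    {
        string name = $"{prefix}-{DateTime.UtcNow:yyyyMMddHHmmss}-{Guid.NewGuid():N}";

        if (name.Length > 200 || !AllowedChars.IsMatch(name))
        {
            throw new ArgumentException(
                "Prefix produces an invalid transcription job name.", nameof(prefix));
        }

        return name;
    }
}
```

A call such as TranscriptionJobNames.CreateUnique("testTranscription") could then supply the value passed as transcriptionName.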

GetTranscriptionTextAsync

Next, let’s take a look at GetTranscriptionTextAsync. This method will accept a string typed parameter named transcriptionName and will return a string typed Task object.

public async Task<string> GetTranscriptionTextAsync(string transcriptionName)  
{  
  
}

The first thing we’ll do in this method is create a GetTranscriptionJobRequest object and we’ll set the TranscriptionJobName to the value of the transcriptionName parameter that was passed into the method.

The next step is to create a GetTranscriptionJobResponse object.

var getTranscriptionJobRequest = new GetTranscriptionJobRequest()  
{  
   TranscriptionJobName = transcriptionName  
};  
  
GetTranscriptionJobResponse response;

Now, things will get a little interesting. We’ll use a while loop to poll for status changes using the GetTranscriptionJobAsync method of the AmazonTranscribeServiceClient. We’ll then inspect the status value returned by each GetTranscriptionJobAsync call using common if/else statements. Once the status is COMPLETED or FAILED, we will break out of the loop.

while (true)  
{  
    response = await     
        _amazonTranscribeServiceClient.GetTranscriptionJobAsync(getTranscriptionJobRequest);  
  
    if (response.TranscriptionJob.TranscriptionJobStatus == "COMPLETED")  
    {  
        Console.WriteLine("\nTranscription Complete");
        break;  
    }  
    else if (response.TranscriptionJob.TranscriptionJobStatus == "FAILED")  
    {  
        Console.WriteLine("Transcription Failed");  
        break;  
    }  
    else if (response.TranscriptionJob.TranscriptionJobStatus == "QUEUED")  
    {  
        Console.WriteLine("Transcription Queued");  
    }  
    else if (response.TranscriptionJob.TranscriptionJobStatus == "IN_PROGRESS")  
    {  
        Console.Write(".");  
    }  
    else  
    {  
        throw new Exception("Invalid Transcription Job Status");  
    }  
  
    await Task.Delay(500);
}
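The loop above polls at a fixed interval with no upper bound, so a job that never reaches a final status would keep it running forever. As a hedged sketch of one alternative, the generic helper below polls until a condition is met or a timeout elapses, and supports cancellation. The Polling class, the UntilAsync method and all of its parameters are hypothetical names introduced for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of the tutorial's code.
public static class Polling
{
    // Calls fetch repeatedly until isDone returns true, waiting interval between calls.
    // Throws TimeoutException once another wait would exceed the timeout.
    public static async Task<T> UntilAsync<T>(
        Func<Task<T>> fetch,
        Func<T, bool> isDone,
        TimeSpan interval,
        TimeSpan timeout,
        CancellationToken cancellationToken = default)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();

        while (true)
        {
            cancellationToken.ThrowIfCancellationRequested();

            T result = await fetch();
            if (isDone(result))
            {
                return result;
            }

            if (stopwatch.Elapsed + interval > timeout)
            {
                throw new TimeoutException("Polling did not finish within the timeout.");
            }

            // Non-blocking wait; the thread is released while waiting.
            await Task.Delay(interval, cancellationToken);
        }
    }
}
```

In this tutorial's context, fetch would wrap the GetTranscriptionJobAsync call and isDone would check for the COMPLETED or FAILED status.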

Once we exit the while loop, we can use an HTTP client and the URL of the transcript file to download the payload and parse the transcribed text from it. Once the transcribed text is parsed, we’ll return that value, ending the execution of the GetTranscriptionTextAsync method.

string? url = response.TranscriptionJob.Transcript.TranscriptFileUri;  
  
HttpClient httpClient = new HttpClient();  
HttpResponseMessage transcriptionResult = await httpClient.GetAsync(url);  
Stream? resultStream = await transcriptionResult.Content.ReadAsStreamAsync();  
  
JsonDocument? transcriptionObject = await JsonDocument.ParseAsync(resultStream);  
String transcriptionText = transcriptionObject.RootElement  
    .GetProperty("results").GetProperty("transcripts")[0]  
    .GetProperty("transcript").ToString();  
  
return transcriptionText;
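To make the property path used above easier to follow, here is a small sketch that parses a trimmed sample payload without calling AWS. The TranscriptParser class and the SamplePayload constant are hypothetical; the sample only mirrors the fields this tutorial reads (results, transcripts, transcript), and a real transcript file contains additional data.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper for illustration only; not part of the tutorial's code.
public static class TranscriptParser
{
    // Trimmed, made-up payload showing only the fields read below.
    public const string SamplePayload = """
        {
          "jobName": "testTranscription",
          "results": {
            "transcripts": [ { "transcript": "This is a test." } ],
            "items": []
          },
          "status": "COMPLETED"
        }
        """;

    // Reads results.transcripts[0].transcript from the JSON text.
    public static string ExtractTranscript(string json)
    {
        using JsonDocument document = JsonDocument.Parse(json);

        return document.RootElement
            .GetProperty("results")
            .GetProperty("transcripts")[0]
            .GetProperty("transcript")
            .GetString() ?? string.Empty;
    }
}
```

Note that GetProperty throws a KeyNotFoundException when a property is missing, so a payload with a different shape fails loudly instead of returning an empty result.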

Let’s take a look at the GetTranscriptionTextAsync method in its entirety.

public async Task<string> GetTranscriptionTextAsync(string transcriptionName)  
{  
  
    var getTranscriptionJobRequest = new GetTranscriptionJobRequest()  
    {  
        TranscriptionJobName = transcriptionName  
    };  
  
    GetTranscriptionJobResponse response;  
  
    while (true)  
    {  
  
        response = await       
             _amazonTranscribeServiceClient.GetTranscriptionJobAsync(getTranscriptionJobRequest);  
  
        if (response.TranscriptionJob.TranscriptionJobStatus == "COMPLETED")  
        {  
            Console.WriteLine("\nTranscription Complete");
            break;  
        }  
        else if (response.TranscriptionJob.TranscriptionJobStatus == "FAILED")  
        {  
            Console.WriteLine("Transcription Failed");  
            break;  
        }  
        else if (response.TranscriptionJob.TranscriptionJobStatus == "QUEUED")  
        {  
            Console.WriteLine("Transcription Queued");  
        }  
        else if (response.TranscriptionJob.TranscriptionJobStatus == "IN_PROGRESS")  
        {  
            Console.Write(".");  
        }  
        else  
        {  
            throw new Exception("Invalid Transcription Job Status");  
        }  
  
        await Task.Delay(500);
    }  
  
    string? url = response.TranscriptionJob.Transcript.TranscriptFileUri;  
  
    HttpClient httpClient = new HttpClient();  
    HttpResponseMessage transcriptionResult = await httpClient.GetAsync(url);  
    Stream? resultStream = await transcriptionResult.Content.ReadAsStreamAsync();  
  
    JsonDocument? transcriptionObject = await JsonDocument.ParseAsync(resultStream);  
    String transcriptionText = transcriptionObject.RootElement  
        .GetProperty("results").GetProperty("transcripts")[0]  
        .GetProperty("transcript").ToString();  
  
    return transcriptionText;  
  
}
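In the method above, a FAILED status also breaks out of the loop, and execution then continues to the transcript download even though a failed job is unlikely to have a transcript to fetch. The sketch below shows one way this case might be guarded, using the FailureReason property that the SDK exposes on TranscriptionJob. The TranscriptionJobGuards class and the GetTranscriptUriOrThrow method are hypothetical names introduced for illustration.

```csharp
using System;
using Amazon.TranscribeService;
using Amazon.TranscribeService.Model;

// Hypothetical helper for illustration only; not part of the tutorial's service class.
public static class TranscriptionJobGuards
{
    // Returns the transcript file URI, or throws with the failure reason
    // when the job failed or no transcript is available.
    public static string GetTranscriptUriOrThrow(TranscriptionJob job)
    {
        if (job.TranscriptionJobStatus == TranscriptionJobStatus.FAILED)
        {
            throw new InvalidOperationException(
                $"Transcription job '{job.TranscriptionJobName}' failed: {job.FailureReason}");
        }

        var uri = job.Transcript?.TranscriptFileUri;
        if (string.IsNullOrEmpty(uri))
        {
            throw new InvalidOperationException(
                $"Transcription job '{job.TranscriptionJobName}' has no transcript file URI.");
        }

        return uri;
    }
}
```

Calling this with response.TranscriptionJob right after the loop would surface the failure reason instead of a null reference error.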

DeleteTranscriptionAsync

The last method to create is the DeleteTranscriptionAsync method that takes a string typed parameter named transcriptionName and will return a string typed Task object.

public async Task<string> DeleteTranscriptionAsync(string transcriptionName)  
{  
  
}

This method will simply call DeleteTranscriptionJobAsync on the client and will use the transcriptionName parameter’s value to specify which transcription job to delete.

public async Task<string> DeleteTranscriptionAsync(string transcriptionName)  
{  
    var deleteTranscriptionJobRequest = new DeleteTranscriptionJobRequest()  
    {  
        TranscriptionJobName = transcriptionName  
    };  
  
    await _amazonTranscribeServiceClient.DeleteTranscriptionJobAsync(deleteTranscriptionJobRequest);  
  
    return "Transcription, " + transcriptionName + ", deleted.";  
  
}

Finishing the .NET Amazon Transcribe App

With the service complete, let’s put the pieces together by running a simple flow in the Main method of the Program.cs file.

Developing the Program.cs File

The next step in developing the .NET Amazon Transcribe application is opening the Program.cs file and changing the “Main” function definition to the following:

static async Task Main(string[] args)

This change allows the application to run asynchronously.

Next, we will set a couple of variables in the Main method of the Program.cs file. The first is the transcription name (transcriptionName), which will be used in all method calls. The second is an S3 URI that points to the MP3 file we will transcribe (s3StringURI).

string transcriptionName = "testTranscription";  
string s3StringURI = "s3://<your-s3-mp3-url>";

Next, we will create an instance of our AudioTranscriptionService class named audioTranscriptionService.

AudioTranscriptionService audioTranscriptionService = new AudioTranscriptionService();

With the audioTranscriptionService object created, we’ll use it to start a transcription, get the transcription text and then finally delete the transcription.

Let’s take a look at the completed Program.cs file.

using System.IO;  
using Amazon.TranscribeService;  
using Amazon.TranscribeService.Model;  
  
namespace Transcriber;  
  
class Program  
{  
    static async Task Main(string[] args)  
    {  
        string transcriptionName = "testTranscription";  
        string s3StringURI = "s3://<your-s3-mp3-url>";  
          
        AudioTranscriptionService audioTranscriptionService = new AudioTranscriptionService();  
  
        string startMessage = await   
           audioTranscriptionService.StartTranscriptionAsync(transcriptionName, s3StringURI);  
  
        Console.WriteLine(startMessage);  
  
        String transcriptionText = await   
           audioTranscriptionService.GetTranscriptionTextAsync(transcriptionName);  
  
        Console.WriteLine("Transcribed text: " + transcriptionText);  
  
        string deleteMessage = await   
           audioTranscriptionService.DeleteTranscriptionAsync(transcriptionName);  
  
        Console.WriteLine(deleteMessage);  
  
    }  
}  

Let’s run the app with the following command:

$ dotnet run

You should then see something like the following in the console.

Transcription, testTranscription, started.  
..........  
Transcription Complete  
Transcribed text: This is a test.  
Transcription, testTranscription, deleted.

Summary

In this tutorial, you learned how to:

  • Start an Amazon Transcribe Job.
  • Check the status of an Amazon Transcribe Job.
  • Retrieve text from a completed Amazon Transcribe Job.
  • Delete an Amazon Transcribe Job.

Want to know more about the tech in this article? Check out these resources:

  • .NET CLI
  • .NET SDK
  • AWS .NET SDK
  • Amazon Transcribe