Transcribing live-streamed audio to text has become increasingly popular. It’s useful for preparing subtitles or archiving conversations as text. ASR (automatic speech recognition) uses advanced machine learning to analyze the context of speech and return text data.

In this example, we’re going to create a React component that can be reused in your application. It uses the AWS SDK Client Transcribe Streaming package to connect to the Amazon Transcribe service over a WebSocket. The animated GIF ASR-streaming-demo.gif shows what we are going to build.

There are two modes we can use: uploading an audio file, which is queued as a transcription job whose results we then wait for, or live streaming over a WebSocket, where the response is instant. This demo focuses on streaming audio, where we can see the recognized text returned live from the API.

In the config file we can specify the language code for our audio conversation. The most popular language, English, uses the language code ‘en-US’. AWS Transcribe currently supports over 30 languages.

Audio data format

To achieve good speech-to-text recognition results, we need to send audio to the AWS Transcribe API in the proper format. It expects audio to be encoded as PCM data. The sample rate is also important: better voice quality means better recognition results. Currently, ‘en-US’ supports sample rates up to 48,000 Hz, and this value was optimal during our tests.

Recording audio as base64

As an additional feature, we’ve implemented saving audio as a base64 audio file. The RecordRTC library uses the browser’s MediaRecorder API to record voice from the microphone. The received Blob is converted to base64, which can easily be saved as an archived conversation or optionally sent to S3 storage.

Note that audio saved in Chrome is missing its duration.
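The PCM requirement from the audio data format section can be illustrated with a small helper. This is a minimal sketch, not the code used in the demo: it assumes the microphone samples arrive as a `Float32Array` of values between -1 and 1 (the form the Web Audio API typically provides), and the function name `pcmEncode` and the `SAMPLE_RATE_HZ` constant are hypothetical names chosen for illustration.

```typescript
// Hypothetical constant: the sample rate that, per the article, worked best for 'en-US'.
const SAMPLE_RATE_HZ = 48000;

// Convert floating-point samples in [-1, 1] into 16-bit signed,
// little-endian PCM. (Assumed helper, not part of any AWS package.)
function pcmEncode(samples: Float32Array): ArrayBuffer {
  const bytesPerSample = 2; // 16 bits per sample
  const buffer = new ArrayBuffer(samples.length * bytesPerSample);
  const view = new DataView(buffer);

  samples.forEach((sample, index) => {
    // Clamp first so out-of-range input cannot wrap around.
    const clamped = Math.max(-1, Math.min(1, sample));
    // Negative values scale to -32768, positive values to 32767.
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(index * bytesPerSample, scaled, true); // true = little-endian
  });

  return buffer;
}

// Example: one second of silence at the assumed sample rate.
const silence = pcmEncode(new Float32Array(SAMPLE_RATE_HZ));
console.log(silence.byteLength); // two bytes per sample
```

With the Client Transcribe Streaming package, each encoded chunk would typically be wrapped in a `Uint8Array` and sent as the `AudioChunk` of an `AudioEvent`, with `MediaEncoding` set to `'pcm'` and `MediaSampleRateHertz` matching the recording sample rate.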