Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Google Speech Recognition API Result is Empty

I'm performing an asynchronous request to Google Cloud Speech API, and I do not know how to get the result of operation:

Request POST: https://speech.googleapis.com/v1beta1/speech:asyncrecognize

Body:

{
    "config":{
                 "languageCode" : "pt-BR",
                 "encoding" : "LINEAR16",
                 "sampleRate" : 16000
             },
     "audio":{
                 "uri":"gs://bucket/audio.flac"
             }
}

Which returns:

{ "name": "469432517" }

So, I do a POST: https://speech.googleapis.com/v1beta1/operations/469432517

Which returns:

{
    "name": "469432517",
    "metadata": {
        "@type": "type.googleapis.com/google.cloud.speech.v1beta1.AsyncRecognizeMetadata",
        "progressPercent": 100,
        "startTime": "2016-08-11T21:18:29.985053Z",
        "lastUpdateTime": "2016-08-11T21:18:31.888412Z"
    },
    "done": true,
    "response": {
                    "@type": "type.googleapis.com/google.cloud.speech.v1beta1.AsyncRecognizeResponse"
                }
}

I need to get the result of the operation: the transcribed text.

How can I do that?

like image 887
Bruno Avatar asked Aug 11 '16 21:08

Bruno


People also ask

Is Google TTS offline?

The Speech-to-Text On-Prem allows you to deploy the Speech to Text API through a container or any GKE cluster, but that doesn't mean you can do it in your local desktop. On Android the Google's Speech-To-Text is working offline...

Is Google speech recognition accurate?

Google Speech-to-Text API High accuracy: It has an accuracy rate of 80-85%. Transcription capabilities: It can transcribe audio in 125+ languages and variants, including pre-recorded and real-time audio.

Is Google Text-to-Speech API free?

Text-to-Speech is priced based on the number of characters sent to the service to be synthesized into audio each month. You must enable billing to use Text-to-Speech, and will be automatically charged if your usage exceeds the number of free characters allowed per month.


1 Answers

You've got the result of the operation and it is empty. The reason of the empty result is format mismatch. You should have submitted "LINEAR16" file (PCM uncompressed data, basically WAV file) and you try to submit FLAC (compressed format).

Other reason of the empty result might be incorrect sample rate, incorrect number of channels and so on.

Last, the file with pure silence will result in empty response.

like image 184
Nikolay Shmyrev Avatar answered Oct 04 '22 13:10

Nikolay Shmyrev