Why does my python script not recognize speech from audio file?

The following code successfully recognizes a short test audio file (less than 1 min), but fails to recognize a longer audio file (1.5 h).

from google.cloud import speech


def run_quickstart():
    speech_client = speech.Client()
    sample = speech_client.sample(source_uri="gs://linear-arena-2109/zoom0070.flac", encoding=speech.Encoding.FLAC)
    alternatives = sample.recognize('uk-UA')
    for alternative in alternatives:
        print(u'Transcript: {}'.format(alternative.transcript))

    with open("Output.txt", "w") as text_file:
        for alternative in alternatives:
            text_file.write(alternative.transcript.encode('utf8'))

if __name__ == '__main__':
    run_quickstart()

Both files are uploaded to Google Cloud.

The first one: https://storage.googleapis.com/linear-arena-2109/sample.flac

The second one: https://storage.googleapis.com/linear-arena-2109/zoom0070.flac

Both were converted from MP3 with the ffmpeg utility:

ffmpeg -i sample.mp3 -ac 1 sample.flac
ffmpeg -i zoom0070.mp3 -ac 1 zoom0070.flac

The first file was recognized successfully, but the second file produces the following error:

google.gax.errors.RetryError: GaxError(Exception occurred in retry method that was not classified as transient, caused by <_Rendezvous of RPC that terminated with (StatusCode.INVALID_ARGUMENT, Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter.)>)

But I have already used the uri parameter in my Python script. What is wrong?

Update

@NieDzejkob helped me understand the error: the long_running_recognize method should be used instead of recognize. A comprehensive long_running_recognize usage example can be found on the corresponding documentation page.

Andriy asked Jun 29 '17 21:06


1 Answer

For any audio file longer than 1 minute, you need to use asynchronous speech recognition, and the file has to be uploaded to Google Cloud Storage so that you can pass in a gcs_uri (a gs://bucket/object path).

In addition, you will need to call the .long_running_recognize method in your script instead of the synchronous recognize. An example from the GCP documentation can be found here.
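As a rough illustration of the above, here is a minimal sketch, assuming a 2.x release of the google-cloud-speech package (its surface differs from the older speech.Client used in the question). The helper names to_gs_uri, join_transcripts and transcribe_long_audio, as well as the three-hour timeout, are my own assumptions and not part of the library.

```python
from urllib.parse import urlparse


def to_gs_uri(url):
    """Hypothetical helper: map a storage.googleapis.com URL to a gs:// URI."""
    if url.startswith("gs://"):
        return url
    parsed = urlparse(url)
    if parsed.netloc != "storage.googleapis.com":
        raise ValueError("not a Cloud Storage URL: {}".format(url))
    # The URL path is /<bucket>/<object>, which is what gs:// expects.
    return "gs://" + parsed.path.lstrip("/")


def join_transcripts(results):
    """Join the top alternative of every result chunk into one string."""
    return " ".join(
        r.alternatives[0].transcript.strip() for r in results if r.alternatives
    )


def transcribe_long_audio(gcs_uri, language_code="uk-UA", timeout=3 * 60 * 60):
    """Run asynchronous recognition on a FLAC file stored in Cloud Storage."""
    # Imported lazily so the helpers above work without the client installed.
    from google.cloud import speech

    client = speech.SpeechClient()
    audio = speech.RecognitionAudio(uri=gcs_uri)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
        language_code=language_code,
    )
    # Starts a long-running operation instead of a synchronous request.
    operation = client.long_running_recognize(config=config, audio=audio)
    # Blocks until the operation finishes; the timeout is an assumed value.
    response = operation.result(timeout=timeout)
    return join_transcripts(response.results)
```

With credentials configured, calling transcribe_long_audio("gs://linear-arena-2109/zoom0070.flac") should return the transcript as a single string, which can then be written to Output.txt. A long file comes back as many result chunks, which is why the transcripts are joined rather than printed one alternative at a time.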

I realize that the OP figured it out, but I thought it would be useful to provide an answer and generalize it a bit.

0xPeter.eth answered Sep 29 '22 17:09