Why does my python script not recognize speech from audio file?

Tags:

google-cloud-speech

I have the following piece of code, which successfully recognizes a short (less than 1 min) test audio file but fails to recognize another, long audio file (1.5 h).

from google.cloud import speech


def run_quickstart():
    speech_client = speech.Client()
    sample = speech_client.sample(source_uri="gs://linear-arena-2109/zoom0070.flac", encoding=speech.Encoding.FLAC)
    alternatives = sample.recognize('uk-UA')
    for alternative in alternatives:
        print(u'Transcript: {}'.format(alternative.transcript))

    with open("Output.txt", "w") as text_file:
        for alternative in alternatives:
            text_file.write(alternative.transcript.encode('utf8'))

if __name__ == '__main__':
    run_quickstart()

Both files are uploaded to Google Cloud.

The first one: https://storage.googleapis.com/linear-arena-2109/sample.flac

The second one: https://storage.googleapis.com/linear-arena-2109/zoom0070.flac

Both were converted from MP3 with the ffmpeg utility:

ffmpeg -i sample.mp3 -ac 1 sample.flac
ffmpeg -i zoom0070.mp3 -ac 1 zoom0070.flac

The first file was recognized successfully, but the second file produces the following error:

google.gax.errors.RetryError: GaxError(Exception occurred in retry method that was not classified as transient, caused by <_Rendezvous of RPC that terminated with (StatusCode.INVALID_ARGUMENT, Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter.)>)

But I have already used the uri parameter in my Python script. What is wrong?

Update

@NieDzejkob helped me understand the error: the long_running_recognize method should be used instead of recognize. A comprehensive long_running_recognize usage example can be found on the corresponding documentation page.


asked Jun 29 '17 21:06

Andriy

1 Answer

For any audio file longer than 1 minute, you need to use asynchronous speech recognition, and the file has to be uploaded to Google Cloud Storage so that you can pass in a gcs_uri.

In your script, this means calling the .long_running_recognize method instead of .recognize. An example can be found in the GCP documentation.

I realize that the OP figured it out, but I thought it would be useful to provide an answer and generalize it a bit.
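For illustration, here is a minimal sketch of that approach. It assumes a recent google-cloud-speech release (2.x), where the client class is speech.SpeechClient rather than the speech.Client used in the question, and it assumes credentials are already configured. The names transcribe_gcs, collect_transcripts and timeout_seconds are hypothetical and only serve the example; the three-hour timeout is an arbitrary guess for a 1.5 h file, not a documented value.

```python
def collect_transcripts(response):
    """Return the top transcript of every result that has alternatives."""
    return [
        result.alternatives[0].transcript
        for result in response.results
        if result.alternatives
    ]


def transcribe_gcs(gcs_uri, language_code="uk-UA", timeout_seconds=3 * 60 * 60):
    """Transcribe a long FLAC file stored in Cloud Storage (asynchronous API).

    Assumes google-cloud-speech 2.x and working application credentials.
    """
    # Imported lazily so collect_transcripts stays usable without the library.
    from google.cloud import speech

    client = speech.SpeechClient()
    audio = speech.RecognitionAudio(uri=gcs_uri)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
        language_code=language_code,
    )

    # Start a long-running operation instead of a synchronous recognize call.
    operation = client.long_running_recognize(config=config, audio=audio)

    # Block until the operation finishes or the (assumed) timeout expires.
    response = operation.result(timeout=timeout_seconds)
    return collect_transcripts(response)
```

With this in place, calling transcribe_gcs("gs://linear-arena-2109/zoom0070.flac") should return a list of transcript strings, one per recognized segment, which can then be printed or written to a file.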

answered Sep 29 '22 17:09

0xPeter.eth
