
Google Speech Recognition timeout

I am developing an Android Application that is based around Speech Recognition.

Until today everything has been working fine and in a timely manner, e.g. I would start my speech recogniser, speak, and within 1 or 2 seconds max the application received the results.

It was a VERY acceptable user experience.

As of today, I have to wait ten or more seconds before the recognition results are available.

I have tried setting the following extras, none of which makes any discernible difference:

RecognizerIntent.EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS
RecognizerIntent.EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS
RecognizerIntent.EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS

I have been continually changing my application, but none of these changes were related to the speech recogniser.

Is there any method I can employ to reduce the time between the speech recogniser switching from onBeginningOfSpeech() to onResults()?

Here's an example of how long it takes:

07-01 17:50:20.839 24877-24877/com.voice I/Voice: onReadyForSpeech()
07-01 17:50:21.614 24877-24877/com.voice I/Voice: onBeginningOfSpeech()
07-01 17:50:38.163 24877-24877/com.voice I/Voice: onEndOfSpeech()
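To put a number on the delay in the log above, here is a minimal sketch that subtracts the timestamps of two log lines. `LogcatGap` and its methods are hypothetical helpers, not part of the application or of Android, and the sketch assumes both lines were written on the same day in the format shown.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper (not from the original app): measures the gap between
// two log lines by comparing their time-of-day fields.
final class LogcatGap {
    private static final DateTimeFormatter TIME =
            DateTimeFormatter.ofPattern("HH:mm:ss.SSS");

    // Lines start with "MM-dd HH:mm:ss.SSS ..."; the time is the second token.
    static LocalTime timeOf(String logLine) {
        String[] tokens = logLine.trim().split("\\s+");
        return LocalTime.parse(tokens[1], TIME);
    }

    // Assumes both lines fall on the same day (no midnight rollover handling).
    static long millisBetween(String earlier, String later) {
        return Duration.between(timeOf(earlier), timeOf(later)).toMillis();
    }

    public static void main(String[] args) {
        String begin = "07-01 17:50:21.614 24877-24877/com.voice I/Voice: onBeginningOfSpeech()";
        String end = "07-01 17:50:38.163 24877-24877/com.voice I/Voice: onEndOfSpeech()";
        System.out.println(millisBetween(begin, end) + " ms"); // prints "16549 ms"
    }
}
```

For the log above, that is roughly 16.5 seconds between onBeginningOfSpeech() and onEndOfSpeech(), compared with about 0.8 seconds between onReadyForSpeech() and onBeginningOfSpeech().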
Hector asked Jul 01 '16 16:07


2 Answers

EDIT - This has apparently been fixed in the upcoming August 2016 release. You can test the beta to confirm.

This is a bug with the release of Google 'Now' V6.0.23.* and persists in the latest V6.1.28.*

Since the release of V5.11.34.* Google's implementation of the SpeechRecognizer has been plagued with bugs.

You can use this gist to replicate many of them.

You can use this BugRecognitionListener to work around some of them.

I have reported these directly to the Now team, so they are aware, but as yet, nothing has been fixed. There is no external bug tracker for Google Now, as it's not part of AOSP, so nothing you can star I'm afraid.

The most recent bug you describe pretty much makes their implementation unusable. As you correctly point out, the parameters that control the speech input timings are ignored, which according to the documentation:

Additionally, depending on the recognizer implementation, these values may have no effect.

is something we should expect...

The recognition will continue indefinitely if you don't speak or make any detectable sound.

I'm currently creating a project to replicate this new bug and all of the others, which I'll forward on and link here shortly.

EDIT - I was hoping I could create a workaround that used the detection of partial or unstable results as the trigger to know that the user was still speaking. Once they stopped, I could manually call recognizer.stopListening() after a set period of time.
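The timing logic behind that idea can be sketched in plain Java. This is only an illustration under assumptions: `SilenceWatchdog` is a hypothetical class, not an Android API, and the caller is expected to feed it timestamps from onPartialResults() and to poll it periodically.

```java
// Hypothetical sketch: tracks when the last partial result arrived and reports
// when the user has been silent for longer than a chosen timeout.
final class SilenceWatchdog {
    private final long silenceTimeoutMillis;
    private long lastPartialAtMillis = -1; // -1 means no partial result seen yet

    SilenceWatchdog(long silenceTimeoutMillis) {
        this.silenceTimeoutMillis = silenceTimeoutMillis;
    }

    // Call from onPartialResults() with the current time in milliseconds.
    void onPartial(long nowMillis) {
        lastPartialAtMillis = nowMillis;
    }

    // True once at least one partial has arrived and none has followed within
    // the timeout; the caller would then call recognizer.stopListening().
    boolean shouldStop(long nowMillis) {
        return lastPartialAtMillis >= 0
                && nowMillis - lastPartialAtMillis >= silenceTimeoutMillis;
    }

    public static void main(String[] args) {
        SilenceWatchdog watchdog = new SilenceWatchdog(1500);
        watchdog.onPartial(1000);
        System.out.println(watchdog.shouldStop(2000)); // prints "false"
        System.out.println(watchdog.shouldStop(2500)); // prints "true"
    }
}
```

On a device, the timestamps could come from SystemClock.elapsedRealtime() and the polling from Handler.postDelayed(); the 1500 ms timeout is an arbitrary value chosen for the example.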

Unfortunately, stopListening() is broken too and doesn't actually stop the recognition, so there is no workaround for this.

Attempts to get around this by destroying the recognizer and relying only on the partial results received up to that point (onResults() is not called when the recognizer is destroyed) failed to produce a reliable implementation, unless you're simply keyword spotting.

There is nothing we can do until Google fix this. Your only outlet is to email [email protected] reporting the problem and hope that the volume they receive gives them a nudge.

brandall answered Sep 20 '22 23:09


NOTE! This works only in online mode. Enable dictation mode and disable partial results:

intent.putExtra("android.speech.extra.DICTATION_MODE", true);
intent.putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, false);

In dictation mode, the SpeechRecognizer will still call onPartialResults(); however, you should treat the partials as final results.
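One way to treat partials as final results is to keep the most recent non-empty hypothesis and use it as the answer. The sketch below is plain Java with a hypothetical `PartialAsFinal` class; it assumes each partial carries the recognizer's best full hypothesis so far as its first entry, which may not hold for every recognizer version.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: remembers the latest non-empty partial hypothesis so it
// can be used in place of the final result.
final class PartialAsFinal {
    private String latest = "";

    // Call from onPartialResults() with the list read from the bundle, e.g.
    // bundle.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION).
    void onPartial(List<String> hypotheses) {
        if (hypotheses == null || hypotheses.isEmpty()) {
            return; // nothing recognised in this callback
        }
        String best = hypotheses.get(0);
        if (best != null && !best.isEmpty()) {
            latest = best; // assumption: first entry is the best hypothesis so far
        }
    }

    // The text to treat as the final result at this point in time.
    String currentResult() {
        return latest;
    }

    public static void main(String[] args) {
        PartialAsFinal results = new PartialAsFinal();
        results.onPartial(Arrays.asList("turn on"));
        results.onPartial(Collections.<String>emptyList());
        results.onPartial(Arrays.asList("turn on the lights"));
        System.out.println(results.currentResult()); // prints "turn on the lights"
    }
}
```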

vladbph answered Sep 21 '22 23:09