 

Comparison of Speech Recognition use in Android: by Intent or on-thread?

Introduction

Android provides two ways for me to use speech recognition.

The first way is by an Intent, as in this question: Intent example. A new Activity is pushed onto the top of the stack; it listens to the user, hears some speech, attempts to transcribe it (normally via the cloud), then returns the result to my app via an onActivityResult call.

The second is by getting a SpeechRecognizer, like the code here: SpeechRecognizer example. Here, it looks like the speech is recorded and transcribed on some other thread, then callbacks bring me the results. And this is done without leaving my Activity.

I would like to understand the pros and cons of these two ways of doing speech recognition.

What I've got so far

Using the Intent:

  • is simple to code
  • avoids reinventing the wheel
  • gives a consistent user experience of speech recognition across the device

but

  • might be slow, since it creates a new activity with its own window

Using the SpeechRecognizer:

  • lets me retain control of UI in my app
  • gives me extra possibilities of things to respond to (documentation)

but

  • can only be called from the main thread
  • requires more error-checking in exchange for the extra control.
hcarver asked Aug 11 '12 10:08


2 Answers

In addition to all this, I'd add at least this point:

SpeechRecognizer is better for hands-free user interfaces, since your app actually gets to respond to error conditions like "No matches" and perhaps restart itself. When you use the Intent, the app beeps and shows a dialog that the user must press to continue.
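The "respond to the error and perhaps restart" idea could look roughly like the helper below. This is only a sketch under stated assumptions: `RetryPolicy` and its methods are hypothetical names, and the integer constants are copies of the documented `SpeechRecognizer.ERROR_SPEECH_TIMEOUT`, `ERROR_NO_MATCH` and `ERROR_RECOGNIZER_BUSY` values so that the block compiles without the Android SDK. In a real app you would call it from `RecognitionListener.onError(int)` and use the SDK constants directly. Which errors are worth retrying, and how often, is a judgment call rather than anything the platform prescribes.

```java
/**
 * Hypothetical helper: decides whether a hands-free app should start
 * listening again after the recognizer reports an error.
 */
final class RetryPolicy {
    // Copies of android.speech.SpeechRecognizer constants, repeated here
    // only so the sketch compiles without the Android SDK.
    static final int ERROR_SPEECH_TIMEOUT = 6;
    static final int ERROR_NO_MATCH = 7;
    static final int ERROR_RECOGNIZER_BUSY = 8;

    private final int maxRetries;
    private int retries = 0;

    RetryPolicy(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /** Returns true if the caller should restart listening for this error. */
    boolean shouldRestart(int errorCode) {
        // Assumption: only these three errors are treated as recoverable.
        boolean recoverable = errorCode == ERROR_NO_MATCH
                || errorCode == ERROR_SPEECH_TIMEOUT
                || errorCode == ERROR_RECOGNIZER_BUSY;
        if (!recoverable || retries >= maxRetries) {
            return false;
        }
        retries++;
        return true;
    }

    /** Call after a successful recognition so the retry budget starts over. */
    void reset() {
        retries = 0;
    }
}
```

With the Intent approach there is no equivalent hook: the system dialog handles the error itself and waits for the user.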

My summary is as follows:

SpeechRecognizer

  • Show a different UI or no UI at all. Do you really want your app's UI to beep? Do you really want your UI to show a dialog when there is an error and wait for the user to click?

  • App can do something else while speech recognition is happening

  • Can recognize speech while running in the background or from a service

  • Can handle errors better

  • Can access low-level speech data like the raw audio or the RMS. Analyze that audio or use the loudness to make some kind of flashing light to indicate the app is listening
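As an illustration of the last point, the loudness value delivered to `RecognitionListener.onRmsChanged(float rmsdB)` could be mapped to a 0..1 level for an indicator. This is a minimal sketch: `RmsMeter` is a hypothetical class, and the dB range passed to its constructor is an assumption, since the range actually reported varies between recognizer implementations and should be measured on the target device.

```java
/**
 * Hypothetical helper: turns the value passed to
 * RecognitionListener.onRmsChanged(float rmsdB) into a 0..1 level
 * that could drive a "listening" indicator.
 */
final class RmsMeter {
    private final float minDb;
    private final float maxDb;

    /** The range is an assumption; tune it for the recognizer in use. */
    RmsMeter(float minDb, float maxDb) {
        if (maxDb <= minDb) {
            throw new IllegalArgumentException("maxDb must be greater than minDb");
        }
        this.minDb = minDb;
        this.maxDb = maxDb;
    }

    /** Linearly maps rmsdB from [minDb, maxDb] to [0, 1], clamping outside. */
    float level(float rmsdB) {
        float t = (rmsdB - minDb) / (maxDb - minDb);
        return Math.max(0f, Math.min(1f, t));
    }

    /** True when the level is high enough to light the indicator. */
    boolean isLit(float rmsdB, float threshold) {
        return level(rmsdB) >= threshold;
    }
}
```

For example, `new RmsMeter(-2f, 10f).level(4f)` sits halfway through the assumed range, so the indicator would be at half brightness.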

Intent

  • Consistent, easy-to-use UI for users
  • Easy to program
gregm answered Oct 21 '22 12:10


The main difference is the UI. SpeechRecognizer doesn't have one, so you are responsible for creating it.
I once wrote a prototype with a receiver listening for the headset button, which then activated speech recognition to listen for some commands. The screen was not turned on, so I had to use SpeechRecognizer (my UI was some prerecorded sounds and Text To Speech).

The second difference is that SpeechRecognizer can listen continuously. The Intent version will always end execution after some period. For example, SpeechRecognizer is used by the speech recognition "keyboard", so you can dictate an SMS.
In that case you will receive partial results only (in normal mode SpeechRecognizer gives only final results).
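A dictation-style UI has to combine those partial results with text that is already final. The sketch below shows one way to do that bookkeeping; `DictationBuffer` and its method names are hypothetical, and the block deliberately avoids Android types so it compiles on its own. On Android, partial results are requested with `RecognizerIntent.EXTRA_PARTIAL_RESULTS` and arrive in `RecognitionListener.onPartialResults(Bundle)`; pulling the strings out of the `Bundle` is left out here.

```java
/**
 * Hypothetical helper for a dictation UI: keeps committed text plus the
 * latest partial hypothesis, which is replaced on every update.
 */
final class DictationBuffer {
    private final StringBuilder committed = new StringBuilder();
    private String partial = "";

    /** Replaces the current hypothesis; earlier partials are discarded. */
    void onPartial(String hypothesis) {
        partial = (hypothesis == null) ? "" : hypothesis;
    }

    /** Commits the final text for this utterance and clears the hypothesis. */
    void onFinal(String text) {
        if (text != null && !text.isEmpty()) {
            if (committed.length() > 0) {
                committed.append(' ');
            }
            committed.append(text);
        }
        partial = "";
    }

    /** Text to show right now: committed text followed by the hypothesis. */
    String display() {
        if (partial.isEmpty()) {
            return committed.toString();
        }
        if (committed.length() == 0) {
            return partial;
        }
        return committed.toString() + " " + partial;
    }
}
```

Each partial update overwrites the previous one rather than being appended, because a recognizer may revise earlier words as more audio arrives.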

Marek R answered Oct 21 '22 12:10