Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How Can I Implement Google Voice Typing In My Application?

Tags:

java

android

I am trying to add a button in my application that starts Google Voice Typing (or the default speech recognition). I have tried following this tutorial. This tutorial is incredibly confusing to me. I imported the .jar, and added the necessary permissions, services, and activities to my Manifest. But I can't seem to figure out how to "put it all together". I'm wondering:

  1. Am I supposed to call the inputMethodService from my button click in my Main Activity? Or does my inputMethodService essentially become my Main Activity?
  2. What does IME mean? I tried to Google it, but the definitions it gave me didn't help my understanding.
  3. When I try to copy and paste the whole DemoInputMethodService code into my current activity, I get an error saying I cannot extend InputMethodService inside of this activity. (Which leads back to to ask question one.)

How can I get this to work?

like image 774
Ethan Avatar asked Feb 09 '15 15:02

Ethan


1 Answers

If you want to follow the tutorial that you mention then you need to implement an IME (input method editor) first, see http://developer.android.com/guide/topics/text/creating-input-method.html

This IME can have a regular keyboard look-and-feel or contain just a microphone button.

The user of your app will first have to click on a text field to launch the IME. (Note that there can be several IMEs installed on the device and they have to be explicitly enabled in the Settings.) Then the user will have to click on the microphone button to trigger the speech recognition.

The tutorial provides a jar that lets you directly call Google's recognizer. It would be nicer if instead you called the recognizer via the SpeechRecognizer-interface (http://developer.android.com/reference/android/speech/SpeechRecognizer.html), this way the user can decide whether to use Google's or something else.

The SpeechRecognizer is given a listener which supports the method onPartialResults, which allows you to monitor the recognition hypotheses while the user is speaking. It's up to you how you display them. Note however that the specification of SpeechRecognizer does not promise that this method gets called. This depends on the implementation of the recognizer service. Regarding Google's implementation: what it supports keeps changing unannounced, it does not have a public API nor even release notes.

You might be able to reuse my project Kõnele (http://kaljurand.github.io/K6nele/about/), which contains two implementations of SpeechRecognizer and an IME that uses them. One of the implementations offers continuous recognition of arbitrarily long audio input, using the Kaldi GStreamer server (https://github.com/alumae/kaldi-gstreamer-server). You would need to set up your own instance of the server porting it to the language that you want to recognize (unless you want to use the Estonian server that Kõnele uses by default).

like image 73
Kaarel Avatar answered Sep 27 '22 21:09

Kaarel