 

Open Source Speech Recognition Software in Java [closed]

I've been thinking lately about starting an application based on speech recognition, meaning that on certain results it performs specific tasks. I was wondering what the best way to proceed is. I'm considering either PC or Android. Java is my strongest programming language.

I've done some searching, but I still don't know the best way to approach this.

Should I have open-source software do the speech recognition part for me and build the rest around it? Or should I do the whole thing myself, and if so, is that possible in Java?

Any info will be appreciated.

Thank you in advance.

asked Aug 22 '13 by LefterisL


2 Answers

The best way to approach this is to use an existing recognition toolkit together with the language and acoustic models that come with it. You can train the models to fit your needs.

CMUSphinx is probably the best FOSS speech recognition toolkit out there. It also provides good Java integration and demo applications.
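As a rough sketch of what CMUSphinx (sphinx4) usage looks like from Java: the example below assumes the sphinx4-core and sphinx4-data artifacts are on the classpath, and the `resource:` model paths are the defaults bundled with sphinx4-data. It needs a microphone and the library, so it is not self-contained.

    import edu.cmu.sphinx.api.Configuration;
    import edu.cmu.sphinx.api.LiveSpeechRecognizer;
    import edu.cmu.sphinx.api.SpeechResult;

    public class SphinxDemo {
        public static void main(String[] args) throws Exception {
            Configuration configuration = new Configuration();
            // Default US English models shipped with the sphinx4-data package
            configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
            configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
            configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin");

            LiveSpeechRecognizer recognizer = new LiveSpeechRecognizer(configuration);
            recognizer.startRecognition(true); // true = clear previously buffered audio

            SpeechResult result;
            while ((result = recognizer.getResult()) != null) {
                System.out.println("You said: " + result.getHypothesis());
            }
            recognizer.stopRecognition();
        }
    }

From here you can swap the default language model for a small JSGF grammar if you only need a fixed set of commands, which usually improves accuracy for command-and-control applications.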

answered Nov 09 '22 by Uku Loskit


After evaluating several third-party speech recognition options, I found Google voice recognition to be by far the most accurate. There are two basic approaches to using it. The easiest is to launch an Intent and handle the results accordingly:

    // An arbitrary request code used to identify this request in onActivityResult()
    private static final int VOICE_RECOGNITION_REQUEST_CODE = 1234;

    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    intent.addFlags(Intent.FLAG_ACTIVITY_CLEAR_TOP);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    startActivityForResult(intent, VOICE_RECOGNITION_REQUEST_CODE);

then, in your `onActivityResult()`, you handle the matches returned by the service:

    /**
     * Handle the results from the recognition activity.
     */
    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        if (requestCode == VOICE_RECOGNITION_REQUEST_CODE && resultCode == RESULT_OK) {
            // The recognizer returns a list of candidate strings, best match first
            ArrayList<String> matches = data.getStringArrayListExtra(
                    RecognizerIntent.EXTRA_RESULTS);
            if (matches != null) {
                handleResults(matches);
            }
        }
    }

The second approach is more involved, but it allows for better handling of error conditions that can occur while the recognition service is running. With this approach, you create your own recognition listener and callback methods. For example:

start listening:

mSpeechRecognizer.startListening(mRecognizerIntent);

where mSpeechRecognizer and mRecognizerIntent are set up beforehand (for example, in onCreate()):

    mSpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(getBaseContext());
    mSpeechRecognizer.setRecognitionListener(mRecognitionListener);
    mRecognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    mRecognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    mRecognizerIntent.putExtra("calling_package", "com.you.package");

then, create your listener:

    private RecognitionListener mRecognitionListener = new RecognitionListener() {

        public void onBufferReceived(byte[] buffer) {
            //Log.d(TAG, "onBufferReceived");
        }

        public void onError(int error) {
            // here is where you handle the error,
            // e.g. SpeechRecognizer.ERROR_NO_MATCH or ERROR_NETWORK
            Log.d(TAG, "onError: " + error);
        }

        public void onEvent(int eventType, Bundle params) {
            Log.d(TAG, "onEvent");
        }

        public void onPartialResults(Bundle partialResults) {
            Log.d(TAG, "onPartialResults");
        }

        public void onReadyForSpeech(Bundle params) {
            Log.d(TAG, "onReadyForSpeech");
        }

        public void onResults(Bundle results) {
            Log.d(TAG, ">>> onResults");
            ArrayList<String> matches = results.getStringArrayList(
                    SpeechRecognizer.RESULTS_RECOGNITION);
            handleResults(matches);
        }

        public void onRmsChanged(float rmsdB) {
            //Log.d(TAG, "onRmsChanged");
        }

        public void onBeginningOfSpeech() {
            Log.d(TAG, "onBeginningOfSpeech");
        }

        public void onEndOfSpeech() {
            Log.d(TAG, "onEndOfSpeech");
        }
    };

You can then implement handleResults() to do whatever you want with the matches.
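handleResults() is entirely application-specific, but as a minimal plain-Java sketch of the "on certain results do specific tasks" idea from the question, you could map recognized phrases to actions. The phrases, the `CommandDispatcher` class, and the `lastAction` field below are made-up placeholders, not part of any Android API:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CommandDispatcher {
    // Hypothetical phrase-to-action map; replace with your own commands
    private static final Map<String, Runnable> COMMANDS = new LinkedHashMap<>();

    // Records which action ran, purely for demonstration
    public static String lastAction = "";

    static {
        COMMANDS.put("open browser", () -> lastAction = "browser");
        COMMANDS.put("play music", () -> lastAction = "music");
    }

    /**
     * Scan the recognizer's candidate list (best match first) and
     * run the first candidate that corresponds to a known command.
     * Returns false if nothing matched, so the caller can re-prompt.
     */
    public static boolean handleResults(List<String> matches) {
        for (String candidate : matches) {
            Runnable action = COMMANDS.get(candidate.toLowerCase().trim());
            if (action != null) {
                action.run();
                return true;
            }
        }
        return false;
    }
}
```

Iterating the candidates in order matters: the recognizer returns its best guess first, so you act on the highest-confidence phrase that your app actually understands.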

answered Nov 09 '22 by droideckar