 

CMUSphinx PocketSphinx - Recognize all (or large amount) of words

Before I tried PocketSphinx for Android, I used Google's voice recognition API. I didn't need to set a search name or a dictionary file. It just recognized every word that was spoken.

Now, in PocketSphinx, I need to set both. But I can only find how to set up recognition for a single word, or how to set a dictionary (the ones in the demo project contain only a few words). The recognizer then assumes those are the only words that exist, so if someone says something similar, it reports the closest word listed in the dictionary.

How can I set several search names, or configure it to recognize all available words (or at least a large number of them)? Does someone have a dictionary file with a large number of words?

user3184899 asked Sep 20 '14 13:09



1 Answer

Before I tried PocketSphinx for Android, I used Google's voice recognition API. I didn't need to set a search name or a dictionary file. It just recognized every word that was spoken.

The Google API also recognizes a large but still limited set of words. For a long time it failed to recognize "Spotify". Google's offline speech recognizer uses about 50k words, as described in their publication.

How can I set several search names, or configure it to recognize all available words (or at least a large number of them)? Does someone have a dictionary file with a large number of words?

The demo includes large vocabulary speech recognition with a language model (the forecast part). Bigger language models for English are available for download, for example the En-US generic language model.

Simple code to run the recognition looks like this:

    recognizer = defaultSetup()
            .setAcousticModel(new File(assetsDir, "en-us-ptm"))
            .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
            .getRecognizer();
    recognizer.addListener(this);

    // Create an n-gram (language model) search.
    recognizer.addNgramSearch(NGRAM_SEARCH, new File(assetsDir, "en-us.lm.bin"));

    // Start the search
    recognizer.startListening(NGRAM_SEARCH);

However, large models are not easy to fit onto a device and decode in real time. If you want to decode large vocabulary speech in real time, you need to stream audio to a server. Otherwise you need to restrict the vocabulary and language to some small subset of generic English. You can learn more about speech recognition with CMUSphinx in the tutorial.
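As a rough illustration of the "restrict the vocabulary" option, the sketch below trims a pronunciation dictionary down to a chosen word list. It assumes the dictionary uses the CMUdict text layout (one entry per line, the word first, then its phonemes, with alternate pronunciations written as `word(2)`). `DictionaryFilter` and `filter` are hypothetical names made up for this example and are not part of PocketSphinx. This covers only the dictionary half: a language model or grammar limited to the same words would still have to be prepared separately.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper for illustration; not part of the PocketSphinx API. */
final class DictionaryFilter {

    /**
     * Keeps only the dictionary entries whose headword is in the vocabulary.
     * The vocabulary is assumed to contain lowercase words.
     */
    static List<String> filter(List<String> dictLines, Set<String> vocabulary) {
        List<String> kept = new ArrayList<>();
        for (String line : dictLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // The first whitespace-separated token is the headword.
            String head = trimmed.split("\\s+", 2)[0];
            // Strip an alternate-pronunciation marker: "hello(2)" -> "hello".
            int paren = head.indexOf('(');
            String word = (paren > 0) ? head.substring(0, paren) : head;
            if (vocabulary.contains(word.toLowerCase(Locale.ROOT))) {
                kept.add(trimmed);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Tiny in-memory sample standing in for a real dictionary file.
        List<String> dict = Arrays.asList(
                "hello HH AH L OW",
                "hello(2) HH EH L OW",
                "help HH EH L P",
                "world W ER L D");
        Set<String> vocab = new HashSet<>(Arrays.asList("hello", "world"));
        for (String entry : filter(dict, vocab)) {
            System.out.println(entry);
        }
    }
}
```

With the sample data in `main`, the "help" entry is dropped and both pronunciations of "hello" are kept. In a real app the lines would be read from the dictionary file and the filtered result written to a new file that is then passed to `setDictionary`.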

Nikolay Shmyrev answered Oct 07 '22 18:10
