I've managed to finally build and run pocketsphinx (pocketsphinx_continuous). The problem I'm running into, is how to a improve accuracy. From what I understand, you can specify a dictionary file (-dict test.dic). So I took the default dictionary file and added some more pronunciations of the same words, for example:
pencil P EH N S AH L
pencil(2) P EH N S IH L
spaghetti S P AH G EH T IY
spaghetti(2) S P UH G EH T IY
Yet pocketsphinx still does not recognize either word at all. I know there is a jsgf file you can specify as well , but that seems more for phrases and grammar. How can I get pocketsphinx to recognize common words such as pencil and spaghetti?
thanks
-Mike
Again, for overall accuracy, only three things are going to really help you: restricting the grammar, adapting the accoustic model, and perhaps getting higher quality recording input.
PocketSphinx is a library that depends on another library called SphinxBase which provides common functionality across all CMUSphinx projects. To install Pocketsphinx, you need to install both Pocketsphinx and Sphinxbase.
August 2022) CMU Sphinx, also called Sphinx for short, is the general term to describe a group of speech recognition systems developed at Carnegie Mellon University. These include a series of speech recognizers (Sphinx 2 - 4) and an acoustic model trainer (SphinxTrain). Sphinx4.
With something like this, you can't be certain, but I can offer the following suggestions:
Perhaps the language model somehow has low probabilities for "spaghetti" and "pencil". As you suggested, you could use a JSGF to test out how it does for recognition if it doesn't use the N-gram models, but instead does a simple grammar (give it like twenty words, including spaghetti and pencil). This way you can see if it is perhaps the language model which makes it difficult to recognize these words, and it can do okay if it considers all the words to have equal probability.
Perhaps you simply pronounce these words poorly, even with the alternative dictionary entries. Try either A. Testing other peoples' voices, or B. Adapting the acoustic model to your voice (see http://cmusphinx.sourceforge.net/wiki/tutorialam)
Also, what is it recognizing them as when it is failing? If possible, remove the words it misrecognizes as from the dictionary.
Again, for overall accuracy, only three things are going to really help you: restricting the grammar, adapting the accoustic model, and perhaps getting higher quality recording input.
To improve accuracy you may want to try adapting the acoustic model to your voice. http://cmusphinx.sourceforge.net/wiki/tutorialadapt
To learn how to add new words: http://ghatage.com/tech/2012/12/13/Make-Pocketsphinx-recognize-new-words/
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With