
Keyword Spotting in Speech [closed]

Is anyone aware of a keyword spotting (KWS) system that is freely available, ideally with an API?

CMU Sphinx 4 and MS Speech API are speech recognition engines, and cannot be used for KWS.

SRI has a keyword spotting system, but there are no download links, not even for evaluation. (I couldn't even find a contact link for requesting their software.)

I found one here, but it is only a limited demo.

asked Mar 03 '11 17:03 by FearUs



1 Answer

CMUSphinx implements keyword spotting in the pocketsphinx engine; see the FAQ entry for details.

To recognize a single keyphrase, you can run the decoder in "keyphrase search" mode.

From command line try:

pocketsphinx_continuous -infile file.wav -keyphrase "oh mighty computer" -kws_threshold 1e-20

From the code:

 ps_set_keyphrase(ps, "keyphrase_search", "oh mighty computer");
 ps_set_search(ps, "keyphrase_search");
 ps_start_utt(ps);
 /* process data */

You can also find examples for Python and Android/Java in our sources. The Python code looks like this (full example here):

# Process audio chunk by chunk. When the keyphrase is detected, perform an action and restart the search.
# `config` and `stream` are assumed to be set up earlier (see the full example).
decoder = Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if not buf:
        break
    decoder.process_raw(buf, False, False)
    if decoder.hyp() is not None:
        print([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
        print("Detected keyphrase, restarting search")
        decoder.end_utt()
        decoder.start_utt()

The threshold must be tuned for each keyphrase on test data to get the right balance between missed detections and false alarms. You can try values from 1e-5 to 1e-50.
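
The tuning described above can be sketched as a simple sweep: run the detector over labeled test clips at several thresholds and count both kinds of error. This is only an illustration. The names `count_errors`, `sweep` and `fake_detect` are hypothetical, and the stand-in detector (which fires when a made-up per-clip score is at least the threshold) is an assumption that merely mimics "a larger threshold is stricter"; it is not pocketsphinx's actual scoring. In practice `detect` would wrap a real decoder run.

```python
def count_errors(detect, clips, threshold):
    """Return (missed, false_alarms) for one threshold over labeled clips."""
    missed = 0
    false_alarms = 0
    for clip, has_keyphrase in clips:
        fired = detect(clip, threshold)
        if has_keyphrase and not fired:
            missed += 1          # keyphrase was spoken but not detected
        elif fired and not has_keyphrase:
            false_alarms += 1    # detector fired on a clip without the keyphrase
    return missed, false_alarms


def sweep(detect, clips, thresholds):
    """Map each candidate threshold to its (missed, false_alarms) pair."""
    return {t: count_errors(detect, clips, t) for t in thresholds}


# Hypothetical stand-in: each "clip" is just a made-up score, and the
# detector fires when that score reaches the threshold (an assumption).
def fake_detect(score, threshold):
    return score >= threshold


# (clip, keyphrase actually present?) pairs; the values are invented test data.
clips = [(1e-10, True), (1e-25, True), (1e-45, True),
         (1e-30, False), (1e-60, False)]

results = sweep(fake_detect, clips, [1e-5, 1e-20, 1e-40, 1e-50])
for threshold, (missed, false_alarms) in results.items():
    print(f"{threshold:g}: missed={missed} false_alarms={false_alarms}")
```

From such a table you would pick, per keyphrase, the threshold whose trade-off between misses and false alarms suits your application.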

For best accuracy, use a keyphrase of 3-4 syllables. Phrases that are too short are easily confused.

You can also search for multiple keyphrases. Create a file keyphrase.list like this:

  oh mighty computer /1e-40/
  hello world /1e-30/
  other_phrase /other_phrase_threshold/

Then pass it to the decoder with the -kws configuration option:

  pocketsphinx_continuous -inmic yes -kws keyphrase.list
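
If you tune thresholds per phrase, it may be convenient to generate the list file from code. The following is a minimal sketch that assumes the one-phrase-per-line `phrase /threshold/` layout shown above; the helper names `format_keyphrase_list` and `parse_keyphrase_list` are hypothetical and not part of pocketsphinx.

```python
def format_keyphrase_list(thresholds):
    """Render {phrase: threshold} as 'phrase /threshold/' lines."""
    return "".join(
        f"{phrase} /{threshold:g}/\n" for phrase, threshold in thresholds.items()
    )


def parse_keyphrase_list(text):
    """Read 'phrase /threshold/' lines back into a {phrase: threshold} dict."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        phrase, raw = line.rsplit(" /", 1)
        result[phrase] = float(raw.rstrip("/"))
    return result


phrases = {"oh mighty computer": 1e-40, "hello world": 1e-30}
listing = format_keyphrase_list(phrases)
print(listing, end="")

# To use it with the decoder, write the text to the file passed to -kws:
# with open("keyphrase.list", "w") as f:
#     f.write(listing)
```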

This feature is not yet implemented in the sphinx4 decoder.

answered Oct 21 '22 05:10 by Nikolay Shmyrev