
Does TensorFlow Audio/Speech Recognition work with multi-word trigger keywords?

Related link: https://www.tensorflow.org/tutorials/sequences/audio_recognition

How should I modify my TensorFlow "Simple Audio Recognition" training environment (number of input samples, choice of trigger keywords, training parameters, etc.) to get robust recognition of a unique trigger keyword (multi-word or single-word) in a normal conversation?

The original TensorFlow "Simple Audio Recognition" comes with 10 single trigger keywords, each 1 second in duration. To avoid single trigger keywords being detected in normal conversation and causing false positives, I recorded the following two multi-word trigger keywords 400 times each (100 times by each of 4 different people), each 1.5 seconds in duration: PLAY MUSIC, STOP MUSIC. After following the exact same training steps and compensating for the new 1.5-second duration in the code, I am getting 100% recognition of these two multi-word trigger keywords when they are pronounced correctly. However, further testing shows that I am also getting false positives during normal speech whenever any single word of these trigger keywords is pronounced, e.g. STOP BLA BLA BLA, STOP VIDEO, PLAY BLA BLA BLA, PLAY VIDEO, etc.
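For reference, "compensating for the new 1.5-second duration" mainly means the model's input size changes with the clip length. A minimal sketch of that arithmetic, mirroring the kind of calculation the tutorial does when preparing model settings (the function name and defaults here are illustrative, not the tutorial's exact API):

```python
# Sketch: how the spectrogram input size scales when moving from
# 1.0 s to 1.5 s clips, assuming 16 kHz audio with a 30 ms analysis
# window and a 10 ms stride (typical values for this tutorial).

def spectrogram_shape(clip_duration_ms,
                      window_size_ms=30.0,
                      window_stride_ms=10.0,
                      sample_rate=16000):
    desired_samples = int(sample_rate * clip_duration_ms / 1000)
    window_size_samples = int(sample_rate * window_size_ms / 1000)
    window_stride_samples = int(sample_rate * window_stride_ms / 1000)
    length_minus_window = desired_samples - window_size_samples
    if length_minus_window < 0:
        spectrogram_length = 0
    else:
        spectrogram_length = 1 + length_minus_window // window_stride_samples
    return desired_samples, spectrogram_length

# 1.0 s clips -> 16000 samples, 98 spectrogram frames
# 1.5 s clips -> 24000 samples, 148 spectrogram frames
print(spectrogram_shape(1000))
print(spectrogram_shape(1500))
```

Every place in the training pipeline that assumes the 1-second frame count has to pick up the new value, otherwise the network input and the feature extractor disagree.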

Thank you for your kind response, PM

asked Feb 28 '26 by P. Montazemi


1 Answer

You should add garbage speech (unrelated words and background conversation, as an "unknown" class) to the training dataset; it is not clear whether you did that.
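A minimal sketch of what that means when building the training file list: anything that is not one of the wanted phrases gets routed into a catch-all unknown class, so the softmax has somewhere to put ordinary speech instead of forcing it onto a keyword. The label names and helper functions below are illustrative, not part of the tutorial's API:

```python
import random

WANTED = ["play_music", "stop_music"]

def label_for(clip_word):
    # Map every recording that is not a wanted phrase to a single
    # "_unknown_" class, so the model learns to reject ordinary speech.
    return clip_word if clip_word in WANTED else "_unknown_"

def build_training_list(clips, unknown_fraction=0.3, seed=0):
    """clips: list of (wav_path, spoken_word) pairs.
    Keeps all wanted-phrase clips and mixes in a fraction of
    unknown clips relative to the wanted count."""
    wanted = [(c, w) for c, w in clips if w in WANTED]
    unknown = [(c, "_unknown_") for c, w in clips if w not in WANTED]
    random.Random(seed).shuffle(unknown)
    n_unknown = int(len(wanted) * unknown_fraction)
    return wanted + unknown[:n_unknown]
```

Without such negative examples the network has never seen "stop video", so the closest class it knows is "stop music", which is exactly the false-positive pattern you describe.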

For longer phrases, it is more reliable to detect the smaller chunks and check that they are all present - i.e. to have a separate detector for "play" and another for "music".

For example, Google separately detects "ok" and "google" in their "ok google" hotword, as described in "Small-Footprint Keyword Spotting Using Deep Neural Networks".

answered Mar 02 '26 by Nikolay Shmyrev


