
Voice Activity Detection from mic input on iOS

I'm developing an iOS app that does voice-based AI: it takes voice input from the microphone, turns it into text, sends the text to an AI agent, then plays the returned text through the speaker. I've got everything working (SpeechKit for voice recognition, API.AI for the AI, Amazon's Polly for the output), but it currently relies on a button to start and stop recording the speech.

The piece I still need is to keep the microphone always on and to start and stop recording automatically as the user begins and finishes talking. The app is being developed for an unorthodox context where the user will have no access to the screen (but they will have a high-end shotgun mic for recording their speech).

My research suggests this piece of the puzzle is known as 'Voice Activity Detection' and seems to be one of the hardest steps in the whole voice-based AI system.

I'm hoping someone can either supply some straightforward (Swift) code so I can implement this myself, or point me towards some decent libraries / SDKs that I can integrate into this project.

Mick Byrne asked Aug 06 '17 05:08





1 Answer

For a good VAD implementation you can use the code from py-webrtcvad.

It is a Python interface to the WebRTC voice activity detector, which is written in C. You can add the C files from that project to your app and call them from Swift through a bridging header.
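
If you want to see the start/stop logic before pulling in the C code, here is a minimal sketch of a much simpler technique: an energy threshold with debouncing. It is not the WebRTC algorithm and will be less robust in noisy rooms. The names `rmsLevel` and `EnergyVAD`, the threshold value, and the frame counts are all assumptions made up for illustration. It also assumes you already get frames of 16-bit PCM samples from the microphone (for example from an audio tap) and only shows the decision logic.

```swift
import Foundation

/// Hypothetical helper: RMS level of one frame of 16-bit PCM, scaled to 0...1.
func rmsLevel(of frame: [Int16]) -> Double {
    guard !frame.isEmpty else { return 0 }
    let sumOfSquares = frame.reduce(0.0) { acc, sample in
        let x = Double(sample) / 32768.0
        return acc + x * x
    }
    return (sumOfSquares / Double(frame.count)).squareRoot()
}

/// Hypothetical energy-threshold detector. A run of loud frames starts
/// speech; a longer run of quiet frames ends it, so short pauses between
/// words do not stop the recording.
struct EnergyVAD {
    enum Event: Equatable { case speechStarted, speechEnded }

    let threshold: Double    // assumed value; tune for the mic and room
    let framesToStart: Int   // consecutive loud frames needed to start
    let framesToEnd: Int     // consecutive quiet frames needed to end

    private(set) var isSpeaking = false
    private var loudRun = 0
    private var quietRun = 0

    init(threshold: Double = 0.02, framesToStart: Int = 3, framesToEnd: Int = 25) {
        self.threshold = threshold
        self.framesToStart = framesToStart
        self.framesToEnd = framesToEnd
    }

    /// Feed one frame; returns an event only when the state changes.
    mutating func process(frame: [Int16]) -> Event? {
        if rmsLevel(of: frame) >= threshold {
            loudRun += 1
            quietRun = 0
        } else {
            quietRun += 1
            loudRun = 0
        }

        if !isSpeaking && loudRun >= framesToStart {
            isSpeaking = true
            return .speechStarted
        }
        if isSpeaking && quietRun >= framesToEnd {
            isSpeaking = false
            return .speechEnded
        }
        return nil
    }
}
```

You would call `process(frame:)` once per incoming frame, start your recognizer on `.speechStarted` and stop it on `.speechEnded`. With 20 ms frames, the default `framesToEnd` of 25 means roughly half a second of silence ends the utterance. If this proves too sensitive to background noise, that is the point at which swapping in the WebRTC code is likely worth the effort.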

Nikolay Shmyrev answered Nov 09 '22 06:11
