Voice Activity Detection from mic input on iOS

I'm developing an iOS app that does voice-based AI: it takes voice input from the microphone, turns it into text, sends it to an AI agent, then outputs the returned text through the speaker. I have everything working, but recording is currently started and stopped with a button (SpeechKit for voice recognition, API.AI for the AI, Amazon's Polly for the output).

What I still need is for the microphone to be always on, and for recording to start and stop automatically as the user begins and finishes talking. The app is being developed for an unorthodox context in which the user will have no access to the screen (but will have a high-end shotgun mic for recording their speech).

My research suggests this piece of the puzzle is known as 'Voice Activity Detection', and it seems to be one of the hardest steps in the whole voice-based AI system.

I'm hoping someone can either supply some straightforward (Swift) code so I can implement this myself, or point me toward some decent libraries / SDKs that I can integrate into this project.

asked Aug 06 '17 05:08

Mick Byrne

1 Answer

For a good VAD algorithm implementation, you can use py-webrtcvad.

It is a Python interface to C code, so you can import the C files from that project and call them from Swift.
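
Whichever detector makes the per-frame decision, the start/stop logic around it looks much the same. Below is a minimal sketch of that logic, assuming a plain RMS energy threshold as a stand-in for the per-frame decision. This is not the WebRTC algorithm and is far less robust to background noise. The names `EnergyVAD` and `VoiceEvent` and the default threshold and frame counts are hypothetical and would need tuning for a real microphone.

```swift
/// Events reported when the detector changes state.
enum VoiceEvent: Equatable {
    case speechStarted
    case speechEnded
}

/// Hypothetical helper: energy-threshold VAD with start/end debouncing.
/// Frames are expected as Float samples normalised to -1...1.
struct EnergyVAD {
    /// RMS level at or above which a frame counts as voiced (assumed value).
    let threshold: Float
    /// Consecutive voiced frames required before speech is reported.
    let framesToStart: Int
    /// Consecutive silent frames required before the end of speech is reported.
    let framesToEnd: Int

    private(set) var isSpeaking = false
    private var voicedRun = 0
    private var silentRun = 0

    init(threshold: Float = 0.02, framesToStart: Int = 3, framesToEnd: Int = 25) {
        self.threshold = threshold
        self.framesToStart = framesToStart
        self.framesToEnd = framesToEnd
    }

    /// Root-mean-square level of one frame (0 for an empty frame).
    static func rms(_ frame: [Float]) -> Float {
        guard !frame.isEmpty else { return 0 }
        let sumOfSquares = frame.reduce(Float(0)) { $0 + $1 * $1 }
        return (sumOfSquares / Float(frame.count)).squareRoot()
    }

    /// Feeds one frame and returns an event only when the state changes.
    mutating func process(frame: [Float]) -> VoiceEvent? {
        // Per-frame decision; a stronger detector could be swapped in here.
        let voiced = EnergyVAD.rms(frame) >= threshold
        if voiced {
            voicedRun += 1
            silentRun = 0
        } else {
            silentRun += 1
            voicedRun = 0
        }
        if !isSpeaking && voicedRun >= framesToStart {
            isSpeaking = true
            return .speechStarted
        }
        if isSpeaking && silentRun >= framesToEnd {
            isSpeaking = false
            return .speechEnded
        }
        return nil
    }
}
```

In an app, the frames would typically come from the microphone (for example an `AVAudioEngine` input tap), and the `speechStarted` / `speechEnded` events would take the place of the button that currently starts and stops recognition. Requiring several consecutive frames before switching state keeps short clicks from triggering a recording and short pauses from cutting the user off mid-sentence.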

answered Nov 09 '22 06:11

Nikolay Shmyrev

Tags: ios, artificial-intelligence, swift, voice-recognition, voice-recording
