Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I do real-time voice activity detection in Python?

I am performing a voice activity detection on the recorded audio file to detect speech vs non-speech portions in the waveform.

The output of the classifier looks like (highlighted green regions indicate speech):

enter image description here

The only issue I face here is making it work for a stream of audio input (for eg: from a microphone) and do real-time analysis for a stipulated time-frame.

I know PyAudio can be used to record speech from the microphone dynamically and there a couple of real-time visualization examples of a waveform, spectrum, spectrogram, etc, but could not find anything relevant to carrying out feature extraction in a near real-time manner.

like image 745
Nickil Maveli Avatar asked Mar 24 '20 13:03

Nickil Maveli


People also ask

Can python be used for voice recognition?

It allows computers to understand human language. Speech recognition is a machine's ability to listen to spoken words and identify them. You can then use speech recognition in Python to convert the spoken words into text, make a query or give a reply. You can even program some devices to respond to these spoken words.

Which library is used for voice recognition in python?

For performing Speech Recognition, there is SpeechRecognition library which is open source and the best thing is that several engines and APIs provide in both modes online and offline mode. For Speech Recognition is in python.


1 Answers

You should try using Python bindings to webRTC VAD from Google. It's lightweight, fast and provides very reasonable results, based on GMM modelling. As the decision is provided per frame, the latency is minimal.

# Run the VAD on 10 ms of silence. The result should be False.
import webrtcvad
vad = webrtcvad.Vad(2)

sample_rate = 16000
frame_duration = 10  # ms
frame = b'\x00\x00' * int(sample_rate * frame_duration / 1000)
print('Contains speech: %s' % (vad.is_speech(frame, sample_rate))

Also, this article might be useful for you.

like image 138
igrinis Avatar answered Sep 18 '22 06:09

igrinis