I am trying to process an audio file in Python using modules such as numpy and struct, but I am having a hard time detecting silence in the file, that is, finding where the silent stretches are. One of the methods I came across is to slide a window of fixed duration over the audio signal and record the sum of squared samples in each window. I am new to Python and have not been able to implement this method.
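The sliding-window idea described above can be sketched in plain Python as follows. This is only a minimal illustration, not a tuned detector: it assumes the samples have already been decoded to floats in the range -1.0 to 1.0, it uses non-overlapping windows, and the function names, the 50 ms window and the 1e-3 threshold are all hypothetical choices made for the example.

```python
import math


def window_energies(samples, window_size):
    """Sum of squared samples for each consecutive, non-overlapping window."""
    return [
        sum(s * s for s in samples[i:i + window_size])
        for i in range(0, len(samples) - window_size + 1, window_size)
    ]


def detect_silence_by_energy(samples, sample_rate, window_s=0.05, threshold=1e-3):
    """Return (start_sec, stop_sec) ranges whose mean-square energy is below threshold.

    samples is assumed to hold floats in [-1.0, 1.0]; threshold is an
    illustrative value that would need tuning for real recordings.
    """
    window_size = max(1, int(round(window_s * sample_rate)))
    energies = window_energies(samples, window_size)
    ranges = []
    start = None  # index of the first window of the current quiet run
    for idx, energy in enumerate(energies):
        quiet = energy / window_size < threshold
        if quiet and start is None:
            start = idx
        elif not quiet and start is not None:
            ranges.append((start * window_size / sample_rate,
                           idx * window_size / sample_rate))
            start = None
    if start is not None:  # the signal ended during a quiet run
        ranges.append((start * window_size / sample_rate,
                       len(energies) * window_size / sample_rate))
    return ranges


# Synthetic test signal: 1 s of a 440 Hz tone, 1 s of silence, 1 s of tone.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
signal = tone + [0.0] * sample_rate + tone

silent_ranges = detect_silence_by_energy(signal, sample_rate)
print(silent_ranges)  # [(1.0, 2.0)]
```

With a real file you would first read the samples (for example with the wave module and struct or numpy) and scale them to the same range before calling the function; shorter windows give finer time resolution at the cost of noisier energy estimates.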
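For better results with pydub, set silence_thresh relative to the file's average loudness (dBFS) instead of using a fixed value: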
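from pydub import AudioSegment, silence

myaudio = AudioSegment.from_mp3("RelativityOverview.mp3")
# Average loudness of the whole file, in dBFS
dBFS = myaudio.dBFS
# Treat anything 16 dB below the average, lasting at least 1000 ms, as silence
silent_ranges = silence.detect_silence(myaudio, min_silence_len=1000, silence_thresh=dBFS-16)
# detect_silence returns [start, stop] pairs in milliseconds; convert to seconds
silent_ranges = [(start/1000, stop/1000) for start, stop in silent_ranges]
print(silent_ranges)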
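To see what an offset such as dBFS-16 means in terms of signal level, the short sketch below converts between decibels and amplitude ratios. It assumes the usual amplitude convention of 20*log10(ratio), which to my understanding is how pydub computes dBFS from the RMS level; the helper names and the -20 dBFS average are hypothetical and only serve the example.

```python
import math


def amplitude_to_db(ratio):
    """Convert an amplitude ratio (1.0 = full scale) to decibels."""
    return 20 * math.log10(ratio)


def db_to_amplitude(db):
    """Convert decibels back to an amplitude ratio."""
    return 10 ** (db / 20)


average_dbfs = -20.0                 # hypothetical average loudness of a file
threshold_dbfs = average_dbfs - 16   # same offset as in the snippet above

# 16 dB below the average is roughly 16% of the average RMS amplitude.
relative_level = db_to_amplitude(-16)
# The resulting threshold as a fraction of full scale.
threshold_amplitude = db_to_amplitude(threshold_dbfs)

print(round(relative_level, 3))
print(round(threshold_amplitude, 4))
```

By the same arithmetic, a fixed threshold of -16 dBFS corresponds to about 16% of full scale, so in a quiet recording much of the audio may fall below it and be reported as silence.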
If you are open to outside libraries, one quick way to do this is with pydub.
pydub has a module called silence with the functions detect_silence and detect_nonsilent, which may be useful in your case.
However, the only caveat is that silence needs to be at least half a second.
Below is a sample implementation that I tried using an audio file.
However, since silence in my case was less than half a second, only a few of the silent ranges were correct.
You may want to try this and see if it works for you by tweaking min_silence_len and silence_thresh.
Program
from pydub import AudioSegment, silence
myaudio = AudioSegment.from_wav("a-z-vowels.wav")
# min_silence_len is in milliseconds; silence_thresh is in dBFS
silent_ranges = silence.detect_silence(myaudio, min_silence_len=1000, silence_thresh=-16)
silent_ranges = [((start/1000),(stop/1000)) for start,stop in silent_ranges] #convert to sec
print(silent_ranges)
Result
[(0, 1), (1, 14), (14, 20), (19, 26), (26, 27), (28, 30), (29, 32), (32, 34), (33, 37), (37, 41), (42, 46), (46, 47), (48, 52)]