I feel like this is a fairly common problem, but I haven't yet found a suitable answer. I have many audio files of human speech that I would like to split at word boundaries. This can be done heuristically by looking at pauses in the waveform, but can anyone point me to a Python function or library that does this automatically?
This is a Python code snippet that I use for splitting files as needed. I use the pydub library from https://github.com/jiaaro/pydub. You can modify the snippet to suit your requirements.

    from pydub import AudioSegment
    t1 = t1 * 1000  # Works in milliseconds
    t2 = t2 * 1000
    newAudio = AudioSegment.
An easier way to do this is to use the pydub module. Its recently added silence utilities do all the heavy lifting, such as setting the silence threshold and the minimum silence length, and simplify the code significantly compared with the other methods mentioned.
Here is a demo implementation, with inspiration from here.
Setup:
I had an audio file with spoken English letters from A to Z in the file "a-z.wav". A sub-directory splitAudio was created in the current working directory. Upon executing the demo code, the file was split into 26 separate files, with each audio file storing one syllable.
Observations: Some of the syllables were cut off, so the following parameters may need adjusting: min_silence_len=500 and silence_thresh=-16. You may want to tune these to your own requirements.
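To get a feel for what a threshold such as -16 dBFS means when tuning, the following is a rough, library-free sketch of the usual dBFS arithmetic (level relative to full scale, in decibels). The helper names rms_dbfs and amplitude_for_dbfs are made up for illustration and are not part of pydub; the sketch assumes signed 16-bit PCM samples by default.

```python
import math


def rms_dbfs(samples, sample_width_bytes=2):
    """RMS level of integer PCM samples in dBFS (0 dBFS = full scale).

    Hypothetical helper for illustration; not a pydub function.
    """
    if not samples:
        return float("-inf")
    # Full-scale amplitude, e.g. 32768 for 16-bit audio.
    full_scale = float(2 ** (8 * sample_width_bytes - 1))
    rms = math.sqrt(sum(s * s for s in samples) / float(len(samples)))
    if rms == 0:
        return float("-inf")
    return 20.0 * math.log10(rms / full_scale)


def amplitude_for_dbfs(dbfs, sample_width_bytes=2):
    """Linear amplitude that corresponds to a given dBFS level."""
    full_scale = float(2 ** (8 * sample_width_bytes - 1))
    return full_scale * 10 ** (dbfs / 20.0)


# A signal at half of full scale sits at roughly -6 dBFS.
half_scale_level = rms_dbfs([16384, -16384, 16384, -16384])

# -16 dBFS corresponds to roughly 16% of full scale.
threshold_amplitude = amplitude_for_dbfs(-16)
```

Under these assumptions, -16 dBFS is a fairly loud cutoff, so the soft start or end of a syllable may fall below it and be treated as silence. If syllables are being clipped, trying a lower (more negative) threshold is one plausible adjustment.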
Demo Code:
    from pydub import AudioSegment
    from pydub.silence import split_on_silence

    sound_file = AudioSegment.from_wav("a-z.wav")
    audio_chunks = split_on_silence(
        sound_file,
        # must be silent for at least half a second
        min_silence_len=500,
        # consider it silent if quieter than -16 dBFS
        silence_thresh=-16
    )

    # write each chunk to the splitAudio sub-directory (must already exist)
    for i, chunk in enumerate(audio_chunks):
        out_file = ".//splitAudio//chunk{0}.wav".format(i)
        # string concatenation prints the same on Python 2 and Python 3
        print("exporting " + out_file)
        chunk.export(out_file, format="wav")
Output:
    Python 2.7.9 (default, Dec 10 2014, 12:24:55) [MSC v.1500 32 bit (Intel)] on win32
    Type "copyright", "credits" or "license()" for more information.
    >>> ================================ RESTART ================================
    >>>
    exporting .//splitAudio//chunk0.wav
    exporting .//splitAudio//chunk1.wav
    exporting .//splitAudio//chunk2.wav
    exporting .//splitAudio//chunk3.wav
    exporting .//splitAudio//chunk4.wav
    exporting .//splitAudio//chunk5.wav
    exporting .//splitAudio//chunk6.wav
    exporting .//splitAudio//chunk7.wav
    exporting .//splitAudio//chunk8.wav
    exporting .//splitAudio//chunk9.wav
    exporting .//splitAudio//chunk10.wav
    exporting .//splitAudio//chunk11.wav
    exporting .//splitAudio//chunk12.wav
    exporting .//splitAudio//chunk13.wav
    exporting .//splitAudio//chunk14.wav
    exporting .//splitAudio//chunk15.wav
    exporting .//splitAudio//chunk16.wav
    exporting .//splitAudio//chunk17.wav
    exporting .//splitAudio//chunk18.wav
    exporting .//splitAudio//chunk19.wav
    exporting .//splitAudio//chunk20.wav
    exporting .//splitAudio//chunk21.wav
    exporting .//splitAudio//chunk22.wav
    exporting .//splitAudio//chunk23.wav
    exporting .//splitAudio//chunk24.wav
    exporting .//splitAudio//chunk25.wav
    exporting .//splitAudio//chunk26.wav
    >>>
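For anyone curious how the two parameters interact, here is a simplified, library-free sketch of the pause heuristic mentioned in the question. It is not pydub's implementation: the function name split_on_pauses and its parameters are invented for illustration, and it uses the peak amplitude per frame for simplicity rather than an RMS level in dBFS.

```python
def split_on_pauses(samples, frame_len, min_silent_frames, silence_amp):
    """Split PCM samples at sufficiently long runs of quiet frames.

    Hypothetical helper for illustration; not pydub's algorithm.
    A frame is quiet when its peak absolute amplitude is below silence_amp.
    A split happens only after min_silent_frames quiet frames in a row.
    Returns a list of (start, end) sample indices of the non-silent chunks.
    """
    # Classify every frame as quiet or loud.
    quiet = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        quiet.append(max(abs(s) for s in frame) < silence_amp)

    chunks = []
    chunk_start = None  # frame index where the open chunk began
    silent_run = 0      # number of consecutive quiet frames seen so far
    for i, is_quiet in enumerate(quiet):
        if is_quiet:
            silent_run += 1
            if chunk_start is not None and silent_run >= min_silent_frames:
                # Pause is long enough: end the chunk where the pause began.
                first_quiet = i - silent_run + 1
                chunks.append((chunk_start * frame_len, first_quiet * frame_len))
                chunk_start = None
        else:
            if chunk_start is None:
                chunk_start = i
            silent_run = 0

    # Close a chunk that is still open at the end, trimming trailing quiet frames.
    if chunk_start is not None:
        end_frame = len(quiet) - silent_run
        chunks.append((chunk_start * frame_len,
                       min(end_frame * frame_len, len(samples))))
    return chunks


# Toy signal: sound, long pause, sound, short pause, sound.
toy = [1000] * 4 + [0] * 6 + [1000] * 4 + [0] * 2 + [1000] * 2

# Requiring 3 quiet frames splits only at the long pause.
long_pauses_only = split_on_pauses(toy, 2, 3, 100)

# Requiring 1 quiet frame also splits at the short pause.
every_pause = split_on_pauses(toy, 2, 1, 100)
```

In this toy model, min_silent_frames plays the role of min_silence_len and silence_amp plays the role of silence_thresh: raising the first merges chunks separated by short pauses, while raising the second makes more of the signal count as silence.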