Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Joining .wav files without writing on disk in Python

I have a list of .wav files in binary format (they are coming from a websocket), which I want to join in a single binary .wav file to then do speech recognition with it. I have been able to make it work with the following code:

audio = [binary_wav1, binary_wav2,..., binary_wavN] # a list of .wav binary files coming from a socket
audio = [io.BytesIO(x) for x in audio]

# Join wav files
with wave.open('/tmp/input.wav', 'wb') as temp_input:
    params_set = False
    for audio_file in audio:
        with wave.open(audio_file, 'rb') as w:
            if not params_set:
                temp_input.setparams(w.getparams())
                params_set = True
            temp_input.writeframes(w.readframes(w.getnframes()))

# Do speech recognition
binary_audio = open('/tmp/input.wav', 'rb').read())
ASR(binary_audio)

The problem is that I don't want to write the file '/tmp/input.wav' in disk. Is there any way to do it without writing any file in the disk?

Thanks.

like image 337
Iñigo Casanueva Avatar asked May 24 '18 19:05

Iñigo Casanueva


People also ask

How do I play a WAV file in Python?

python-sounddevice In order to play WAV files, numpy and soundfile need to be installed, to open WAV files as NumPy arrays. The line containing sf. read() extracts the raw audio data, as well as the sampling rate of the file as stored in its RIFF header, and sounddevice.


1 Answers

The general solution for having a file but never putting it to disk is a stream. For this we use the io library which is the default library for working with in-memory streams. You even already use BytesIO earlier in your code it seems.

audio = [binary_wav1, binary_wav2,..., binary_wavN] # a list of .wav binary files coming from a socket
audio = [io.BytesIO(x) for x in audio]

# Join wav files

params_set = False
temp_file = io.BytesIO()
with wave.open(temp_file, 'wb') as temp_input:
    for audio_file in audio:
        with wave.open(audio_file, 'rb') as w:
            if not params_set:
                temp_input.setparams(w.getparams())
                params_set = True
            temp_input.writeframes(w.readframes(w.getnframes()))

#move the cursor back to the beginning of the "file"
temp_file.seek(0)
# Do speech recognition
binary_audio = temp_file.read()
ASR(binary_audio)

note I don't have any .wav files to try this out on. It's up to the wave library to handle the difference between real files and buffered streams properly.

like image 51
Aaron Avatar answered Sep 21 '22 12:09

Aaron