
Using Wave Python Module to Get and Write Audio

Tags: python, audio

I'm trying to use Python's wave module to read an audio file, get all of its frames, examine them, and then write them back to another file. When I wrote the sound I had read out to another file, it came out either as noise or as silence, so I'm pretty sure I'm not reading the frames correctly. I'm working with a stereo 16-bit sound file. I could use a simpler file to understand the process, but I eventually want to be able to accept any kind of sound file, so I need to understand what the problem is.

I also noticed that the wave module wouldn't read 32-bit sound files: it gave me an "Unknown Format" error. Any ideas about that? Is it something I can bypass so that I could at least read 32-bit audio files, even if I can only 'render' 16-bit files?

I'm somewhat aware that the samples in a wave file are interleaved across the left and right channels (the first sample is for the left channel, the second is for the right, etc.), but how do I separate the channels? Here's my code. I cut out the output code to just see if I'm reading the files correctly. I'm using Python 2.7.2:

import scipy
import wave
import struct
import numpy
import pylab

fp = wave.open('./sinewave16.wav', 'rb') # Problem loading certain kinds of wave files in binary?

samplerate = fp.getframerate()
totalsamples = fp.getnframes()
fft_length = 2048 # Guess
num_fft = (totalsamples / fft_length) - 2

temp = numpy.zeros((num_fft, fft_length), float)

leftchannel = numpy.zeros((num_fft, fft_length), float)
rightchannel = numpy.zeros((num_fft, fft_length), float)

for i in range(num_fft):

    tempb = fp.readframes(fft_length / fp.getnchannels() / fp.getsampwidth());

    #tempb = fp.readframes(fft_length)

    up = (struct.unpack("%dB"%(fft_length), tempb))

    #up = (struct.unpack("%dB"%(fft_length * fp.getnchannels() * fp.getsampwidth()), tempb))
    #print (len(up))
    temp[i,:] = numpy.array(up, float) - 128.0


temp = temp * numpy.hamming(fft_length)

temp.shape = (-1, fp.getnchannels())

fftd = numpy.fft.rfft(temp)

pylab.plot(abs(fftd[:,1]))

pylab.show()

#Frequency of an FFT should be as follows:

#The first bin in the FFT is DC (0 Hz), the second bin is Fs / N, where Fs is the sample rate and N is the size of the FFT. The next bin is 2 * Fs / N. To express this in general terms, the nth bin is n * Fs / N.
# (It would appear to me that n * Fs / N gives you the frequency in hertz, and you can use sqrt(real**2 + imag**2) to find the magnitude of the signal.)

Currently, this will load the sound file, unpack it into a struct, and plot the sound file so that I can look at it, but I don't think it's getting all of the audio file, or it's not getting it correctly. Am I reading the wave file into the struct correctly? Are there any up-to-date resources on using Python to read and analyze wave / audio files? Any help would be greatly appreciated.

SolarLune asked Mar 26 '12 06:03



1 Answer

Perhaps you should try the SciPy scipy.io.wavfile module:

http://docs.scipy.org/doc/scipy/reference/io.html
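
As a rough sketch of what that could look like (not tested against your sinewave16.wav): the block below builds its own one-second stereo 16-bit test file, so the file names, the 440/880 Hz tones and the variable names are all made up for illustration. It reads the file with scipy.io.wavfile.read, splits the channels by column, takes the spectrum of one 2048-sample block, and writes the data back out unchanged. With this module, 16-bit PCM arrives as signed int16, so there is no struct.unpack step and no 128 offset to subtract.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

# Build a one-second stereo 16-bit test file so the sketch is self-contained
# (hypothetical data: 440 Hz on the left channel, 880 Hz on the right).
rate = 44100
t = np.arange(rate) / float(rate)
left = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
right = (0.5 * 32767 * np.sin(2 * np.pi * 880 * t)).astype(np.int16)
path = os.path.join(tempfile.mkdtemp(), "stereo16.wav")
wavfile.write(path, rate, np.column_stack((left, right)))

# read() returns the sample rate and, for a multi-channel file, an array of
# shape (frames, channels). 16-bit PCM comes back as signed int16.
rate_in, data = wavfile.read(path)
left_in = data[:, 0]   # de-interleaving is just column selection
right_in = data[:, 1]

# Magnitude spectrum of one Hamming-windowed block from the left channel.
fft_length = 2048
block = left_in[:fft_length] * np.hamming(fft_length)
magnitude = np.abs(np.fft.rfft(block))
peak_bin = int(np.argmax(magnitude))
peak_hz = peak_bin * rate_in / float(fft_length)  # bin n sits at n * Fs / N

# Writing the untouched array back out should give an identical copy.
out_path = os.path.join(os.path.dirname(path), "copy.wav")
wavfile.write(out_path, rate_in, data)
```

The peak only lands on the nearest bin, so expect peak_hz to be within one bin width (Fs / N, about 21.5 Hz here) of the real tone. The SciPy documentation also lists 32-bit integer and 32-bit float WAV files as readable, which may help with the "Unknown Format" error, though I have not checked that against your files.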

steveha answered Sep 25 '22 12:09
