So, I'm trying to use the Python wave module to read all of the frames from an audio file, examine them, and then write them back to another file. When I tried writing the sound I'm reading out to another file, it came out either as noise or as no sound at all, so I'm pretty sure I'm not analyzing the file and getting the correct frames. I'm dealing with a stereo 16-bit sound file. While I could use a simpler file just to understand the process, I eventually want to be able to accept any kind of sound file, so I need to understand what the problem is.
I also noticed that 32-bit sound files wouldn't be read by the wave module - it gave me an "unknown format" error. Any ideas about that? Is it something I can bypass so that I could at least read 32-bit audio files, even if I can only 'render' 16-bit files?
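For reference, here's roughly how the failure shows up (a minimal sketch; the filename is just a placeholder, and I'm assuming the rejected file is 32-bit IEEE float, which is format tag 3 - the wave module only understands plain PCM):

import wave

try:
    fp = wave.open('./sinewave32.wav', 'rb')  # placeholder 32-bit file
except wave.Error as e:
    print e  # prints something like "unknown format: 3" for an IEEE-float file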
I'm somewhat aware that wave file data is interleaved between the left and right channels (the first sample is for the left channel, the second for the right, and so on), but how do I separate the channels? Here's my code; I cut out the output code to just see whether I'm reading the files correctly. I'm using Python 2.7.2:
import scipy
import wave
import struct
import numpy
import pylab
fp = wave.open('./sinewave16.wav', 'rb') # Problem loading certain kinds of wave files in binary?
samplerate = fp.getframerate()
totalsamples = fp.getnframes()
fft_length = 2048 # Guess
num_fft = (totalsamples / fft_length) - 2
temp = numpy.zeros((num_fft, fft_length), float)
leftchannel = numpy.zeros((num_fft, fft_length), float)
rightchannel = numpy.zeros((num_fft, fft_length), float)
for i in range(num_fft):
    tempb = fp.readframes(fft_length / fp.getnchannels() / fp.getsampwidth())
    #tempb = fp.readframes(fft_length)
    up = struct.unpack("%dB" % fft_length, tempb)
    #up = struct.unpack("%dB" % (fft_length * fp.getnchannels() * fp.getsampwidth()), tempb)
    #print len(up)
    temp[i,:] = numpy.array(up, float) - 128.0
temp = temp * numpy.hamming(fft_length)
temp.shape = (-1, fp.getnchannels())
fftd = numpy.fft.rfft(temp)
pylab.plot(abs(fftd[:,1]))
pylab.show()
#The frequency of each FFT bin works as follows: the first bin is DC (0 Hz),
#the second bin is Fs / N, where Fs is the sample rate and N is the FFT size,
#the next bin is 2 * Fs / N, and in general the nth bin is n * Fs / N.
#(So n * Fs / N gives you the frequency in hertz, and sqrt(real**2 + imag**2)
#gives you the magnitude of that component.)
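#To make that concrete (just a sketch reusing the samplerate, fft_length, and
#fftd variables above; numpy.abs computes sqrt(real**2 + imag**2) for you):
bins = numpy.arange(fft_length / 2 + 1)        # rfft returns N/2 + 1 bins
freqs = bins * float(samplerate) / fft_length  # the nth bin sits at n * Fs / N
magnitudes = numpy.abs(fftd)                   # per-bin magnitudes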
Currently, this loads the sound file, unpacks it with struct, and plots the result so that I can look at it, but I don't think it's reading all of the audio file, or it's not reading it correctly. Am I unpacking the wave data into samples correctly? Are there any up-to-date resources on using Python to read and analyze wave / audio files? Any help would be greatly appreciated.
Two things jump out. First, "%dB" unpacks the data as unsigned bytes, but a 16-bit file holds signed two-byte samples, so each pair of bytes is one sample and should be unpacked with "%dh" (signed short); the "- 128.0" offset correction only applies to unsigned 8-bit audio. Second, readframes() counts frames (one sample per channel, or nchannels * sampwidth bytes each), not bytes, so to get fft_length samples you'd read fft_length / nchannels frames and unpack fft_length shorts.
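Here's a minimal sketch of a corrected read (assuming a 16-bit stereo PCM file; the filename is the one from your code), including splitting the interleaved channels:

import wave
import struct
import numpy

fp = wave.open('./sinewave16.wav', 'rb')
nchannels = fp.getnchannels()
nframes = fp.getnframes()

# readframes() counts frames; each frame holds one sample per channel.
raw = fp.readframes(nframes)
fp.close()

# 16-bit PCM samples are signed shorts ('h'); no 128.0 offset is needed.
samples = struct.unpack("%dh" % (nframes * nchannels), raw)
samples = numpy.array(samples, dtype=float)

# Stereo frames are interleaved L, R, L, R, ... so slicing de-interleaves.
left = samples[0::2]
right = samples[1::2]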
Alternatively, perhaps you should try the SciPy io.wavfile module, which may also handle some files the wave module rejects:
http://docs.scipy.org/doc/scipy/reference/io.html
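A sketch of the same read with scipy.io.wavfile (for a 16-bit PCM file the data comes back as an int16 array with one column per channel, so there's no manual unpacking or de-interleaving; this assumes the file has at least 2048 samples):

from scipy.io import wavfile
import numpy

samplerate, data = wavfile.read('./sinewave16.wav')
left = data[:, 0].astype(float)    # stereo data: one column per channel
right = data[:, 1].astype(float)

# Window one chunk and look at its spectrum, roughly as in your code.
chunk = left[:2048] * numpy.hamming(2048)
spectrum = numpy.fft.rfft(chunk)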