Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Short Time Fourier Transform in python

Tags:

python

fft

I want to get frequency with maximum power for every moment in wav file. So I wrote STFT in Python using fft from scipy. I used kaiser window function from scipy. Everything looking great, but my output looks strange. It has some very small numbers and some very high.

here is the output for one wav file: http://pastebin.com/5Ryd2uXj and here is the code in python:

import scipy, pylab
import wave
import struct
import sys

def stft(data, cp, do, hop):
    dos = int(do*cp)
    w = scipy.kaiser(dos,12) //12 is very high for kaiser window
    temp=[]
    wyn=[]
    for i in range(0, len(data)-dos, hop):
        temp=scipy.fft(w*data[i:i+dos])
        max=-1
        for j in range(0, len(temp),1):
            licz=temp[j].real**2+temp[j].imag**2
            if( licz>max ):
                max = licz
                maxj = j
        wyn.append(maxj)
    #wyn = scipy.array([scipy.fft(w*data[i:i+dos])
        #for i in range(0, len(data)-dos, 1)])
    return wyn

file = wave.open(sys.argv[1])
bity = file.readframes(file.getnframes())
data=struct.unpack('{n}h'.format(n=file.getnframes()), bity)
file.close()

cp=44100 #sampling frequency
do=0.05 #window size
hop = 5

wyn=stft(data,cp,do,hop)
print len(wyn)
for i in range(0, len(wyn), 1):
    print wyn[i]
like image 935
user1226419 Avatar asked Feb 22 '12 17:02

user1226419


1 Answers

The actual FT of a sine wave is a pair of delta functions equidistant from 0-frequency. With a discrete function (samples), this is repeated every fs (sampling rate) in the frequency domain. Small errors in FFT computation will mean these two deltas (FT of your sine wave) will not be exactly the same height, so your algorithm is simply picking the taller one.

The scipy FFT function will give you frequency components with the domain [0, fs]. Since (as I mentioned above) this is periodic, these values could also be remapped as [-fs/2, fs/2] by swapping the result at the center point - look into using fftshift to do this. It sounds like you may only be interested in the positive frequencies, however, so you can simply discard the second half of the result of your FFT.

From the notes of scipy.fftpack.fft:

The packing of the result is “standard”: If A = fft(a, n), then A[0] contains the zero-frequency term, A[1:n/2+1] contains the positive-frequency terms, and A[n/2+1:] contains the negative-frequency terms, in order of decreasingly negative frequency. So for an 8-point transform, the frequencies of the result are [ 0, 1, 2, 3, 4, -3, -2, -1].

like image 151
aganders3 Avatar answered Sep 18 '22 04:09

aganders3