Is there any general-purpose form of short-time Fourier transform with corresponding inverse transform built into SciPy or NumPy or whatever?
There's the pyplot specgram
function in matplotlib, which calls ax.specgram()
, which calls mlab.specgram()
, which calls _spectral_helper()
:
#The checks for if y is x are so that we can use the same function to #implement the core of psd(), csd(), and spectrogram() without doing #extra calculations. We return the unaveraged Pxy, freqs, and t.
but
This is a helper function that implements the commonality between the 204 #psd, csd, and spectrogram. It is NOT meant to be used outside of mlab
I'm not sure if this can be used to do an STFT and ISTFT, though. Is there anything else, or should I translate something like these MATLAB functions?
I know how to write my own ad-hoc implementation; I'm just looking for something full-featured, which can handle different windowing functions (but has a sane default), is fully invertible with COLA windows (istft(stft(x))==x
), tested by multiple people, no off-by-one errors, handles the ends and zero padding well, fast RFFT implementation for real input, etc.
Compute the Short Time Fourier Transform (STFT). STFTs can be used as a way of quantifying the change of a nonstationary signal's frequency and phase content over time.
Inverse short-time Fourier transform.
Here is my Python code, simplified for this answer:
import scipy, pylab def stft(x, fs, framesz, hop): framesamp = int(framesz*fs) hopsamp = int(hop*fs) w = scipy.hanning(framesamp) X = scipy.array([scipy.fft(w*x[i:i+framesamp]) for i in range(0, len(x)-framesamp, hopsamp)]) return X def istft(X, fs, T, hop): x = scipy.zeros(T*fs) framesamp = X.shape[1] hopsamp = int(hop*fs) for n,i in enumerate(range(0, len(x)-framesamp, hopsamp)): x[i:i+framesamp] += scipy.real(scipy.ifft(X[n])) return x
Notes:
blkproc
in Matlab. Instead of a for
loop, I apply a command (e.g., fft
) to each frame of the signal inside a list comprehension, and then scipy.array
casts it to a 2D-array. I use this to make spectrograms, chromagrams, MFCC-grams, and much more.istft
. In order to reconstruct the original signal the sum of the sequential window functions must be constant, preferably equal to unity (1.0). In this case, I've chosen the Hann (or hanning
) window and a 50% overlap which works perfectly. See this discussion for more information.A test:
if __name__ == '__main__': f0 = 440 # Compute the STFT of a 440 Hz sinusoid fs = 8000 # sampled at 8 kHz T = 5 # lasting 5 seconds framesz = 0.050 # with a frame size of 50 milliseconds hop = 0.025 # and hop size of 25 milliseconds. # Create test signal and STFT. t = scipy.linspace(0, T, T*fs, endpoint=False) x = scipy.sin(2*scipy.pi*f0*t) X = stft(x, fs, framesz, hop) # Plot the magnitude spectrogram. pylab.figure() pylab.imshow(scipy.absolute(X.T), origin='lower', aspect='auto', interpolation='nearest') pylab.xlabel('Time') pylab.ylabel('Frequency') pylab.show() # Compute the ISTFT. xhat = istft(X, fs, T, hop) # Plot the input and output signals over 0.1 seconds. T1 = int(0.1*fs) pylab.figure() pylab.plot(t[:T1], x[:T1], t[:T1], xhat[:T1]) pylab.xlabel('Time (seconds)') pylab.figure() pylab.plot(t[-T1:], x[-T1:], t[-T1:], xhat[-T1:]) pylab.xlabel('Time (seconds)')
Here is the STFT code that I use. STFT + ISTFT here gives perfect reconstruction (even for the first frames). I slightly modified the code given here by Steve Tjoa : here the magnitude of the reconstructed signal is the same as that of the input signal.
import scipy, numpy as np def stft(x, fftsize=1024, overlap=4): hop = fftsize / overlap w = scipy.hanning(fftsize+1)[:-1] # better reconstruction with this trick +1)[:-1] return np.array([np.fft.rfft(w*x[i:i+fftsize]) for i in range(0, len(x)-fftsize, hop)]) def istft(X, overlap=4): fftsize=(X.shape[1]-1)*2 hop = fftsize / overlap w = scipy.hanning(fftsize+1)[:-1] x = scipy.zeros(X.shape[0]*hop) wsum = scipy.zeros(X.shape[0]*hop) for n,i in enumerate(range(0, len(x)-fftsize, hop)): x[i:i+fftsize] += scipy.real(np.fft.irfft(X[n])) * w # overlap-add wsum[i:i+fftsize] += w ** 2. pos = wsum != 0 x[pos] /= wsum[pos] return x
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With