Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Recreating time series data using FFT results without using ifft

I analyzed the sunspots.dat data (below) using fft which is a classic example in this area. I obtained results from fft in real and imaginery parts. Then I tried to use these coefficients (first 20) to recreate the data following the formula for Fourier transform. Thinking real parts correspond to a_n and imaginery to b_n, I have

import numpy as np
from scipy import *
from matplotlib import pyplot as gplt
from scipy import fftpack

def f(Y,x):
    total = 0
    for i in range(20):
        total += Y.real[i]*np.cos(i*x) + Y.imag[i]*np.sin(i*x)
    return total

tempdata = np.loadtxt("sunspots.dat")

year=tempdata[:,0]
wolfer=tempdata[:,1]

Y=fft(wolfer)
n=len(Y)
print n

xs = linspace(0, 2*pi,1000)
gplt.plot(xs, [f(Y, x) for x in xs], '.')
gplt.show()       

For some reason however, my plot does not mirror the one generated by ifft (I use the same number of coefficients on both sides). What could be wrong ?

Data:

http://linuxgazette.net/115/misc/andreasen/sunspots.dat

like image 261
BBSysDyn Avatar asked Dec 15 '10 15:12

BBSysDyn


1 Answers

When you called fft(wolfer), you told the transform to assume a fundamental period equal to the length of the data. To reconstruct the data, you have to use basis functions of the same fundamental period = 2*pi/N. By the same token, your time index xs has to range over the time samples of the original signal.

Another mistake was in forgetting to do to the full complex multiplication. It's easier to think of this as Y[omega]*exp(1j*n*omega/N).

Here's the fixed code. Note I renamed i to ctr to avoid confusion with sqrt(-1), and n to N to follow the usual signal processing convention of using the lower case for a sample, and the upper case for total sample length. I also imported __future__ division to avoid confusion about integer division.

forgot to add earlier: Note that SciPy's fft doesn't divide by N after accumulating. I didn't divide this out before using Y[n]; you should if you want to get back the same numbers, rather than just seeing the same shape.

And finally, note that I am summing over the full range of frequency coefficients. When I plotted np.abs(Y), it looked like there were significant values in the upper frequencies, at least until sample 70 or so. I figured it would be easier to understand the result by summing over the full range, seeing the correct result, then paring back coefficients and seeing what happens.

from __future__ import division
import numpy as np
from scipy import *
from matplotlib import pyplot as gplt
from scipy import fftpack

def f(Y,x, N):
    total = 0
    for ctr in range(len(Y)):
        total += Y[ctr] * (np.cos(x*ctr*2*np.pi/N) + 1j*np.sin(x*ctr*2*np.pi/N))
    return real(total)

tempdata = np.loadtxt("sunspots.dat")

year=tempdata[:,0]
wolfer=tempdata[:,1]

Y=fft(wolfer)
N=len(Y)
print(N)

xs = range(N)
gplt.plot(xs, [f(Y, x, N) for x in xs])
gplt.show()
like image 75
mtrw Avatar answered Nov 14 '22 23:11

mtrw