I saw someone do this in a presentation but I'm having a hard time reproducing what he was able to do. Here's a slide from his presentation:
Pretty cool. He decomposed a dataset using FFT, then plotted the appropriate sine waves that the FFT specified.
So in an effort to recreate what he did, I created a series of points that correspond to the combination of 2 sine waves:
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
x = np.arange(0, 10, 0.01)
x2 = np.arange(0, 20, 0.02)
sin1 = np.sin(x)
sin2 = np.sin(x2)
x2 /= 2
sin3 = sin1 + sin2
plt.plot(x, sin3)
plt.show()
Now I want to decompose this wave (or rather, the wave that the points imply) back into the original 2 sine waves:
# goal: sin3 -> sin1, sin2
# sin3
array([ 0.00000000e+00, 2.99985000e-02, ... 3.68998236e-01])
# sin1
array([ 0. , 0.00999983, 0.01999867, ... -0.53560333])
# sin2
array([ 0. , 0.01999867, 0.03998933, ... 0.90460157])
I start by importing numpy
and getting the fft
of sin3
:
import numpy as np
fft3 = np.fft.fft(sin3)
ok, that's about as far as I get. Now I've got an array with complex numbers:
array([ 2.13316069e+02+0.00000000e+00j, 3.36520138e+02+4.05677438e+01j,...])
and if I naively plot it I see:
plt.plot(fft3)
plt.show()
Ok, not sure what to do with that.
I want to get from here to the datasets that look like sin1 and sin2:
plt.plot(sin1)
plt.show()
plt.plot(sin2)
plt.show()
I understand the real and imaginary part of the complex numbers in the fft3
dataset, I'm just not sure what to do with them to derive sin1
and sin2
datasets from it.
I know this has less to do with programming and more to do with math, but could anyone give me a hint here?
EDIT: update on Mark Snyder's answer:
Using Mark's code I was able to get what I expected and ended up with this method:
def decompose_fft(data: list, threshold: float = 0.0):
fft3 = np.fft.fft(data)
x = np.arange(0, 10, 10 / len(data))
freqs = np.fft.fftfreq(len(x), .01)
recomb = np.zeros((len(x),))
for i in range(len(fft3)):
if abs(fft3[i]) / len(x) > threshold:
sinewave = (
1
/ len(x)
* (
fft3[i].real
* np.cos(freqs[i] * 2 * np.pi * x)
- fft3[i].imag
* np.sin(freqs[i] * 2 * np.pi * x)))
recomb += sinewave
plt.plot(x, sinewave)
plt.show()
plt.plot(x, recomb, x, data)
plt.show()
later I'll make it return the recombined list of waves, but for now I'm getting an anomalie I don't quite understand. First of all I call it like this, simply passing in a dataset.
decompose_fft(sin3, threshold=0.0)
But looks great but I get this weird line at y=0.2
Does anyone know what this could be or what's causing it?
EDIT:
The above question has been answered by Mark in the comments, thanks!
The discrete Fourier transform gives you the coefficients of complex exponentials that, when summed together, produce the original discrete signal. In particular, the k'th Fourier coefficient gives you information about the amplitude of the sinusoid that has k cycles over the given number of samples.
Note that since your sines don't have integer numbers of cycles in 1000 samples, you actually will not be able to retrieve your original sine waves using an FFT. Instead you'll get a blend of many different sinusoids, including a constant component of ~.4.
You can plot the various component sinusoids and observe that their sum is the original signal using the following code:
freqs = np.fft.fftfreq(len(x),.01)
threshold = 0.0
recomb = np.zeros((len(x),))
for i in range(len(fft3)):
if abs(fft3[i])/(len(x)) > threshold:
recomb += 1/(len(x))*(fft3[i].real*np.cos(freqs[i]*2*np.pi*x)-fft3[i].imag*np.sin(freqs[i]*2*np.pi*x))
plt.plot(x,1/(len(x))*(fft3[i].real*np.cos(freqs[i]*2*np.pi*x)-fft3[i].imag*np.sin(freqs[i]*2*np.pi*x)))
plt.show()
plt.plot(x,recomb,x,sin3)
plt.show()
By changing threshold
, you can also choose to exclude sinusoids of low power and see how that affects the final reconstruction.
EDIT: There's a bit of a trap in the above code, although it's not wrong. It hides the inherent symmetry of the DFT for real signals, and plots each of the sinusoids twice at half of their true amplitude. This code is more performant and plots the sinusoids at their correct amplitude:
freqs = np.fft.fftfreq(len(x),.01)
threshold = 0.0
recomb = np.zeros((len(x),))
middle = len(x)//2 + 1
for i in range(middle):
if abs(fft3[i])/(len(x)) > threshold:
if i == 0:
coeff = 2
else:
coeff = 1
sinusoid = 1/(len(x)*coeff/2)*(abs(fft3[i])*np.cos(freqs[i]*2*np.pi*x+cmath.phase(fft3[i])))
recomb += sinusoid
plt.plot(x,sinusoid)
plt.show()
plt.plot(x,recomb,x,sin3)
plt.show()
If in the general case you know that the signal was composed of some subset of sinusoids with frequencies that may not line up correctly with the length of the signal, you may be able to identify the frequencies by either zero-padding or extending your signal. You can learn more about that here. If the signals are completely arbitrary and you're just interested in looking at component sinusoids, there's no need for that.
There are some issues with discrete Fourier transform that are not immediately obvious from playing with its continuous counterpart. For one thing, the periodicity of your input should match the range of your data, so it's going to be much easier if you use:
x = np.linspace(0, 4*np.pi, 200)
You can then follow your original idea:
sin1 = np.sin(x)
sin2 = np.sin(2*x)
sin3 = sin1 + sin2
fft3 = np.fft.fft(sin3)
Since in FFT sin
goes directly into the imaginary component, you can try plotting only the imaginary part:
plt.plot(fft3.imag)
plt.show()
What you should see will be peaks centered at x=2
and x=4
that correspond to the original sinusoidal components, which had frequencies of "2 per signal" (sin(x) from 0 to 4 pi) and "4 per signal" (sin(2x) from 0 to 4 pi).
To plot all individual components, you can go with:
for i in range(1,100):
plt.plot(x, fft3.imag[i] * np.sin(i*x)/100)
plt.show()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With