Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fourier Transform with Matlab

Tags:

matlab

fft

My question is similar to, but more general than this post, and I think there is a mistake in there regarding normalisation, with the newest version of Matlab (2015) anyway. I hesitated to post this on CodeReview SE, if you think it would be more appropriate let me know in the comments.

I would like to validate the following code of a Fourier transform using Matlab's fft, because I have found conflicting sources of information on the web, including in the Matlab help itself, and I have been unable to verify Parseval's theorem with certain such "recipes" (including with answers coming from the MathWorks team, see below), especially those extracting single-sided spectra for real inputs.

For instance, the amplitude-doubling commonly found online to account for the symmetric spectra of real-valued signals when extracting positive frequencies only seems to be wrong (Parseval's theorem fails), and instead it seems to be necessary to use a square-root of two coefficient in Matlab (I don't know why). Some people also seem to normalise the DFT coefficients directly like Y = fft(X)/L, but I think this is confusing and should be discouraged; the amplitude is defined as the modulus of the complex DFT coefficient divided by the signal length, the coefficients themselves should not be divided. Once validated I intend to post this code as a gist on GitHub.

function [frq,amp,phi] = fourier_transform( time, vals )
% FOURIER_TRANSFORM computes the Fast Fourier Transform of a given time-series.
% 
% [freq,amp,phi] = fourier_transform(time,vals)
%
% Inputs:
%
%   time  - Nx1 column vector of equally-spaced timepoints (if not, input will be resampled).
%   vals  - NxM matrix of real- or complex-valued observations at previous timepoints.
%
% Outputs:
%
%   frq   - column vector of frequencies in Hz
%   amp   - corresponding matrix of amplitudes for each frequency (row) and signal (column)
%   phi   - corresponding unwrapped phase for each frequency (row) and signal (column)
%
% Note:
%   To compute the power from the output amplitude, you need to multiply by the number of timepoints:
%       power = numel(time) * amp.^2;
%
% References:
%   https://en.wikipedia.org/wiki/Discrete_Fourier_transform

    % make sure input time-series is uniformly sampled
    assert( iscolumn(time), 'Input time should be a column vector.' );
    assert( ismatrix(vals) && size(vals,1) == numel(time), 'Input values should be a matrix with same number of rows than of timepoints.' );

    if std(diff(time)) > 1e-6
        warning('Input time-course does not appear to be uniformly sampled. Resampling before continuing.');
        [vals,time] = resample(vals,time);
    end

    % sampling information
    nt = numel(time);
    dt = time(2)-time(1);
    fs = 1/dt;
    df = fs/nt;

    % complex spectrum coefficients
    coef = fft(vals);

    % real input
    if isreal(vals)

        % extract one-sided spectrum (spectrum is symmetric)
        nfft = floor( nt/2 + 1 ); % eg 8 -> 5, and 7 -> 4
        coef = coef( 1:nfft, : );
        frq  = (0:nfft-1)*df;

        % correct amplitude values to account for previous extraction
        fac = sqrt(2);
        amp = fac*abs(coef)/nt;
        amp(1,:) = amp(1,:)/fac; % .. except for the DC component
        if mod(nt,2) == 0
            amp(end,:) = amp(end,:)/fac; % .. and for the Nyquist frequency (only if nt is even)
        end

    % complex input
    else

        % shift the spectrum to center frequencies around 0
        coef = fftshift( coef );
        frq  = fftshift( (0:nt-1)*df );
        amp  = abs(coef)/nt;

    end

    % make sure frq is a column vector and compute phases
    frq = frq(:);
    phi = unwrap(angle(coef));

end

Example use

>> fs=1e3; t=transpose(0:1/fs:10); nt=numel(t); X=rand(nt,1);
>> [frq,amp,phi] = fourier_transform( t, X ); 
>> sum( abs(X).^2 ) - nt*sum( amp.^2 ) % Parseval's theorem

ans =

  -2.7285e-11

Wrong example 1

From Matlab's own example, and posted on SO:

>> Fs = 1000;                    % Sampling frequency
T = 1/Fs;                     % Sample time
L = 1000;                     % Length of signal
t = (0:L-1)*T;                % Time vector
x = 0.7*sin(2*pi*50*t) + sin(2*pi*120*t);
y = x + 2*randn(size(t));     % Sinusoids plus noise
NFFT = 2^nextpow2(L); % Next power of 2 from length of y
Y = fft(y,NFFT)/L;
f = Fs/2*linspace(0,1,NFFT/2+1);
>> sum(abs(y).^2) - NFFT*sum(abs(Y).^2) % Parseval's theorem

ans =

 -220.4804

Problem and solution:

Comes from the normalisation in the line Y = fft(y,NFFT)/L. This should be instead:

>> Y = fft(y,NFFT);
Ya = abs(Y)/NFFT; % correctly normalised amplitudes
sum(abs(y).^2) - NFFT*sum(Ya.^2) % Parseval's theorem

Wrong example 2

From the MathWorks team's own clarification post:

>> Fs = 1000;                    % Sampling frequency
T = 1/Fs;                     % Sample time
L = 1000;                     % Length of signal
t = (0:L-1)*T;                % Time vector
% Sum of a 50 Hz sinusoid and a 120 Hz sinusoid
x = 0.7*sin(2*pi*50*t) + sin(2*pi*120*t);
y = x + 2*randn(size(t));     % Sinusoids plus noise

NFFT = 2^nextpow2(L); % Next power of 2 from length of y
Y = fft(y,NFFT)/L;
f = Fs/2*linspace(0,1,NFFT/2+1);

NumUniquePts = ceil((NFFT+1)/2);
Y=Y(1:NumUniquePts);
MX=2*abs(Y);
MX(1)=MX(1)/2; % DC component
if ~rem(NFFT,2)  % when NFFT is even, Y(1+Y/2) is the Nyquist frequency component of x, and needs to make sure it is unique.
    MX(length(MX))=MX(length(MX))/2;
end

>> sum( abs(y).^2 ) - NFFT*sum( MX.^2 )

ans =

  -5.3812e+03

Problem and solution:

Normalisation again. Replace Y = fft(y,NFFT)/L; by Y = fft(y,NFFT), and supposedly MX=2*abs(Y); by MX=2*abs(Y)/NFFT;. But here the amplitude doubling problem appears; the correction factor seems to be sqrt(2) and not 2.

Wrong example 3

Found as an answer on MatlabCentral:

>> Fs = 1000;                    % Sampling frequency
T = 1/Fs;                     % Sample time
L = 1000;                     % Length of signal
t = (0:L-1)*T;                % Time vector
x = 0.7*sin(2*pi*Fs/8*t) + sin(2*pi*Fs/4*t); 
NFFT = 2^nextpow2(L); % Next power of 2 from length of y
Y = fft(x,NFFT)/L;
f = Fs/2*linspace(0,1,NFFT/2+1);
>> sum( abs(x).^2 ) - NFFT*sum( abs(Y).^2 )

ans =

  -36.1891

Problem and solution:

As in the first example, normalisation problem. Write instead:

Y  = fft(x,NFFT);
Ya = abs(Y)/NFFT;
sum( abs(x).^2 ) - NFFT*sum( abs(Ya).^2 )
like image 247
Jonathan H Avatar asked Feb 03 '16 12:02

Jonathan H


1 Answers

TL;DR (summary)

It is very hard to find online examples of fft usage with Matlab that normalise the amplitude/power values correctly (e.g. as can be verified with Parseval's theorem). This is crucial if you want to compare spectra between signals with different lengths. There is also an additional problem with real-valued signals because in that case the spectrum is often computed for positive frequency only, and therefore amplitude or power values need to be scaled to account for frequency-folding. Following the post and answers below, here is a gist which I think scales the coeffcients correctly and consistently for real- and complex-valued inputs.

The take home messages are:

  • Do NOT normalise the DFT coefficients directly (eg don't write Y = fft(x)/L);
  • If you use an n-points transform (eg fft(x,nfft)), then the normaliser is nfft and not numel(x);
  • If you extract a single-sided spectrum, you need to adjust amplitude/power values depending on which correspond to conjugate pairs DFT coefficients;
  • If you extract a single-sided spectrum, you should compute the amplitude and power separately (ie don't compute the power from the amplitudes).

Amplitude, power and single-sided spectra

As defined and explained on Wikipedia:

  • The DFT coefficients are complex and not normalised, while the formula for the inverse DFT carries a 1/N factor in front of the sum. This is natural in some sense, as moving in time-to-frequency direction can be seen as a projection onto a basis of (orthogonal) waves with different frequencies, whereas moving in frequency-to-time direction can be seen as a weighted superposition of waves.
  • In that superposition, the overall magnitude should be that of the original time-point (ie, it's an inversion), and the amplitude of each wave in that weighted average is simply the magnitude of the corresponding DFT coefficient divided by the number of waves |Xk| / N. Similarly, the power of each wave is |Xk|^2 / N. Matlab uses that normalisation too (well, FFTW does).
  • For real-valued inputs, the DFT coefficients are conjugate pairs for corresponding positive/negative frequencies, apart from the DC component (average term, frequency 0), and for the Nyquist frequency when the number of time-points is even. In practice, most applications avoid this redundancy by extracting the DFT coefficients only for positive frequencies. This leads to complications in the discrete values of the amplitude and power.
  • The amplitudes corresponding to paired DFT coefficients (all except the first one and the Nyquist frequency when it exists) can simply be doubled and the negative frequency then discarded. Same for the power.
  • Similarly for the power, but note that it is incorrect in that case to compute the discrete power values using the adjusted amplitude values. Indeed N * amp_adjusted[k]^2 = N * (2*|Xk|/N)^2 is not equal to 2*|Xk|^2 / N (this is where the square-root of two came from in the OP). Therefore it is necessary to compute independently the amplitude and power values from the DFT coefficients (another good reason not to scale them).

N-points Transform

Many of the examples online use an explicit N-points transform: Y = fft(x,NFFT) where NFFT is typically a power of 2, making the computation more efficient with FFTW.

The effective difference in that case (provided that NFFT >= N) is that x is padded with 0 at its end until it reaches the length of NFFT time-points. This means that the number of frequencies in the decomposition changes, and therefore the normalisation should be done relative to NFFT wave components, and not the original N time-points.

Hence, almost all of the examples found online are wrong in the way they normalise the coefficients. It should not be Y = fft(x,NFFT)/N, but Y = fft(x,NFFT)/NFFT -- yet another good reason to loose that habit of normalising complex coefficients.

Note that this makes no difference then to Parseval's equality, because the added terms in the time-domain are all zero, and therefore their contribution to the now-larger sum is zero too. In the frequency domain though, the added discrete frequencies will have a response to the original signal in general, which gives an intuitive sense of why the obtained coefficients can be quite different indeed between the padded and unpadded transforms.

Code

The code in the OP is therefore incorrect, and instead it appears to be necessary to output both the amplitude and power, as there is no general normalisation coefficient that could accommodate the complex and real cases, with even or odd number of time-points. You can find the Gist here.

like image 59
Jonathan H Avatar answered Nov 05 '22 07:11

Jonathan H