
Am I using the Fourier transformation the right way?

I am wondering if I am using the Fourier transform in MATLAB the right way. I want to get the average amplitude of every frequency in a song. For testing purposes I am using a free MP3 download of Beethoven's "Für Elise", which I converted to an 8 kHz mono WAV file using Audacity.

My MATLAB code is as follows:

clear all % be careful

% load file
% Für Elise Recording by Valentina Lisitsa 
% from http://www.forelise.com/recordings/valentina_lisitsa
% Converted to 8 kHz mono using Audacity
allSamples = wavread('fur_elise_valentina_lisitsa_8khz_mono.wav');


% apply windowing function
w = hanning(length(allSamples));
allSamples = allSamples.*w;


% FFT needs input of length 2^x
NFFT = 2^nextpow2(length(allSamples))


% Apply FFT
fftBuckets=fft(allSamples, NFFT); 
fftBuckets=fftBuckets(1:(NFFT/2+1)); % because of symmetric/mirrored values


% calculate single side amplitude spectrum, 
% normalize by dividing by NFFT to get the 
% popular way of displaying amplitudes
% in a range of 0 to 1
fftBuckets = (2*abs(fftBuckets))/NFFT; 

% plot it: max possible frequency is 4000, because sampling rate of input
% is 8000 Hz
x = linspace(1,4000,length(fftBuckets));
bar(x,fftBuckets);

The output then looks like this: [image]

  1. Can somebody please tell me if my code is correct? I am especially wondering about the peaks around 0.
  2. For normalizing, do I have to divide by NFFT or length(allSamples)?
  3. For me this doesn't really look like a bar chart, but I guess this is due to the many values I am plotting?

Thanks for any hints!

stefan.at.wpf asked Jul 03 '12 10:07



1 Answer

  1. Depends on your definition of "correct". This is doing what you intended, I think, but it's probably not very useful. I would suggest using a 2D spectrogram instead, as you'll get time-localized information on frequency content.

  2. There is no single correct way of normalising FFT output; there are several different conventions (see e.g. the discussion here). The comment in your code says that you want a range of 0 to 1; if your input values are in the range -1 to 1, then dividing by the number of bins will achieve that.

  3. Well, exactly!
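
To make point 2 concrete, here is a minimal sketch that compares normalisation conventions on a synthetic tone of known amplitude. It is not taken from the question: the test tone (`A`, `f0`), the sample count `N` and the three variable names `ampByNFFT`, `ampByN` and `ampByWin` are assumptions chosen for illustration. It also relies on `hanning`, as the question's code does. With zero-padding and a window applied, dividing by `NFFT` or by the number of samples both under-report the tone's amplitude; dividing by the sum of the window coefficients should bring the peak back close to `A`.

```matlab
fs = 8000;              % sampling rate in Hz, as in the question
N  = 6000;              % number of samples (hypothetical, not a power of two)
t  = (0:N-1)' / fs;     % time axis as a column vector
A  = 0.5;               % known amplitude of the test tone (hypothetical)
f0 = 1000;              % test tone frequency in Hz (hypothetical)
x  = A * sin(2*pi*f0*t);

w    = hanning(N);          % same window function as in the question
NFFT = 2^nextpow2(N);       % zero-padding is optional; fft accepts any length

X = fft(x .* w, NFFT);
X = X(1:NFFT/2+1);          % keep the non-negative frequencies only

ampByNFFT = 2*abs(X) / NFFT;    % convention used in the question
ampByN    = 2*abs(X) / N;       % divide by the number of actual samples
ampByWin  = 2*abs(X) / sum(w);  % additionally compensates for the window

% Frequency axis: bin k (zero-based) sits at k*fs/NFFT, so it starts at 0 Hz
f = (0:NFFT/2)' * fs / NFFT;

[peakByWin, idx] = max(ampByWin);
fprintf('peak at %.1f Hz\n', f(idx));
fprintf('by NFFT: %.4f, by N: %.4f, by sum(w): %.4f\n', ...
        max(ampByNFFT), max(ampByN), peakByWin);
```

Which convention is appropriate depends on what the numbers are meant to represent; this sketch only shows that the choice changes the scale. Note also that the factor of 2 strictly applies to every bin except DC and Nyquist, and that the frequency axis starts at 0 Hz rather than 1 Hz.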

I would also recommend plotting the y-axis on a logarithmic scale (in decibels), as that's roughly how the human ear interprets loudness.
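
The two suggestions above (a decibel scale and a spectrogram) could be sketched roughly as follows. The input here is a hypothetical two-tone stand-in rather than the recording from the question, and the window length of 512 samples with 50% overlap is an arbitrary choice. `spectrogram` and `hanning` are assumed to be available via the Signal Processing Toolbox.

```matlab
fs = 8000;                          % sampling rate in Hz, as in the question
t  = (0:2*fs-1)' / fs;              % two seconds of samples
% Hypothetical stand-in signal: 440 Hz for one second, then 880 Hz
x  = [0.5*sin(2*pi*440*t(1:fs)); 0.5*sin(2*pi*880*t(fs+1:end))];

N = length(x);
w = hanning(N);
X = fft(x .* w);                    % no zero-padding needed
X = X(1:floor(N/2)+1);              % non-negative frequencies only

amp   = 2*abs(X) / sum(w);          % window-compensated amplitude
ampdB = 20*log10(amp + eps);        % eps avoids taking the log of zero
f     = (0:floor(N/2))' * fs / N;   % frequency axis starting at 0 Hz

figure;
plot(f, ampdB);
xlabel('Frequency (Hz)');
ylabel('Amplitude (dB)');

% Time-localized view: 512-sample Hann windows with 256 samples of overlap
figure;
spectrogram(x, hanning(512), 256, 512, fs, 'yaxis');
```

In the whole-signal spectrum both tones show up as peaks with no indication of when they occur, while the spectrogram shows the switch from one tone to the other halfway through.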

Oliver Charlesworth answered Sep 28 '22 01:09
