Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to perform FFT on WAV file data?

Tags:

c++

c

audio

fft

wav

I'm trying to analyse the audio quality of a file by detecting the highest frequency present (compressed audio will generally be filtered to something less than 20KHz).

I'm reading WAV file data using a class from the soundstretch library which returns PCM samples as floats, then performing FFT on those samples with the fftw3 library. Then for each frequency (rounded to the nearest KHz), I am totalling up the amplitude for that frequency.

So for a low quality file that doesn't contain frequencies above 16KHz, I would expect there to be none or very little amplitude above 16KHz, however I'm not getting the results I would expect. Below is my code:

#include <iostream>
#include <math.h>

#include <fftw3.h>
#include <soundtouch/SoundTouch.h>
#include "include/WavFile.h"

using namespace std;
using namespace soundtouch;

#define BUFF_SIZE           6720
#define MAX_FREQ            22//KHz

static float freqMagnitude[MAX_FREQ];

static void calculateFrequencies(fftw_complex *data, size_t len, int Fs) {
    for (int i = 0; i < len; i++) {
        int re, im;
        float freq, magnitude;
        int index;

        re = data[i][0];
        im = data[i][1];

        magnitude = sqrt(re * re + im * im);
        freq = i * Fs / len;

        index = freq / 1000;//round(freq);
        if (index <= MAX_FREQ) {
            freqMagnitude[index] += magnitude;
        }
    }
}

int main(int argc, char *argv[]) {
    if (argc < 2) {
        cout << "Incorrect args" << endl;
        return -1;
    }

    SAMPLETYPE sampleBuffer[BUFF_SIZE];
    WavInFile inFile(argv[1]);

    fftw_complex *in, *out;
    fftw_plan p;

    in = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * BUFF_SIZE);
    out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * BUFF_SIZE);

    p = fftw_plan_dft_1d(BUFF_SIZE, in, out, FFTW_FORWARD, FFTW_ESTIMATE);

    while (inFile.eof() == 0) {
        size_t samplesRead = inFile.read(sampleBuffer, BUFF_SIZE);

        for (int i = 0; i < BUFF_SIZE; i++) {
            in[i][0] = (double) sampleBuffer[i];
        }

        fftw_execute(p); /* repeat as needed */

        calculateFrequencies(out, samplesRead, inFile.getSampleRate());
    }

    for (int i = 0; i < MAX_FREQ; i += 2) {
        cout << i << "KHz magnitude: " << freqMagnitude[i] << std::endl;
    }

    fftw_destroy_plan(p);
    fftw_free(in);
    fftw_free(out);
 }

Can compile with: - (you'll need the soundtouch library and fftw3 library)

g++ -g -Wall MP3.cpp include/WavFile.cpp -lfftw3 -lm -lsoundtouch -I/usr/local/include -L/usr/local/lib

And here is the spectral analysis of the file I am testing on:

Spek screenshot

As you can see it's clipped at 16KHz, however my results are as follows:

0KHz magnitude: 4.61044e+07
2KHz magnitude: 5.26959e+06
4KHz magnitude: 4.68766e+06
6KHz magnitude: 4.12703e+06
8KHz magnitude: 12239.6
10KHz magnitude: 456
12KHz magnitude: 3
14KHz magnitude: 650468
16KHz magnitude: 1.83266e+06
18KHz magnitude: 1.40232e+06
20KHz magnitude: 1.1477e+06

I would expect there to be no amplitude over 16KHz, am I doing this right? Is my calculation for frequency correct? (I robbed it off another stackoverflow answer) Could it be something to do with there being 2 channels and I'm not separating the channels?

Cheers for any help guys.

like image 907
Jack Farrelly Avatar asked Sep 25 '22 08:09

Jack Farrelly


2 Answers

You are likely measuring the interleave difference between two stereo channels, which can include high frequencies due to unequal mix and pan. Try again with the channels separated or mixed down to mono, and use a smooth window function to reduce FFT aperture edge artifacts, which can also introduce a small amount of high frequency noise due to your rectangular window.

like image 84
hotpaw2 Avatar answered Sep 27 '22 21:09

hotpaw2


An FFT foundamental requirement is the equally time spacing of samples and their congruence.
In your case a stereo signal supply to the FFT algorithm double the number of samples uncorrelated between themself. What is mathematically seen is the natural phase difference between the two cannels, but, more important, two samples that, because unrelated, can have such a big difference to wrongly represent a square wave (in the time domain it would be represented by an extremely high signal slew rate).
As a solution you have to separate the two channels and perform FFT on one single series of samples, or two different FFT.
I don't think that there could be any aliasing problem because this is normally related to the sampling process and performed using analog filter having bandpass frequency < 1/2 the sampling frequence (Nyquist or antialias filter). If this filtering is missed there are almost no way to remove ghosts (alias spectrum) after.

like image 43
Frankie_C Avatar answered Sep 27 '22 20:09

Frankie_C