NAudio frequency band intensity

I have an audio player using NAudio and I would like to display a real-time intensity for each frequency band.

I have an event triggered for each block of 1024 samples:

public void Update(Complex[] fftResults)
{
   // ??
}

What I would like to have is an array of numbers indicating the intensity of each frequency band. Let's say I would like to divide the window into 16 bands.

For example when there are more bass frequencies it could look like this:

░░░░░░░░░░░░░░░░
▓▓▓░░░░░░░░░░░░░
▓▓▓░░░░░░░░░░░░░
▓▓▓▓░░░░░░░░░░░░
▓▓▓▓▓░░░░░░░░░░░
▓▓▓▓▓▓▓▓░░░▓░░▓░

What should I put into that event handler if this is possible with that data?

The incoming data (Complex[]) has already been transformed with the FFT. It is a stereo stream.

First try:

double[] bandIntensity = new double[16] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };

public void Update(Complex[] fftResults)
{
    // using half fftResults because the others are just mirrored
    int band = 0;
    for (int n = 0; n < fftResults.Length/2; n++)
    {
        band = (int)(.5 * n / fftResults.Length * bandIntensity.Length);
        bandIntensity[band] += Math.Sqrt(fftResults[n].X * fftResults[n].X + fftResults[n].Y * fftResults[n].Y);
        bandIntensity[band] /= 2;
    }
}

The above is doing something, but I think too much goes into the first two bands, and I'm playing Shakira, which does not have that much bass.

Thanks!

Marino Šimić asked Oct 12 '11 03:10

1 Answer

There are two separate issues that you probably want to address here:

(1) Window Function

You need to apply a window function to your data prior to the FFT, otherwise you will get spectral leakage, which will result in a very smeared spectrum. One unpleasant side effect of spectral leakage is that if you have any kind of significant DC (0 Hz) component, it will produce the kind of 1/f shape that you are seeing on your bar graph.
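
As a minimal sketch of the windowing step, assuming you can get at the raw time-domain samples of each 1024-sample block before they are handed to the FFT: the class and method names below (WindowSketch, Hann, ApplyHann) are hypothetical and not part of NAudio, and the Hann window is just one common choice.

```csharp
using System;

public static class WindowSketch
{
    // Hann window coefficient for sample n of a frame of frameSize samples.
    // The coefficient is 0 at both ends of the frame and 1 in the middle.
    public static double Hann(int n, int frameSize)
    {
        if (frameSize < 2) return 1.0; // a one-sample frame cannot be tapered
        return 0.5 * (1.0 - Math.Cos(2.0 * Math.PI * n / (frameSize - 1)));
    }

    // Returns a windowed copy of the frame; the input array is left untouched.
    public static float[] ApplyHann(float[] frame)
    {
        var windowed = new float[frame.Length];
        for (int n = 0; n < frame.Length; n++)
        {
            windowed[n] = (float)(frame[n] * Hann(n, frame.Length));
        }
        return windowed;
    }
}
```

The windowed samples, not the raw ones, would then be copied into the Complex[] buffer that is passed to the FFT.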

(2) Log amplitude/frequency axes

Human hearing is essentially logarithmic in both the intensity and frequency axes. Not only that, but speech and music tend to have more energy in the lower frequency part of the spectrum. To get a more pleasing and meaningful display of intensity versus frequency we usually make both the magnitude and frequency axes logarithmic. In the case of the magnitude axis this is normally taken care of by plotting dB re full scale, i.e.

magnitude_dB = 20 * log10(magnitude);  // magnitude = sqrt(re*re + im*im); equivalently 10 * log10(re*re + im*im)
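
A hedged C# sketch of the dB conversion for one FFT bin follows. For an amplitude magnitude sqrt(re² + im²) the factor is 20 (10 applies to power, i.e. re² + im²). The names DecibelSketch, ToDecibels and FloorDb are hypothetical, the -100 dB floor is an arbitrary choice, and the sketch assumes the FFT output is scaled so that a full-scale sine gives a magnitude of 1.0, which you should check for your own pipeline.

```csharp
using System;

public static class DecibelSketch
{
    // Floor used for silent bins so that log10(0) never yields -Infinity.
    // The value is an arbitrary assumption; pick whatever suits the display.
    public const double FloorDb = -100.0;

    // re, im: real and imaginary parts of one FFT bin
    // (the X and Y fields in the question's code).
    public static double ToDecibels(double re, double im)
    {
        double magnitude = Math.Sqrt(re * re + im * im);
        if (magnitude <= 0.0) return FloorDb;

        // 20 * log10 because magnitude is an amplitude, not a power.
        double db = 20.0 * Math.Log10(magnitude);
        return Math.Max(db, FloorDb);
    }
}
```

For a bar display, the result can be mapped linearly from the range [FloorDb, 0] onto the bar height.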

In the case of the frequency axis you will probably want to group your bins into bands, which might each be an octave (2:1 frequency range), or more commonly for higher resolution, third octave. So if you just want 10 "bars" then you might use the following octave bands:

   25 -    50 Hz
   50 -   100 Hz
  100 -   200 Hz
  200 -   400 Hz
  400 -   800 Hz
  800 -  1600 Hz
 1600 -  3200 Hz
 3200 -  6400 Hz
 6400 - 12800 Hz
12800 - 20000 Hz

(assuming you have a 44.1 kHz sample rate and an upper limit on your audio input hardware of 20 kHz).
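
The octave grouping could be sketched as below, using the fact that bin n of an N-point FFT sits at n * sampleRate / N Hz. The names OctaveBandSketch and GroupIntoBands are hypothetical, and averaging the bins in each band is only one option (summing or taking the peak are alternatives).

```csharp
using System;

public static class OctaveBandSketch
{
    // Lower edges of the ten bands listed above, plus the final upper edge (Hz).
    static readonly double[] Edges =
        { 25, 50, 100, 200, 400, 800, 1600, 3200, 6400, 12800, 20000 };

    // magnitudes: one value per FFT bin, for bins 0 .. fftLength/2 - 1.
    // Returns the mean magnitude of the bins that fall inside each band.
    public static double[] GroupIntoBands(double[] magnitudes, int fftLength, double sampleRate)
    {
        int bandCount = Edges.Length - 1;
        var sums = new double[bandCount];
        var counts = new int[bandCount];
        double binWidth = sampleRate / fftLength; // Hz covered by one bin

        for (int n = 0; n < magnitudes.Length; n++)
        {
            double freq = n * binWidth;
            for (int b = 0; b < bandCount; b++)
            {
                if (freq >= Edges[b] && freq < Edges[b + 1])
                {
                    sums[b] += magnitudes[n];
                    counts[b]++;
                    break;
                }
            }
        }

        // Turn sums into means; bands that received no bins stay at 0.
        for (int b = 0; b < bandCount; b++)
        {
            if (counts[b] > 0) sums[b] /= counts[b];
        }
        return sums;
    }
}
```

With a 1024-point FFT at 44.1 kHz each bin is about 43 Hz wide, so the two lowest bands receive only one bin each; a longer FFT would give finer resolution at the bass end.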

Note that while having a magnitude (dB) intensity scale is pretty much mandatory for this kind of application, the log frequency axis is less critical, so you could try with your existing linear binning for now, and just see what effect you get from applying a window function in the time domain (assuming you don't already have one) and converting the magnitude scale to dB.

Paul R answered Oct 05 '22 23:10