I have an audio player using NAudio and I would like to display a real-time intensity for each frequency band.
I have an event triggered for each block of 1024 samples:
public void Update(Complex[] fftResults)
{
    // ??
}
What I would like to have is an array of numbers indicating the intensity of each frequency band. Let's say I would like to divide the window into 16 bands.
For example when there are more bass frequencies it could look like this:
░░░░░░░░░░░░░░░░
▓▓▓░░░░░░░░░░░░░
▓▓▓░░░░░░░░░░░░░
▓▓▓▓░░░░░░░░░░░░
▓▓▓▓▓░░░░░░░░░░░
▓▓▓▓▓▓▓▓░░░▓░░▓░
What should I put in that event handler, assuming this is possible with that data?
The incoming data (Complex[]) has already been transformed with the FFT. It is a stereo stream.
First try:
double[] bandIntensity = new double[16] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
public void Update(Complex[] fftResults)
{
    // using half fftResults because the others are just mirrored
    int band = 0;
    for (int n = 0; n < fftResults.Length/2; n++)
    {
        band = (int)(.5 * n / fftResults.Length * bandIntensity.Length);
        bandIntensity[band] += Math.Sqrt(fftResults[n].X * fftResults[n].X + fftResults[n].Y * fftResults[n].Y);
        bandIntensity[band] /= 2;
    }
}
The above does something, but I think too much goes into the first two bands, and I'm playing Shakira, which does not have that much bass.
Thanks!
There are two separate issues that you probably want to address here:
(1) Window Function
You need to apply a window function to your data prior to the FFT, otherwise you will get spectral leakage, which will result in a very smeared spectrum. One unpleasant side effect of spectral leakage is that if you have any kind of significant DC (0 Hz) component then this will result in the kind of 1/f shape that you are seeing on your bar graph.
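As a rough illustration of the windowing step, here is a minimal sketch of a Hann window applied to one block of time-domain samples before it goes to the FFT. The class and method names (Windowing, Hann, ApplyHann) are hypothetical, and the sketch assumes you have access to the raw samples as a float[] of mono values; since your handler only receives data that has already been transformed, the window would have to be applied wherever the samples are collected for the FFT.

```csharp
using System;

public static class Windowing
{
    // Hann window coefficient for sample n of a frame of frameSize samples.
    // The value is 0 at both ends of the frame and close to 1 in the middle.
    public static double Hann(int n, int frameSize)
    {
        return 0.5 * (1.0 - Math.Cos(2.0 * Math.PI * n / (frameSize - 1)));
    }

    // Multiplies each time-domain sample by its window coefficient, in place.
    // Assumption: 'frame' holds one block of mono samples (e.g. 1024 of them).
    public static void ApplyHann(float[] frame)
    {
        for (int n = 0; n < frame.Length; n++)
        {
            frame[n] = (float)(frame[n] * Hann(n, frame.Length));
        }
    }
}
```

You would call Windowing.ApplyHann(samples) on each 1024-sample block and then run the FFT on the result. Other windows (Hamming, Blackman-Harris) would slot in the same way.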
(2) Log amplitude/frequency axes
Human hearing is essentially logarithmic in both the intensity and frequency axes. Not only that, but speech and music tend to have more energy in the lower frequency part of the spectrum. To get a more pleasing and meaningful display of intensity versus frequency we usually make both the magnitude and frequency axes logarithmic. In the case of the magnitude axis this is normally taken care of by plotting dB re full scale, i.e.
magnitude_dB = 20 * log10(magnitude);
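A minimal sketch of that conversion for a single FFT bin might look like the following. The names (Decibels, FromBin, floorDb) are hypothetical. It assumes the FFT output is already scaled so that a full-scale signal gives a magnitude of about 1, and it treats sqrt(re² + im²) as an amplitude, for which the factor is 20; a factor of 10 applies when taking the log of power (squared magnitude). The floor is only there to avoid log10(0) on silent bins.

```csharp
using System;

public static class Decibels
{
    // Converts one complex FFT bin (real part re, imaginary part im) to dB.
    // Values below floorDb, including exact silence, are clamped to floorDb.
    public static double FromBin(double re, double im, double floorDb = -90.0)
    {
        double magnitude = Math.Sqrt(re * re + im * im);
        if (magnitude <= 0.0)
        {
            return floorDb; // log10(0) is -infinity, so clamp instead
        }

        double db = 20.0 * Math.Log10(magnitude); // amplitude ratio in dB
        return Math.Max(db, floorDb);
    }
}
```

In your handler this would be called as Decibels.FromBin(fftResults[n].X, fftResults[n].Y). For a bar display you can then map the range from floorDb to 0 dB onto the bar height.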
In the case of the frequency axis you will probably want to group your bins into bands, which might each be an octave (2:1 frequency range), or more commonly for higher resolution, third octave. So if you just want 10 "bars" then you might use the following octave bands:
25 - 50 Hz
50 - 100 Hz
100 - 200 Hz
200 - 400 Hz
400 - 800 Hz
800 - 1600 Hz
1600 - 3200 Hz
3200 - 6400 Hz
6400 - 12800 Hz
12800 - 20000 Hz
(assuming you have a 44.1 kHz sample rate and an upper limit on your audio input hardware of 20 kHz).
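To show how FFT bins could be grouped into the octave bands listed above, here is a small sketch. The names (OctaveBands, Edges, Group) are hypothetical, and averaging the magnitudes within each band is just one possible choice (you could also sum them or take the maximum). It assumes 'magnitudes' holds the magnitudes of the first half of the FFT output, i.e. the non-mirrored bins.

```csharp
using System;

public static class OctaveBands
{
    // Band edges in Hz, taken from the octave list above.
    public static readonly double[] Edges =
        { 25, 50, 100, 200, 400, 800, 1600, 3200, 6400, 12800, 20000 };

    // Averages the bin magnitudes that fall into each band.
    // fftLength is the full FFT size (e.g. 1024); sampleRate is in Hz.
    public static double[] Group(double[] magnitudes, int fftLength, double sampleRate)
    {
        int bandCount = Edges.Length - 1;
        double[] result = new double[bandCount];
        int[] counts = new int[bandCount];
        double binWidth = sampleRate / fftLength; // Hz covered by one bin

        for (int bin = 0; bin < magnitudes.Length; bin++)
        {
            double freq = bin * binWidth;
            for (int band = 0; band < bandCount; band++)
            {
                if (freq >= Edges[band] && freq < Edges[band + 1])
                {
                    result[band] += magnitudes[bin];
                    counts[band]++;
                    break;
                }
            }
        }

        // Turn sums into averages; bands with no bins stay at 0.
        for (int band = 0; band < bandCount; band++)
        {
            if (counts[band] > 0)
            {
                result[band] /= counts[band];
            }
        }
        return result;
    }
}
```

Note that with a 1024-point FFT at 44.1 kHz each bin is about 43 Hz wide, so the two lowest bands contain only one bin each. If you want finer resolution in the bass you would need a longer FFT.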
Note that while having a magnitude (dB) intensity scale is pretty much mandatory for this kind of application, the log frequency axis is less critical, so you could try with your existing linear binning for now, and just see what effect you get from applying a window function in the time domain (assuming you don't already have one) and converting the magnitude scale to dB.