The Web Audio API has an analyser node which allows you to get FFT data on the audio you're working with and has byte and float ways of getting the data. The byte version makes a bit of sense, returning what looks like a normalized (depending on min and max decibel values) intensity spectrum with 0 being no component of the audio at a specific frequency bin and 255 being the max.
But I'd like a bit more detail than 8 bits. Using the float version, however, gives weird results.
const freqData = new Float32Array(analyser.frequencyBinCount);
analyser.getFloatFrequencyData(freqData);
This gives me values between -891.048828125 and 0. -891 shows up corresponding to silence, so it's somehow the minimum value while I'm guessing 0 is equivalent to the max value.
What's going on? Why is -891.048828125 significant at all? Why is a large negative number silence and zero the maximum? Am I using the wrong typed array, or is there a misconfiguration? Float64Array gives 0 values.
Since there seems to be zero documentation on what the data actually represents, I looked into the relevant source code of webkit: RealtimeAnalyser.cpp
Short answer: subtract analyser.minDecibels from every value of the Float32Array, then divide by (analyser.maxDecibels - analyser.minDecibels). That gives a 0 to 1 representation similar to getByteFrequencyData (multiply by 255 for the same range), just with more resolution. Values below minDecibels, such as the -891 for silence, still come out negative and need to be clamped to 0.
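A minimal sketch of that rescaling, following the rangeScaleFactor logic from the WebKit code quoted further down (it divides by the decibel range). The helper name scaleFloatFrequencyData and the sample values are my own, not part of the Web Audio API, and the clamping to 0..255 is an assumption made to mirror what the byte version returns.

```javascript
// Hypothetical helper (not part of the Web Audio API): rescales decibel values
// from getFloatFrequencyData into the 0..255 range used by getByteFrequencyData,
// but keeps the fractional part instead of rounding to a byte.
function scaleFloatFrequencyData(dbValues, minDecibels, maxDecibels) {
  // Same guard as the WebKit code: avoid dividing by zero when min equals max.
  const range = maxDecibels - minDecibels;
  const rangeScaleFactor = range === 0 ? 1 : 1 / range;
  const scaled = new Float32Array(dbValues.length);
  for (let i = 0; i < dbValues.length; i++) {
    const value = 255 * (dbValues[i] - minDecibels) * rangeScaleFactor;
    // Clamp (assumption) so that silence, i.e. very negative dB, maps to 0
    // and anything above maxDecibels maps to 255.
    scaled[i] = Math.min(255, Math.max(0, value));
  }
  return scaled;
}

// Sample values standing in for real analyser output (made up for illustration).
const sampleDb = new Float32Array([-891.048828125, -100, -65, -30, 0]);
const scaledSample = scaleFloatFrequencyData(sampleDb, -100, -30);
console.log(Array.from(scaledSample));
```

In a page you would pass the array filled by analyser.getFloatFrequencyData together with analyser.minDecibels and analyser.maxDecibels instead of the sample values.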
Long answer:
Both getByteFrequencyData and getFloatFrequencyData give you the magnitude in decibels. They are just scaled differently: getByteFrequencyData subtracts minDecibels and rescales the result to the 0 to 255 byte range, while getFloatFrequencyData returns the decibel value as-is.
Relevant code in webkit for getByteFrequencyData:
const double rangeScaleFactor = m_maxDecibels == m_minDecibels ? 1 : 1 / (m_maxDecibels - m_minDecibels);
float linearValue = source[i];
double dbMag = !linearValue ? minDecibels : AudioUtilities::linearToDecibels(linearValue);
// The range m_minDecibels to m_maxDecibels will be scaled to byte values from 0 to UCHAR_MAX.
double scaledValue = UCHAR_MAX * (dbMag - minDecibels) * rangeScaleFactor;
Relevant code in webkit for getFloatFrequencyData:
float linearValue = source[i];
double dbMag = !linearValue ? minDecibels : AudioUtilities::linearToDecibels(linearValue);
destination[i] = float(dbMag);
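To make the decibel conversion concrete, here is a small sketch. It assumes that AudioUtilities::linearToDecibels is the usual amplitude formula 20 * log10(linear); I have not quoted that function above, so treat the formula and the decibelsToLinear inverse as assumptions.

```javascript
// Assumed equivalent of AudioUtilities::linearToDecibels: 20 * log10(linear).
function linearToDecibels(linear) {
  return 20 * Math.log10(linear);
}

// Inverse (hypothetical helper): recover the linear magnitude from decibels.
function decibelsToLinear(decibels) {
  return Math.pow(10, decibels / 20);
}

console.log(linearToDecibels(1));    // a magnitude of 1 maps to 0 dB
console.log(linearToDecibels(0));    // -Infinity, hence the special case for zero above
console.log(decibelsToLinear(-100)); // approximately 1e-5
```

This is why the float data is negative for almost all real signals: magnitudes below 1 have a negative logarithm, and the quieter the bin, the more negative the value.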
So, to get positive values, you can simply subtract minDecibels yourself, which is exposed on the analyser node:
//The minimum power value in the scaling range for the FFT analysis data for conversion to unsigned byte values.
attribute double minDecibels;
Another detail is that by default, the analyser node does time smoothing, which can be disabled by setting smoothingTimeConstant to zero.
The default values in webkit are:
const double RealtimeAnalyser::DefaultSmoothingTimeConstant = 0.8;
const double RealtimeAnalyser::DefaultMinDecibels = -100;
const double RealtimeAnalyser::DefaultMaxDecibels = -30;
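For a feel of what the smoothing does, here is a rough sketch of exponential smoothing between two frames. It assumes the analyser blends the previous and current linear magnitudes as tau * previous + (1 - tau) * current, with tau being smoothingTimeConstant; that is my reading of how the node behaves rather than something shown in the code quoted above, and smoothMagnitudes is a made-up helper name.

```javascript
// Hypothetical helper illustrating exponential smoothing over time:
//   smoothed[k] = tau * previous[k] + (1 - tau) * current[k]
// where tau plays the role of smoothingTimeConstant (0 disables smoothing).
function smoothMagnitudes(previous, current, tau) {
  const smoothed = new Float32Array(current.length);
  for (let k = 0; k < current.length; k++) {
    smoothed[k] = tau * previous[k] + (1 - tau) * current[k];
  }
  return smoothed;
}

// Two made-up frames of linear magnitudes: silence followed by a signal.
const previousFrame = new Float32Array([0, 0]);
const currentFrame = new Float32Array([1, 0.5]);

// With tau = 0 the new frame passes through unchanged.
console.log(Array.from(smoothMagnitudes(previousFrame, currentFrame, 0)));
// With the default tau = 0.8 only about a fifth of the new frame shows up.
console.log(Array.from(smoothMagnitudes(previousFrame, currentFrame, 0.8)));
```

So with the default of 0.8 a sudden peak takes several frames to show at full height, which matters if you want per-frame accuracy rather than a visually smooth display.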
Sadly, even though the analyser node computes a complex FFT, it doesn't give access to the complex values, just their magnitudes.
Correct on both points in the previous answer and comments - the numbers are in decibels, so 0 is max and -infinity is min (absolute silence). -891.0... is, I believe, just a floating point conversion oddity.
You are correct in using a Float32Array. I found an interesting tutorial on using the Audio Data API which, while it is different from the Web Audio API, gave me some useful insight into what you are trying to do here. I had a quick peek to see why the numbers are negative and didn't notice anything obvious, but I wondered if these numbers might be in decibels (dB), which are commonly given as negative numbers with zero as the peak. The only problem with that theory is that -891 seems to be an extremely low value for dB.