
Understanding FFT in aurioTouch2

Tags:

ios

audio

fft

I've been looking at aurioTouch 2 from Apple's sample code (found here). Ultimately I want to analyze the frequencies myself, but for now I'm trying to understand some of what's going on here. My apologies if this is trivial; I'm just trying to understand some of the uncommented magic numbers floating around in the source. My main points of confusion right now are:

  1. Why do they zero out the Nyquist value in FFTBufferManager::ComputeFFT? Can this value really just be thrown away? (~line 112 of FFTBufferManager.cpp).
  2. They scale everything down by -128 dB, so I'm assuming that the results are in the range (-128, 0). However, later in aurioTouchAppDelegate.mm (~line 807), they convert this to a value between 0 and 1 by adding 80 and dividing by 64, then clamping to 0 and 1. Why the fuzziness? Also, am I right in assuming values will be in the vicinity of (-128, 0)?
Francisco Ryan Tolmasky I asked Jan 13 '12 03:01


1 Answer

Well, it's not trivial for me either, but this is how I understand it. If I've oversimplified, it is purely for my benefit; I don't mean to be patronising.

Zeroing the result corresponding to the Nyquist frequency:

I'm going to suppose we are computing the forward FFT of 1024 input samples. At 44100 Hz input this is usually true in my case (but it isn't what aurioTouch is doing, which I find a bit weird, but I'm no expert). It's easier for me to understand with specific values.

Given 1024 (n) input samples, arranged as needed: even indexes first, then odd indexes, i.e. { in[0], in[2], in[4], …, in[1], in[3], in[5], … } (use vDSP_ctoz() to order your input).
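To make the even/odd ordering concrete, here is a minimal plain C++ sketch of that rearrangement. It is only an illustration of the layout, not the Accelerate implementation: `SplitComplex` and `deinterleave` are hypothetical names invented for this example, and the input length is assumed to be even.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a split-complex buffer (not the Accelerate type).
struct SplitComplex {
    std::vector<float> real;  // receives the even-indexed input samples
    std::vector<float> imag;  // receives the odd-indexed input samples
};

// Copies in[0], in[2], ... into real and in[1], in[3], ... into imag.
// Assumption: in.size() is even.
SplitComplex deinterleave(const std::vector<float>& in) {
    const std::size_t half = in.size() / 2;
    SplitComplex out{std::vector<float>(half), std::vector<float>(half)};
    for (std::size_t i = 0; i < half; ++i) {
        out.real[i] = in[2 * i];
        out.imag[i] = in[2 * i + 1];
    }
    return out;
}
```

For 1024 input samples this yields two 512-element arrays, which is the shape the packed FFT result discussed next also has.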

The output of the FFT of 1024 (n) input samples is 513 ((n/2)+1) complex values, i.e. 513 real components and 513 imaginary components, a total of 1026 values.

However, imaginary[0] and imaginary[512] (n/2) are always, necessarily, zero. So by placing real[512] (the real component of the Nyquist frequency bin) at imaginary[0] and forgetting imaginary[512], which is always zero and can be inferred, the results are packed into a 1024 (n) length buffer.

So, for the returned results to be valid you must at least set imaginary[0] back to zero. If you require all 513 ((n/2)+1) frequency bins, you need to append another complex value to the result and set it thus:

unpackedVal = imaginary[0]
real[512]=unpackedVal, imaginary[512]=0
imaginary[0] = 0
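The three steps above can be sketched in plain C++ as follows. This is an illustrative sketch under the packing described in this answer, not code from aurioTouch: `Spectrum` and `unpackNyquist` are hypothetical names, and the inputs are assumed to be non-empty arrays of n/2 values each.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for an unpacked spectrum with (n/2)+1 bins.
struct Spectrum {
    std::vector<float> real;
    std::vector<float> imag;
};

// real and imag each hold n/2 values in the packed layout described above,
// where imag[0] carries the real part of the Nyquist bin.
// Assumption: both inputs are non-empty and the same length.
Spectrum unpackNyquist(const std::vector<float>& real,
                       const std::vector<float>& imag) {
    Spectrum out{real, imag};
    const float nyquistReal = out.imag[0];  // the packed value
    out.imag[0] = 0.0f;                     // DC bin has no imaginary part
    out.real.push_back(nyquistReal);        // new bin at index n/2
    out.imag.push_back(0.0f);               // Nyquist bin has no imaginary part
    return out;
}
```

With n = 1024, the inputs have 512 entries each and the result has 513, matching the (n/2)+1 count given earlier.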

In aurioTouch I always supposed they just don't bother. n/2 results is obviously more convenient to work with, and you can hardly tell from the visualizer: "Oh look, it's missing one magnitude at the Nyquist frequency".

The UsingFourierTransforms docs explain the packing.

NB the specific values 1024, 513, 512, etc. are examples, not the actual values of n, (n/2)+1, n/2 from aurioTouch.

They scale everything down by -128 dB:

Not quite. The range of the output values is relative to the number of input samples, so it has to be normalised. The scale is 1.0/(2*inNumberFrames).

After scaling, the range is -1.0 to +1.0. The magnitude of the complex vector is then taken (the phase is ignored), giving a scalar value between 0 and 1.0 for each frequency bin.

This value is then interpreted as a decibel value between -128 and 0.
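As a rough sketch of that chain (scale, take the magnitude, express it in decibels), something like the following plain C++ would do. It is not the aurioTouch code: `binsToDecibels` is a hypothetical name, and both the 20*log10 amplitude convention and the hard floor at -128 dB are assumptions made for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns one dB value per frequency bin, floored at -128 dB.
// Assumptions of this sketch: amplitude convention (20*log10) and a hard floor.
std::vector<float> binsToDecibels(const std::vector<float>& real,
                                  const std::vector<float>& imag,
                                  std::size_t numFrames) {
    // Normalise by the number of input samples, as described above.
    const float scale = 1.0f / (2.0f * static_cast<float>(numFrames));
    const float floorDb = -128.0f;
    std::vector<float> db(real.size());
    for (std::size_t i = 0; i < real.size(); ++i) {
        const float re = real[i] * scale;
        const float im = imag[i] * scale;
        const float mag = std::sqrt(re * re + im * im);  // phase is discarded
        // log10(0) is -infinity, so fall back to the floor for silent bins.
        db[i] = (mag > 0.0f) ? std::max(floorDb, 20.0f * std::log10(mag))
                             : floorDb;
    }
    return db;
}
```

Under these assumptions a bin whose scaled magnitude is 1.0 comes out at 0 dB and a silent bin at -128 dB, so the results do land in the range the question guessed.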

The drawing stuff (+80, /64, *120): I'm not sure. I may be completely wrong, or it may be artistic license?
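Read purely as arithmetic, the mapping described in the question (add 80, divide by 64, clamp to 0 and 1) can be written out like this. `decibelsToUnit` is a hypothetical name for this sketch, and it only restates the numbers from the question; it does not explain why those numbers were chosen.

```cpp
#include <algorithm>

// Maps a dB value to [0, 1]: add 80, divide by 64, then clamp.
float decibelsToUnit(float db) {
    const float v = (db + 80.0f) / 64.0f;
    return std::min(1.0f, std::max(0.0f, v));
}
```

Whatever the intent, the effect is that -80 dB maps to 0 and -16 dB maps to 1, so anything quieter than -80 dB or louder than -16 dB saturates at the ends of the display range.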

hooleyhoop answered Nov 18 '22 10:11