
Android 2.3 Visualizer - Trouble understanding getFft()

First time here so sorry in advance for any butchered formatting.

I am completely new to DSP, so I have only a very general understanding of the Fourier transform. I am trying to build a visualizer app for Android SDK 9, which includes a Visualizer class in android.media.audiofx: http://developer.android.com/reference/android/media/audiofx/Visualizer.html

The javadoc for getFft(), the method I am using, states:

"Returns a frequency capture of currently playing audio content. The capture is a 8-bit magnitude FFT. Note that the size of the FFT is half of the specified capture size but both sides of the spectrum are returned yielding in a number of bytes equal to the capture size."

First of all, what does "both sides of the spectrum" mean? How does this output differ from a standard FFT?

Here is some sample output of the byte array. To keep it simple, getFft() was given 124 points and I grabbed the magnitudes of the first 31 bins:

{123, -2, -23, -3, 6, -16, 15, -10, -8, -12, 9, -9, 17, -6, -18, -22, -8, 4, -5, -2, 10, -3, -11, 3, -4, -11, -8, 15, 16, 11, -12, 12}

Any help or explanation would be greatly appreciated!

Edit: After staring at a bunch of graphs, it looks like part of my problem is that Google does not specify what unit is being used. Almost all other measurements are done in mHz; would it be fair to assume that the FFT output is also in mHz? Is there a place where I can see the source code of the Visualizer class so I can figure out what the hell is actually going on under the hood?

I went ahead and grabbed all of the output of getFft():

93, -2, -28, -16, -21, 19, 44, -16, 3, 16, -9, -4, 0, -2, 21, 16, -3, 1, 2, 4, -3, 5, 5, 10, 6, 4, -9, 7, -2, -1, 2, 11, -1, 5, -8, -2, -1, 4, -5, 5, 1, 3, -6, -1, -5, 0, 0, 0, -3, 5, -4, -6, -2, -2, -1, 2, -3, 0, 1, -3, -4, -3, 1, 1, 0, -2, -1, -1, 0, -5, 0, 4, -1, 1, 1, -1, 1, -1, -3, 2, 1, 2, -2, 1, 0, -1, -2, 2, -3, 4, -2, -2, 0, 1, -4, 0, -4, 2, -1, 0, -3, -1, -1, -1, -5, 2, -2, -2, 0, -3, -2, 1, -5, -2, 0, 0, 0, -2, -2, -1, -1, -1, -2, 0, 3, -3, -1, 0

So if I understand this correctly, my output here should be from -N to 0 to N. -N to 0 should look just like 0 to N. But when I look at these amplitudes, I don't see any mirrored data. Google seems to indicate that the output should be from 0 to N just on both sides of the spectrum. So I should be able to take the data from (output.length-1)/2 to output.length-1. The negative amplitudes are moving faster than the sample rate and the positive amplitudes are moving slower than the sample rate. Did I understand this correctly?

ebolyen asked Jan 18 '11 04:01


2 Answers

In case it helps anyone, I've created a Visualizer which takes the output from the MediaPlayer and displays a visualization. It works with both normal waveform and FFT data:

https://github.com/felixpalmer/android-visualizer

It includes code for converting the output of getFft() into something visually meaningful.

pheelicks answered Nov 17 '22 09:11


The frequency at FFT output sample k is given by:

Fk = k * Fs / N,    k = 0,1,...,N-1 

where

  • Fs is the sampling frequency of the time series input
  • N is the number of samples used to compute the FFT
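As a rough illustration of the formula above, here is a minimal Java sketch. The class and method names (FftBins, binFrequencyHz) are made up for this example and are not part of the Android API.

```java
/** Hypothetical helper (not part of any Android API): bin index to frequency. */
final class FftBins {
    private FftBins() {}

    /**
     * Returns the frequency of FFT output sample k, i.e. Fk = k * Fs / N.
     *
     * @param k            bin index, 0 <= k < fftSize
     * @param sampleRateHz sampling frequency Fs of the input, in Hz
     * @param fftSize      number of samples N used to compute the FFT
     */
    static double binFrequencyHz(int k, double sampleRateHz, int fftSize) {
        if (fftSize <= 0 || k < 0 || k >= fftSize) {
            throw new IllegalArgumentException("need 0 <= k < fftSize");
        }
        return k * sampleRateHz / fftSize;
    }
}
```

For example, with Fs = 44100 Hz and N = 1024, FftBins.binFrequencyHz(512, 44100.0, 1024) is 22050 Hz, which is exactly Fs/2.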

The "two sides of the spectrum" are the positive and negative frequencies in the output of the FFT. The FFT forces the frequency output to be periodic with a period of Fs. If you look at the FFT output, it covers the frequencies from 0 to Fs. It is often advantageous to view the spectrum over the range -0.5*Fs to 0.5*Fs instead, by shifting the part of the FFT output that covers 0.5*Fs to Fs down to -0.5*Fs to 0; the two are equal because of the periodicity.

For real-valued signals, like the ones you have in audio processing, the negative-frequency output will be a mirror image of the positive frequencies (each value is the complex conjugate of its counterpart, so the magnitudes are equal). Because of this, often only one side of the spectrum is used when analyzing real signals.
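The mirror symmetry for real input can be checked numerically. The sketch below uses a plain O(N^2) DFT, not an FFT and not anything from the Visualizer class, purely to show that bin k and bin N-k have the same magnitude. The class name MirrorDemo is made up for this example.

```java
/** Illustrative only: naive DFT used to check the mirror symmetry of real input. */
final class MirrorDemo {
    private MirrorDemo() {}

    /** Returns |X[k]| for k = 0..N-1, where X is the DFT of the real signal x. */
    static double[] dftMagnitudes(double[] x) {
        int n = x.length;
        double[] mag = new double[n];
        for (int k = 0; k < n; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                // X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/N)
                double angle = -2.0 * Math.PI * k * t / n;
                re += x[t] * Math.cos(angle);
                im += x[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }
}
```

For any real array x of length N, dftMagnitudes(x)[k] and dftMagnitudes(x)[N - k] agree up to floating-point rounding for k = 1..N-1, which is why the upper half of the output carries no extra information.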

Another important point is the significance of 0.5*Fs, which is known as the Nyquist frequency. A sampled signal can only accurately represent frequencies up to the Nyquist frequency, and anything above it will be aliased (folded) back onto the spectrum, causing distortion.

So really all you should worry about for visualization purposes are the FFT output samples corresponding to the range of frequencies from 0 to Fs/2 since those are the meaningful samples for a real signal with sampling rate Fs.
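For the getFft() byte array specifically, here is a hedged sketch of turning the raw bytes into one magnitude per bin from 0 to Fs/2. It assumes the layout described in later versions of the Visualizer reference, which the javadoc quoted in the question does not spell out: byte 0 is the real part at 0 Hz, byte 1 is the real part at Fs/2, and the remaining bytes are interleaved real/imaginary pairs for bins 1 to n/2 - 1. If that assumption holds, the signed values in the question are real and imaginary components, not magnitudes. The class name VisualizerFft is made up for this example.

```java
/** Sketch only: assumes getFft() returns interleaved real/imaginary bytes. */
final class VisualizerFft {
    private VisualizerFft() {}

    /**
     * Converts a getFft()-style byte array of even length n into n/2 + 1 magnitudes.
     * Assumed layout: [Re(0), Re(n/2), Re(1), Im(1), ..., Re(n/2-1), Im(n/2-1)].
     */
    static double[] magnitudes(byte[] fft) {
        int n = fft.length;
        if (n < 2 || n % 2 != 0) {
            throw new IllegalArgumentException("expected an even length >= 2");
        }
        double[] mag = new double[n / 2 + 1];
        mag[0] = Math.abs(fft[0]);       // 0 Hz bin has a real part only
        mag[n / 2] = Math.abs(fft[1]);   // Fs/2 bin has a real part only
        for (int k = 1; k < n / 2; k++) {
            int i = 2 * k;               // real part at i, imaginary part at i + 1
            mag[k] = Math.hypot(fft[i], fft[i + 1]);
        }
        return mag;
    }
}
```

Under the same assumption, entry k of the result corresponds to the frequency k * Fs / n, where n is the capture size.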

Jason B answered Nov 17 '22 08:11