Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Transforming Audio Samples From Time Domain to Frequency Domain

as a software engineer I am facing with some difficulties while working on a signal processing problem. I don't have much experience in this area.

What I try to do is to sample the environmental sound with 44100 sampling rate and for fixed size windows to test if a specific frequency (20KHz) exists and is higher than a threshold value.

Here is what I do according to the perfect answer in How to extract frequency information from samples from PortAudio using FFTW in C

102400 samples (2320 ms) is gathered from audio port with 44100 sampling rate. Sample values are between 0.0 and 1.0

int samplingRate = 44100;
int numberOfSamples = 102400;
float samples[numberOfSamples] = ListenMic_Function(numberOfSamples,samplingRate);

Window size or FFT Size is 1024 samples (23.2 ms)

int N = 1024;

Number of windows is 100

int noOfWindows = numberOfSamples / N;

Splitting samples to noOfWindows (100) windows each having size of N (1024) samples

float windowSamplesIn[noOfWindows][N];
for i:= 0 to noOfWindows -1 
    windowSamplesIn[i] = subarray(samples,i*N,(i+1)*N);
endfor

Applying Hanning window function on each window

float windowSamplesOut[noOfWindows][N];
for i:= 0 to noOfWindows -1 
    windowSamplesOut[i] = HanningWindow_Function(windowSamplesIn[i]);
endfor

Applying FFT on each window (real to complex conversion done inside the FFT function)

float frequencyData[noOfWindows][samplingRate/2]; 
for i:= 0 to noOfWindows -1 
    frequencyData[i] = RealToComplex_FFT_Function(windowSamplesOut[i], samplingRate);
endfor

In the last step, I use the FFT function implemented in this link: http://www.codeproject.com/Articles/9388/How-to-implement-the-FFT-algorithm ; because I cannot implement an FFT function from the scratch.

What I can't be sure is while giving N (1024) samples to FFT function as input, samplingRate/2 (22050) decibel values is returned as output. Is it what an FFT function does?

I understand that because of Nyquist Frequency, I can detect half of sampling rate frequency at most. But is it possible to get decibel values for each frequency up to samplingRate/2 (22050) Hz?

Thanks, Vahit

like image 277
vaha Avatar asked Aug 19 '12 20:08

vaha


2 Answers

See see How do I obtain the frequencies of each value in an FFT?

From a 1024 sample input, you can get back 512 meaningful frequency-levels.

So, yes, within your window, you'll get back a level for the Nyquist frequency.

The lowest frequency level you'll see is for DC (0 Hz), and the next one up will be for SampleRate/1024, or around 44 Hz, the next for 2 * SampleRate/1024, and so on, up to 512 * SampleRate / 1024 Hz.

like image 176
david van brink Avatar answered Oct 17 '22 12:10

david van brink


Since only one band is used in your FFT, I would expect your results to be tarnished by side-band effects, even with proper windowing. It might work, but you might also get false positives with some input frequencies. Also, your signal is close to your niquist, so you are assuming a fairly good signal path up to your FFT. I don't think this is the right approach.

I think a better approach to this kind of signal detection would be with a high order filter (depending on your requirements, I would guess fourth or fifth order, which isn't actually that high). If you don't know how to design a high order filter, you could use two or three second order filters in series. Designing a second order filter, sometimes called a "biquad" is described here:

http://www.musicdsp.org/files/Audio-EQ-Cookbook.txt

albeit very tersely and with some assumptions of prior knowledge. I would use a high-pass (HP) filter with corner frequency as low as you can make it, probably between 18 and 20 kHz. Keep in mind there is some attenuation at the corner frequency, so after applying a filter multiple times you will drop a little signal.

After you filter the audio, take the RMS or average amplitude (that is, the average of the absolute value), to find the average level over a time period.

This technique has several advantages over what you are doing now, including better latency (you can start detecting within a few samples), better reliability (you won't get false-positives in response to loud signals at spurious frequencies), and so on.

This post might be of relevance: http://blog.bjornroche.com/2012/08/why-eq-is-done-in-time-domain.html

like image 27
Bjorn Roche Avatar answered Oct 17 '22 12:10

Bjorn Roche