
Pitch detection using the FFT values returned by computeSpectrum()

  • I'm developing in ActionScript 3.0 for Flash Player 10.3
  • I'm using computeSpectrum() on a loaded .mp3
  • I'm listening for Event.ENTER_FRAME to take a snapshot of the current samples into a ByteArray
  • The ByteArray contains 512 values (256 for each channel). These values are the FFT spectrum, ranging from 0 to 1.
  • I can't use the peak frequency of each snapshot (as I found out!) because the highest value is not necessarily the fundamental frequency! As a result I'm getting lots of random values all over the place. Of course some of them are correct, but that's not enough!

I found out about auto-correlation...
Can someone give me an example on how I could use it?

Or links, or example scripts even from other scripting languages to get a grip on it?

Regards
initcode

initcode asked Jun 08 '11 17:06


2 Answers

Sounds like you already understand how to get an FFT spectrum, right?

Spectrum image: http://flic.kr/p/7notw6

But if you're looking for the fundamental (green dot), you can't just use the highest peak. It's not necessarily the fundamental. In my example, the actual fundamental is 100 Hz, but the highest peak is 300 Hz.

There are a lot of different ways you could find the true fundamental, and each works better in different contexts. One thread on comp.dsp mentions "FFT, cepstrum, auto/cross-correlation, AMDF/ASDF".

As a simple example, each of the red dots is 100 Hz away from its neighbor, so you could run a peak-finding algorithm and average the distance between each harmonic and the next to find the fundamental. This would fail if any peaks were missed or extra peaks were included, or if the waveform was symmetrical and so contained only odd harmonics (1f, 3f, 5f), in which case the spacing is 2f. To make it more robust you'd need to find the mode, throw away outliers, and then average. This is probably an error-prone method.
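The peak-spacing idea above can be sketched roughly as follows. This is only an illustration under assumptions: the function names, the 10% threshold, and the synthetic test tone (100 Hz fundamental, strongest partial at 300 Hz) are made up for the example, NumPy is assumed to be available, and the median is used as a simple stand-in for "find the mode, throw away outliers, average".

```python
import numpy as np

def spectral_peaks(signal, sample_rate, rel_threshold=0.1):
    """Return (frequencies, magnitudes) of local maxima in the magnitude spectrum.

    rel_threshold is a hypothetical tuning knob: peaks below this fraction
    of the largest magnitude are ignored.
    """
    windowed = signal * np.hanning(len(signal))
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    mid = magnitude[1:-1]
    # A bin counts as a peak if it beats both neighbours and the threshold.
    is_peak = (
        (mid > magnitude[:-2])
        & (mid >= magnitude[2:])
        & (mid > rel_threshold * magnitude.max())
    )
    return freqs[1:-1][is_peak], mid[is_peak]

def fundamental_from_spacing(peak_freqs):
    """Estimate the fundamental as the median gap between neighbouring peaks."""
    if len(peak_freqs) < 2:
        raise ValueError("need at least two peaks to measure a spacing")
    return float(np.median(np.diff(peak_freqs)))

# Synthetic tone (an assumption for the demo): harmonics of 100 Hz,
# with the third harmonic (300 Hz) deliberately the loudest.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
amplitudes = [0.5, 0.8, 1.0, 0.6, 0.4]
tone = sum(a * np.sin(2 * np.pi * 100 * (k + 1) * t)
           for k, a in enumerate(amplitudes))

peak_freqs, peak_mags = spectral_peaks(tone, sample_rate)
strongest_peak = float(peak_freqs[np.argmax(peak_mags)])  # the misleading answer
spacing_estimate = fundamental_from_spacing(peak_freqs)   # the harmonic spacing
print(strongest_peak, spacing_estimate)
```

On this clean tone the strongest peak should land near 300 Hz while the spacing lands near 100 Hz. On real audio the threshold would need tuning, and a tone with only odd harmonics would still report twice the true fundamental.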

You could also do an autocorrelation of the original waveform. Conceptually, this means sliding a copy of the waveform past itself and finding the delay at which it best lines up with itself (which will be one complete cycle). In practice, though, it is normally implemented with the FFT to speed it up. Autocorrelation is basically

  • IFFT(FFT(signal)⋅FFT(signal)*)

where * means the complex conjugate, which for a real signal corresponds to time reversal in the time domain. In Python, for instance:

from scipy.signal import fftconvolve
correlation = fftconvolve(sig, sig[::-1], mode='full')

and the source for fftconvolve() is relatively simple: https://github.com/scipy/scipy/blob/master/scipy/signal/signaltools.py#L133
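To turn the autocorrelation into a pitch estimate, one possible approach is sketched below. It is a rough illustration, not the answer's own code: it applies the IFFT(FFT·FFT*) formula directly with NumPy instead of calling fftconvolve(), and the function names, the "skip the zero-lag lobe" heuristic, and the synthetic test tone are all assumptions made for the example.

```python
import numpy as np

def autocorrelate_fft(signal):
    """Linear autocorrelation for lags 0..N-1 via IFFT(FFT(x) * conj(FFT(x)))."""
    n = len(signal)
    # Zero-pad to at least 2N-1 samples so the circular result does not wrap.
    size = 1 << (2 * n - 1).bit_length()
    spectrum = np.fft.rfft(signal, size)
    power = spectrum * np.conj(spectrum)
    return np.fft.irfft(power, size)[:n]

def pitch_from_autocorrelation(signal, sample_rate):
    """Estimate pitch (Hz) from the lag of the strongest non-zero-lag peak."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()
    corr = autocorrelate_fft(signal)
    # The correlation is largest at lag 0. Walk down that first lobe until
    # the curve starts rising again, then look for the highest peak after it.
    rising = np.nonzero(np.diff(corr) > 0)[0]
    if len(rising) == 0:
        raise ValueError("no periodicity found")
    start = int(rising[0])
    period = start + int(np.argmax(corr[start:]))  # period in samples
    return sample_rate / period

# Synthetic tone (an assumption for the demo): harmonics of 100 Hz,
# with the third harmonic (300 Hz) the loudest.
sample_rate = 8000
t = np.arange(2048) / sample_rate
amplitudes = [0.5, 0.8, 1.0, 0.6, 0.4]
tone = sum(a * np.sin(2 * np.pi * 100 * (k + 1) * t)
           for k, a in enumerate(amplitudes))

estimated_pitch = pitch_from_autocorrelation(tone, sample_rate)
print(estimated_pitch)
```

The estimate is quantized to whole-sample lags (here a period of 80 samples at 8000 Hz), so interpolating around the peak would be a natural refinement if finer resolution is needed.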

endolith answered Oct 05 '22 23:10


You can use the Harmonic Product Spectrum method to estimate the distance (frequency difference) between overtone peaks in a frequency spectrum (FFT results), even if some peaks are missing, as long as there are not too many spurious frequency peaks (noise).

To do a Harmonic Product Spectrum, print the FFT out on semi-transparent paper and roll it up into a cylinder (or do the equivalent in software). Wrap the cylinder tighter and tighter until the greatest number of peaks overlap. The circumference will then be a good estimate of the pitch. This works for any musical sound that has lots of harmonics, even if the fundamental pitch frequency peak is missing or weak.
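One common software formulation of a Harmonic Product Spectrum multiplies the magnitude spectrum by copies of itself compressed by 2, 3, 4, ... and picks the bin where the product is largest. The sketch below follows that formulation, which may differ in detail from the paper-cylinder picture above. The function name, the choice of four harmonics, and the synthetic test tone are assumptions for illustration.

```python
import numpy as np

def harmonic_product_spectrum(signal, sample_rate, num_harmonics=4):
    """Estimate pitch (Hz) as the bin maximizing |X(f)| * |X(2f)| * ... ."""
    windowed = signal * np.hanning(len(signal))
    magnitude = np.abs(np.fft.rfft(windowed))
    # Keep only bins whose highest tested harmonic is still inside the spectrum.
    usable = len(magnitude) // num_harmonics
    product = magnitude[:usable].copy()
    for h in range(2, num_harmonics + 1):
        # magnitude[::h][k] is the value at bin k*h, the h-th harmonic of bin k.
        product *= magnitude[::h][:usable]
    product[0] = 0.0  # ignore the DC bin
    peak_bin = int(np.argmax(product))
    return peak_bin * sample_rate / len(signal)

# Synthetic tone (an assumption for the demo): harmonics of 100 Hz with a
# weak fundamental and the third harmonic (300 Hz) the loudest.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
amplitudes = [0.5, 0.8, 1.0, 0.6, 0.4]
tone = sum(a * np.sin(2 * np.pi * 100 * (k + 1) * t)
           for k, a in enumerate(amplitudes))

hps_pitch = harmonic_product_spectrum(tone, sample_rate)
print(hps_pitch)
```

Because this version multiplies the magnitudes, a fundamental that is completely absent drives the product at that bin toward zero. A summed or log-domain variant may be more forgiving in that case.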

hotpaw2 answered Oct 06 '22 01:10