I'm designing a simple tuner, so my goal is to display a note name (A, B, F#) and the distance in cents between the theoretical note and the actual input.
I'm completely new to audio and signal processing, so I did some research and found a thing called the Fast Fourier Transform that will analyze the bytes and give me the frequency. I also found a couple of Java libraries, like Apache Commons Math and JTransforms, so I won't have to write the hard code myself.
I believed that was all, since each frequency range can be directly mapped to a note in equal temperament, but then I found this new (to me) word called pitch: it's said to be tightly related to frequency, but it is not exactly the same thing, is much more difficult to get, and belongs to the area of psychoacoustics.
So my question is: can somebody clearly outline the differences between pitch and frequency, and maybe tell me which one a tuner deals with?
Frequency is simply the number of oscillations that a wave goes through per second. Any wave which is periodic has a frequency. But usually in music, use of the term is limited to talking about sine waves, so if you hear something about a wave of frequency x, it usually means a sine wave with that many oscillations per second.
Any arbitrary wave, whether periodic or not, can be constructed by adding up sine waves of various frequencies in varying amounts (that is with various amplitudes). What the Fourier transform does is tell you which frequencies to use, and with which amplitudes, to create any given wave. The fast Fourier transform (FFT) is a particular algorithm that computes the Fourier transform of a wave, given the data representing the amplitude of the wave as a function of time.
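To make the "adding up sine waves" idea concrete, here is a minimal Java sketch. It builds a wave from two sine components and then measures how much of a given frequency the wave contains by correlating it with a cosine and a sine at that frequency, which is what each output value of a Fourier transform amounts to. The class name `SineContent`, the method names, the 8000 Hz sample rate, and the amplitudes are all made up for illustration; a real tuner would use an FFT library rather than this direct loop.

```java
// Sketch only: names, sample rate, and amplitudes are illustrative assumptions.
final class SineContent {

    /** Builds n samples of a sum of sine waves; freqs[i] (in Hz) gets amplitude amps[i]. */
    static double[] synthesize(double[] freqs, double[] amps, double sampleRate, int n) {
        double[] signal = new double[n];
        for (int t = 0; t < n; t++) {
            for (int i = 0; i < freqs.length; i++) {
                signal[t] += amps[i] * Math.sin(2 * Math.PI * freqs[i] * t / sampleRate);
            }
        }
        return signal;
    }

    /** Estimates the amplitude of the sine component at freq (Hz) by direct correlation. */
    static double amplitudeAt(double[] signal, double freq, double sampleRate) {
        double re = 0, im = 0;
        for (int t = 0; t < signal.length; t++) {
            double angle = 2 * Math.PI * freq * t / sampleRate;
            re += signal[t] * Math.cos(angle);
            im -= signal[t] * Math.sin(angle);
        }
        // Scale so that a pure sine of amplitude A reports A.
        return 2 * Math.hypot(re, im) / signal.length;
    }

    public static void main(String[] args) {
        double sampleRate = 8000;
        // One second of audio: 440 Hz at amplitude 1.0 plus 880 Hz at amplitude 0.5.
        double[] wave = synthesize(new double[] {440, 880}, new double[] {1.0, 0.5}, sampleRate, 8000);
        System.out.println(amplitudeAt(wave, 440, sampleRate));
        System.out.println(amplitudeAt(wave, 880, sampleRate));
        System.out.println(amplitudeAt(wave, 660, sampleRate));
    }
}
```

With this input the three printed values should come out close to 1.0, 0.5, and 0, because the window holds a whole number of cycles of each component. With real audio the window rarely lines up that neatly, so the energy smears across neighbouring frequencies.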
When you hear a musical note played by an instrument, it doesn't consist of just a single frequency. Instead, what you get is a combination of different multiples of a fundamental frequency, in different amounts. For example, a flute playing a particular note might produce a combination of
and so on. On the other hand, a trumpet playing the same note might produce a combination of
and so on. (Those are not the actual relative amplitudes for those instruments; I just made up some example numbers.) So in your tuner application, when you run the FFT on incoming data, you will find multiple peaks in the output at various frequencies, depending on which instrument is being tuned. The point is that the output of the FFT will not just be a single number; it won't simply tell you "this instrument is playing a note at 440 Hz."
Now we get to pitch, which is a slightly more nebulous concept. The pitch of a note is basically what a person actually hears when exposed to that note. For many instruments, the pitch is correlated to the fundamental frequency being emitted by the instrument. But depending on the relative amplitudes of the higher frequencies, a person might perceive two instruments to have different pitches even if they are actually playing the same note.
Fortunately, if you're just making a simple tuner, you don't have to worry about pitch at all. The point of a tuner is to minimize beats between different instruments, and beats are caused by the actual frequencies, not the perceived pitches. A trumpet and a flute both playing with a fundamental frequency of 440 Hz will not exhibit beats because the differences between all their frequencies are multiples of 440 Hz, even if the untrained ear might think one of them is higher-pitched than the other.
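Once you have a fundamental frequency, turning it into a note name and a cents offset is plain arithmetic in equal temperament: each semitone is a factor of 2^(1/12), and a cent is a hundredth of a semitone. Here is a rough Java sketch, assuming A4 = 440 Hz as the reference and using MIDI-style note numbers (69 for A4) purely as a convenient index; the class and method names are hypothetical.

```java
// Sketch only: assumes A4 = 440 Hz and twelve-tone equal temperament.
final class NoteMapper {
    private static final String[] NAMES =
        {"C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"};
    private static final double A4_HZ = 440.0; // assumed tuning reference
    private static final int A4_NUMBER = 69;   // MIDI-style number for A4

    /** Position of hz on the semitone scale, as a fractional note number. */
    static double fractionalNote(double hz) {
        return A4_NUMBER + 12 * Math.log(hz / A4_HZ) / Math.log(2);
    }

    /** Note number of the closest equal-tempered note. */
    static int nearestNote(double hz) {
        return (int) Math.round(fractionalNote(hz));
    }

    /** Name of a note number, ignoring the octave. */
    static String noteName(int note) {
        return NAMES[Math.floorMod(note, 12)];
    }

    /** Distance from the closest note in cents; positive means sharp, negative means flat. */
    static double centsOff(double hz) {
        return 100 * (fractionalNote(hz) - nearestNote(hz));
    }

    public static void main(String[] args) {
        double hz = 446.0;
        System.out.println(noteName(nearestNote(hz)) + " " + centsOff(hz) + " cents");
    }
}
```

For example, an input of 446 Hz maps to A and is roughly 23 cents sharp.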
Pitch is about the periodicity of the signal. It's true that it's based on psychoacoustics, but it is very accurate to say we are detecting the pseudo-periodicities of the signal when we hear a pitch.
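One common way to look for periodicity directly is autocorrelation: slide the signal against a delayed copy of itself and find the delay at which the two match best. To be clear, that technique is my choice for illustration here, not something this answer prescribes, and the Java below is a bare-bones sketch with hypothetical names. It also needs a search range (`minHz` to `maxHz`), because a periodic signal matches itself equally well at two periods, three periods, and so on.

```java
// Sketch only: plain autocorrelation, chosen for illustration; names are hypothetical.
final class PeriodEstimator {

    /**
     * Estimates the repetition rate of the signal in Hz by finding the lag, among
     * those corresponding to [minHz, maxHz], with the highest average autocorrelation.
     */
    static double estimateHz(double[] signal, double sampleRate, double minHz, double maxHz) {
        int minLag = (int) Math.floor(sampleRate / maxHz);
        int maxLag = (int) Math.ceil(sampleRate / minHz);
        int bestLag = minLag;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int lag = minLag; lag <= maxLag && lag < signal.length; lag++) {
            double sum = 0;
            for (int t = 0; t + lag < signal.length; t++) {
                sum += signal[t] * signal[t + lag];
            }
            // Average over the overlapping samples so longer lags are not penalised.
            double score = sum / (signal.length - lag);
            if (score > bestScore) {
                bestScore = score;
                bestLag = lag;
            }
        }
        return sampleRate / bestLag;
    }

    public static void main(String[] args) {
        double sampleRate = 8000;
        double[] signal = new double[4000];
        for (int t = 0; t < signal.length; t++) {
            // Weak 200 Hz fundamental plus a much stronger 400 Hz harmonic.
            signal[t] = 0.3 * Math.sin(2 * Math.PI * 200 * t / sampleRate)
                      + 1.0 * Math.sin(2 * Math.PI * 400 * t / sampleRate);
        }
        System.out.println(estimateHz(signal, sampleRate, 150, 400));
    }
}
```

In the example the 400 Hz component is the loudest, but the waveform as a whole only repeats every 40 samples, so the estimate comes out at 200 Hz. The resolution is limited to whole-sample lags, which is too coarse for a real tuner without some interpolation.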
The spectrum is the breakdown of the audio signal into a sum of sines and cosines of various frequencies. As David pointed out, usually when people talk about "frequency" in a musical context, they are referring to the frequency of these sine waves that you broke the signal into. So the spectrum shows which of these sine components are large, and what frequencies they are at. The spectrum broadly represents the "high frequency" you hear in a hi-hat, and the "low frequency" you hear in the thud of a rock hitting the ground. Strictly speaking, neither of these sounds is periodic, nor do you perceive a pitch, but what you hear is the relative magnitudes of the high-frequency and low-frequency parts of the spectrum.
The Fourier Transform (or DFT/FFT) is the mathematical algorithm by which you break down your audio signal into a sum of sines and cosines. So by looking at the magnitudes of the sines and cosines that you get out of the FFT, you get the spectrum. A naive way of guessing the pitch is to look directly at the spectrum of a short piece of audio and assume that the biggest sine component of your signal corresponds to its fundamental periodicity.
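The naive approach can be sketched in a few lines of Java: compute the magnitude of each frequency bin, pick the largest, and convert the bin index `k` to Hz as `k * sampleRate / n`. I'm using a slow, direct DFT loop here only to keep the example self-contained; in practice you would get the same magnitudes from an FFT library. The class name and the test signal are made up for illustration.

```java
// Sketch only: direct O(n^2) DFT for clarity; names and signal are illustrative.
final class NaivePeakPicker {

    /** Magnitudes of DFT bins 0 .. n/2 - 1 of a real-valued signal. */
    static double[] magnitudes(double[] signal) {
        int n = signal.length;
        double[] mag = new double[n / 2];
        for (int k = 0; k < mag.length; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += signal[t] * Math.cos(angle);
                im -= signal[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    /** Frequency in Hz of the bin with the largest magnitude, skipping bin 0 (the DC offset). */
    static double strongestHz(double[] signal, double sampleRate) {
        double[] mag = magnitudes(signal);
        int best = 1;
        for (int k = 2; k < mag.length; k++) {
            if (mag[k] > mag[best]) {
                best = k;
            }
        }
        return best * sampleRate / signal.length;
    }

    public static void main(String[] args) {
        double sampleRate = 8192; // with 1024 samples, each bin is 8 Hz wide
        double[] signal = new double[1024];
        for (int t = 0; t < signal.length; t++) {
            // Weak 200 Hz fundamental plus a stronger 400 Hz harmonic.
            signal[t] = 0.3 * Math.sin(2 * Math.PI * 200 * t / sampleRate)
                      + 1.0 * Math.sin(2 * Math.PI * 400 * t / sampleRate);
        }
        System.out.println(strongestHz(signal, sampleRate));
    }
}
```

The example also shows why this is only a guess: the signal repeats 200 times per second, but the biggest component is the harmonic at 400 Hz, so the naive method reports 400 Hz. The answer is also quantised to the bin width (8 Hz here), which is far too coarse for measuring cents.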
I wrote up a very long answer to another post that I think will answer your questions of how to extract pitch: https://stackoverflow.com/a/7211695/94102 I'd strongly suggest reading it. It will give you the tools and understanding you need to make a high quality tuner.