I am currently working on my fourth year project (computer science) which involves the automatic transcription of music -> sheet music. I am doing it in Matlab at the moment but will have to be converted to java at some stage.
My problem: I have my program returning the correct notes for pure sine tones, I have now encountered a problem when it comes to the retrieval of the fundamental frequency from a note played by a natural instrument. With certain notes, the peak representing the fundamental of the note seems to be missing entirely. For example when I play a G3 note from garageband, it is shown as a G4, as only the 1st, 3rd, 5th and 7th harmonics are appearing in my plot. I tried to add the image but as this is my first post it wouldn't allow me. Any pointers in the right direction would be greatly appreciated.
This is not unusual. It's very common for the fundamental to be missing, or nearly so, for male voices, big string instruments, and many other pitched sound sources.
That makes using an FFT peak result alone extremely poor at determining musical notes from actual musical instruments, as opposed to sinewave function generators. That's because pitch is different from peak spectral frequency. Pitch is a psycho-acoustic perceptual phenomena. So that's what you need to read up on. There are tons of research papers on the subject.
So you need to look at a completely different set of algorithms. Try cepstrums (cepstral analysis), harmonic product spectrums, autocorrelation and similar (AMDF, ASDF, etc. lags), RAPT (Robust Algorithm for Pitch Tracking), YAAPT, etc.
ADDED: I wrote up a more detailed explanation of pitched sounds with missing fundamentals in a blog post.
It isn't unusual for the fundamental frequency of a musical instrument note to be attenuated relative to the harmonics (also known as overtones), and in some cases the fundamental frequency magnitude may be well below the magnitude of the overtones.
Take a look at this frequency/magnitude plot of a real bassoon (not a synthesized bassoon) playing a G3 note. Observe the attenuated fundamental (196.39 Hz) relative to the first harmonic. But also observe that all the integer-multiple harmonics are visible upto the 10th harmonic. Actually, many more harmonics are present, but they aren't visible on this linear magnitude plot.
In your case, the additional fact that your G3 musical note's spectrum is showing only the 1st, 3rd, 5th and 7th harmonics suggests that something is wrong. Your test sound appears to be synthesized, so the problem could be with the way the sound was synthesized.
The spectra of real musical instruments typically show the fundamental frequency and many integer-multiple harmonics such as 1, 2, 3 and so on, as seen above. And the harmonics typically extend well above 6KHz for most notes played on most instruments.
Take a look at this frequency/decibel_magnitude plot of a real bassoon (not a synthesized bassoon) playing a G3 note. Observe that a total of 37 integer-multiple harmonics are present, until they dissappear at the noise floor near -104 dB.
You can listen to this bassoon sample and see its spectrum here: Bassoon musical instrument spectrum
Also read this detailed post on analytical approaches to autonomous musical transcription
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With