Librosa pitch tracking - STFT

Tags: python, signal-processing, pitch-tracking, librosa

I am using this algorithm to detect the pitch of this audio file. As you can hear, it is an E2 note played on a guitar with a bit of noise in the background.

I generated this spectrogram using STFT:

And I am using the algorithm linked above like this:

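import sys
import numpy as np
import librosa

y, sr = librosa.load(filename, sr=40000)
pitches, magnitudes = librosa.core.piptrack(y=y, sr=sr, fmin=75, fmax=1600)

# threshold=np.nan is rejected by recent NumPy; sys.maxsize prints the full array
np.set_printoptions(threshold=sys.maxsize)
print(pitches[np.nonzero(pitches)])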

As a result, I am getting pretty much every possible frequency between my fmin and fmax. What do I have to do with the output of the piptrack method to discover the fundamental frequency of a time frame?

UPDATE

I am still not sure what those 2D arrays represent, though. Let's say I want to find out how strong 82 Hz is in frame 5. I could do that using the STFT function, which simply returns a 2D matrix (the one used to plot the spectrogram).

However, piptrack does something additional which could be useful, and I don't really understand what. The documentation says that pitches[f, t] contains the instantaneous frequency at bin f, time t. Does that mean that, if I want to find the strongest frequency at time frame t, I have to:

1. Go to the magnitudes[:, t] array and find the bin with the maximum magnitude.
2. Assign that bin to a variable f.
3. Look up pitches[f, t] to find the frequency that belongs to that bin?


asked May 09 '17 19:05

pavlos163

2 Answers

Pitch detection is a tricky topic and is often counter-intuitive. I'm not wild about the way the source code is documented for this particular function -- it almost seems like the developer is confusing a 'harmonic' with a 'pitch'.

When a single note (a 'pitch') is played on a guitar or piano, what we hear is not one frequency of vibration but a composite of vibrations at mathematically related frequencies, called harmonics. Typical pitch tracking techniques search the output of an FFT for energy in the bins that correspond to the expected harmonic frequencies. For instance, if we press the Middle C key on the piano, the composite's harmonics start at the fundamental frequency of 261.6 Hz; the 2nd harmonic is at about 523 Hz, the 3rd at about 785 Hz, the 4th at about 1046 Hz, and so on. The higher harmonics are integer multiples of the fundamental (e.g. 2 x 261.6 ≈ 523, 3 x 261.6 ≈ 785, 4 x 261.6 ≈ 1046). However, the frequencies of musical notes are logarithmically spaced, while the FFT's bins are linearly spaced, so the FFT's frequency resolution is often too coarse at the lower frequencies.
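To put rough numbers on the resolution problem, here is a small sketch. It assumes the 40000 Hz sample rate from the question and an FFT size of 2048 samples (an assumption for illustration; check the n_fft actually in use), and the helper names are made up for this example.

```python
def harmonics(f0, count):
    """First `count` harmonics of f0 (integer multiples, fundamental included)."""
    return [n * f0 for n in range(1, count + 1)]

def fft_bin_spacing(sr, n_fft):
    """Distance in Hz between neighbouring FFT bins (linear spacing)."""
    return sr / n_fft

def semitone_gap(f):
    """Distance in Hz from f to the next semitone up (equal temperament)."""
    return f * (2 ** (1 / 12) - 1)

# Harmonics of Middle C, rounded for display
print([round(h, 1) for h in harmonics(261.6, 4)])

sr, n_fft = 40000, 2048   # n_fft is an assumed value
e2 = 82.41                # approximate frequency of the E2 note in the question

print("FFT bin spacing (Hz):", fft_bin_spacing(sr, n_fft))
print("Semitone gap at E2 (Hz):", round(semitone_gap(e2), 2))
```

With these assumed settings a single FFT bin is several semitones wide around E2, which is why raw bin centres alone cannot separate low notes.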

For that reason, when I wrote a pitch detecting application (PitchScope Player), I chose to create a logarithmically spaced DFT rather than an FFT, so I could focus on the precise frequencies of interest for music (see the attached diagram of my custom DFT from 3 seconds of a guitar solo). If you are serious about pursuing pitch detection, you should consider reading more on the topic, looking at other sample code (mine is linked below), and writing your own functions to measure frequency.

https://en.wikipedia.org/wiki/Transcription_(music)#Pitch_detection

https://github.com/CreativeDetectors/PitchScope_Player
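As a rough illustration of the idea of a logarithmically spaced DFT (this is not the PitchScope Player code, only a minimal sketch with hypothetical helper names), one can evaluate the DFT sum directly at semitone-spaced frequencies instead of at the FFT's linear bins. The test signal is a synthetic sine near E2, and the 8000 Hz sample rate and 0.5 s length are arbitrary choices for the example.

```python
import cmath
import math

def note_frequencies(f_low, n_notes):
    """Semitone-spaced frequencies starting at f_low (equal temperament)."""
    return [f_low * 2 ** (k / 12) for k in range(n_notes)]

def dft_magnitude(samples, sr, freq):
    """Normalised magnitude of the DFT sum evaluated at one arbitrary frequency."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / sr)
              for n, x in enumerate(samples))
    return abs(acc) / len(samples)

sr = 8000                 # arbitrary sample rate for the synthetic example
e2 = 82.41                # approximate E2 frequency
samples = [math.sin(2 * math.pi * e2 * n / sr) for n in range(sr // 2)]  # 0.5 s

freqs = note_frequencies(65.41, 12)   # C2 up to B2, one probe per semitone
mags = [dft_magnitude(samples, sr, f) for f in freqs]
strongest = freqs[mags.index(max(mags))]
print("Strongest note frequency (Hz):", round(strongest, 2))
```

For this clean sine the strongest probe should land on the E2 entry of the list. A real guitar recording has harmonics and noise, so a practical detector would also have to weigh evidence across harmonics.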

[Image: custom DFT of 3 seconds of a guitar solo]


answered Oct 06 '22 01:10

James Paul Millard

Turns out the way to pick the pitch at a certain frame t is simple:

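def detect_pitch(pitches, magnitudes, t):
  # Bin with the largest magnitude in frame t
  index = magnitudes[:, t].argmax()
  # Instantaneous frequency (Hz) that piptrack reports for that bin
  pitch = pitches[index, t]

  return pitch

Here pitches and magnitudes are the two arrays returned by piptrack. First you get the bin of the strongest frequency by looking at the magnitudes array, and then you read the pitch at pitches[index, t]. Note that the strongest bin is not necessarily the fundamental; for some notes it can be one of the harmonics.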
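The same selection can be applied to every frame at once. The sketch below uses small hand-made arrays as a stand-in for the piptrack output (3 bins by 2 frames, values invented for illustration), and the function name pitch_per_frame is hypothetical.

```python
import numpy as np

def pitch_per_frame(pitches, magnitudes):
    """Return the pitch (Hz) of the strongest bin in every frame."""
    pitches = np.asarray(pitches)
    magnitudes = np.asarray(magnitudes)
    # Row index of the largest magnitude in each column (one column per frame)
    best_bins = magnitudes.argmax(axis=0)
    frames = np.arange(magnitudes.shape[1])
    return pitches[best_bins, frames]

# Invented stand-in data with shape (bins, frames)
pitches = np.array([[0.0, 82.3],
                    [82.5, 0.0],
                    [165.0, 164.8]])
magnitudes = np.array([[0.0, 0.9],
                       [0.8, 0.0],
                       [0.3, 0.4]])

track = pitch_per_frame(pitches, magnitudes)
print(track)
print("Median pitch (Hz):", np.median(track))
```

Frames in which every magnitude is zero (silence, or nothing between fmin and fmax) come back as a pitch of 0, so it is probably worth filtering those out before taking a median or similar summary.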

answered Oct 05 '22 23:10

pavlos163
