I am developing a speech recognition system from scratch using Octave. I am trying to detect phonemes by detecting differences in frequency. So far I have read in a wav file, organized the values into blocks, and applied fft to the overall data. Afterwards, I plot the new data with plot(abs(real(fft(q)))), which creates this graph:
How could I get the frequency values (the peaks of the graph)?
Look for points where the first-order difference of the signal goes from positive to negative. Those are your peak points. To find the most prominent peaks, compute the second-order difference at the points obtained from the first-order difference and select the ones with the highest magnitude.
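A minimal sketch of that idea, assuming a small made-up vector x (the values and the variable names are illustrative only). It finds the sign changes in the first-order difference, then ranks the resulting peaks by the magnitude of the second-order difference at each one:

```octave
% Hypothetical toy data with two local maxima (values chosen for illustration).
x = [0 2 5 3 1 4 9 6 2];

d1 = diff(x);                          % first-order difference (slope)

% A peak is where the slope goes from positive to negative.
% d1(k) is the slope between x(k) and x(k+1), hence the +1 offset.
idx = find(d1(1:end-1) > 0 & d1(2:end) < 0) + 1;

% Second-order difference; d2(k) is centred on x(k+1).
d2 = diff(x, 2);
sharpness = abs(d2(idx - 1));          % magnitude at each detected peak

% Rank the peaks from most to least prominent by that magnitude.
[~, order] = sort(sharpness, 'descend');
ranked_idx = idx(order);
```

Here idx holds the peak positions in their original order and ranked_idx holds the same positions sorted by how sharply the signal bends at each one.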
With MATLAB's findpeaks, if you give your frequency vector as the second argument, the second output, locs, will be in terms of frequency. See the documentation example Find Peaks with Minimum Separation for details.
[pks,frqs] = findpeaks(abs(X),freq);
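If you do not already have a frequency vector, one way to build it is from the sample rate and the FFT length. The sketch below assumes a synthetic two-tone signal and a sample rate of 8000 Hz; fs, N, and the tone frequencies are made up for illustration. It uses abs(fft(q)) rather than abs(real(fft(q))), since taking the real part first discards the imaginary component of each bin. The peak test is a plain three-point comparison with a relative threshold, so it does not need findpeaks:

```octave
fs = 8000;                          % assumed sample rate in Hz
N  = 800;                           % number of samples (0.1 s of audio)
t  = (0:N-1) / fs;
q  = sin(2*pi*900*t) + 0.5*sin(2*pi*1250*t);   % synthetic test signal

X    = abs(fft(q));                 % magnitude of each FFT bin
freq = (0:N-1) * fs / N;            % frequency of each bin in Hz

% Keep only the bins from 0 Hz up to the Nyquist frequency.
half = 1:(floor(N/2) + 1);
X    = X(half);
freq = freq(half);

% Three-point peak test, ignoring bins below 25% of the largest magnitude.
mid = X(2:end-1);
idx = find(mid > X(1:end-2) & mid > X(3:end) & mid > 0.25*max(X)) + 1;

peak_freqs = freq(idx);             % frequencies of the detected peaks, in Hz
```

For this synthetic signal, peak_freqs contains the two tone frequencies, 900 Hz and 1250 Hz. With real speech the tones will not fall exactly on bin centres, so expect the detected frequencies to be accurate only to within about fs/N.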
The peaks are the frequencies at which the vibration amplitude is maximal. Here, they appear to be at about 900 Hz and about 1200-1300 Hz.
If you don't have access to findpeaks, the basic premise behind how it works is that for each point in your signal, it searches a three-element window that is centred at this point and checks whether the centre of this window is larger than the left and right elements of this window. You want to be able to find both positive and negative peaks, so you'd need to check the absolute value.
As such, what you can do is make two additional signals that shift the signal to the left by 1 and to the right by 1. When we do this, we will actually be checking for peaks starting at the second element in your signal, in order to make room for looking to the left. We keep checking up until the second-last element, in order to make room for looking to the right. Therefore, we will actually be checking for peaks on a version of the signal with N - 2 elements, where N is the length of your signal. When we create the left-shifted signal, we extract from the first element of the signal up until the third-last element. When we create the right-shifted signal, we extract from the third element up until the last element. The original signal will simply have its first and last elements removed.
By checking for peaks this way, we lose the first and last points of your data, but that should be acceptable as there most likely won't be any peaks at the beginning and at the end. After creating all of these signals, simply use logical indexing to see whether the values in the original signal (without the first and last elements) are larger than the values of the other two signals in their corresponding positions.
As such, supposing your signal was stored in f, you would do the following:
f1 = abs(f(2:end-1)); %// Original signal
f2 = abs(f(1:end-2)); %// Left shift
f3 = abs(f(3:end)); %// Right shift
idx = find(f1 > f2 & f1 > f3) + 1; %// Get the locations of where we find our peaks
idx will contain the index locations of where the peaks occur. Bear in mind that we started searching for peaks at the second position, so you need to add 1 to account for this shift. If you wanted to find the actual time (or frequency in your case) values, you would just use idx to index into the time (or frequency) array that was used to generate your signal. As an artificial case, let's generate a sinusoid from 0 to 3 seconds with a frequency of 1 Hz:
t = 0 : 0.01 : 3;
f = sin(2*pi*t);
Now, if we ran the above code with this signal, we'd find the locations of our peaks. We can then use these locations to index into t and f and plot the signal as well as where we have detected our peaks. Therefore:
plot(t, f, t(idx), f(idx), 'r.')
This is what I get:
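For reference, here is the whole example in one runnable piece, a sketch that simply combines the snippets above. The names peak_times and peak_values are added for illustration. Because the comparison is done on the absolute value, the troughs of the sinusoid are reported alongside the crests:

```octave
t = 0 : 0.01 : 3;                      % 0 to 3 seconds in 10 ms steps
f = sin(2*pi*t);                       % 1 Hz sinusoid

f1 = abs(f(2:end-1));                  % original signal, ends removed
f2 = abs(f(1:end-2));                  % left neighbours
f3 = abs(f(3:end));                    % right neighbours
idx = find(f1 > f2 & f1 > f3) + 1;     % +1 because the search starts at element 2

peak_times  = t(idx);                  % where the peaks occur, in seconds
peak_values = f(idx);                  % signed value at each peak

% Print one "time: value" pair per detected peak.
printf('%.2f s: %+.2f\n', [peak_times; peak_values]);
```

The six detected points fall on every quarter-cycle extremum between 0 and 3 seconds, alternating between crests and troughs. If you only want the crests, drop the abs calls.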
Bear in mind that this is a very simple way of detecting peaks, but that is essentially what is done in findpeaks. The above code finds every peak, so it would report dozens of them in your graph, because there are local maxima all over your spectrum. You probably want to determine where the strong peaks are located. What people usually do is use a threshold that specifies how large a peak must be before it counts as valid. As such, you can enforce a threshold and do something like this:
thresh = ... ; %// Define threshold here
idx = find(f1 > f2 & f1 > f3 & f1 > thresh) + 1; %// Get the locations of where we find our peaks
In your case for your graph, you may want to set this so that you find any peaks whose magnitude is larger than 10 perhaps.
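To see what the threshold changes, here is a small sketch on made-up magnitudes (the vector f and the names all_idx and strong_idx are hypothetical, chosen only for illustration), using the value of 10 suggested above:

```octave
f = [0 1 0 12 0 2 0 15 3 0 1 0];       % made-up magnitudes for illustration

f1 = abs(f(2:end-1));                  % original signal, ends removed
f2 = abs(f(1:end-2));                  % left neighbours
f3 = abs(f(3:end));                    % right neighbours

% Every local maximum, however small.
all_idx = find(f1 > f2 & f1 > f3) + 1;

% Only the local maxima whose magnitude exceeds the threshold.
thresh = 10;
strong_idx = find(f1 > f2 & f1 > f3 & f1 > thresh) + 1;
```

Without the threshold the test reports five peaks (positions 2, 4, 6, 8, and 11); with it, only the two large ones at positions 4 and 8 remain. A threshold expressed as a fraction of max(abs(f)) may be easier to reuse across recordings of different loudness.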
There are a lot of other things that findpeaks does, such as filtering out noisy peaks and some other robustness measures. If you want to use findpeaks, you need to make sure that you install the signal package. You can simply use pkg install from the Octave command prompt to install it. Specifically, try this:
pkg install -forge signal
Once you install the signal package, you can load it into the Octave environment by doing:
pkg load signal
If you have to install dependencies, it'll tell you when you try to install the signal package. Check out this link for more details: https://www.gnu.org/software/octave/doc/interpreter/Installing-and-Removing-Packages.html
mkoctfile stands for making / compiling an Octave file. If you don't have mkoctfile, make sure you have the most recent version of Octave installed. To keep things simple, I recommend installing either Homebrew or MacPorts and getting Octave that way. Once you install it, you should be able to get mkoctfile working. If you still can't, you may need to have a compatible compiler installed. The easy approach is to install the Command Line Developer tools from Xcode. Go to this link then go to Additional Tools.
Good luck!
You can use the findpeaks function from the Octave signal package:
http://octave.sourceforge.net/signal/function/findpeaks.html