
How to make MFCC algorithm?

I want to implement the Mel-Frequency Cepstrum (MFCC) algorithm, but there are some things that I don't understand.

After the FFT is done, we need to "Map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows."

I know how to calculate the triangles and I also know how to convert to the mel scale. I simply don't know what to do with them.

If the triangles are defined, how do I map the power of the spectrum obtained above onto the mel scale?

Is it like this: sum the frequencies inside the triangle and then pass the result to the mel scale? Or sum the frequencies inside the triangle according to a weight value (defined by the height of the triangle at that point) and then pass the result to the mel scale? Or pass all the frequencies inside the triangle to the mel scale according to the weight value? Or something else?

Can anyone clarify this for me?

aF. asked Oct 28 '09 15:10



1 Answer

I think this step of the process is a little weird and doesn't make complete sense (to me anyway). The centers of the filter bands are equally spaced along the mel scale, but the filters themselves are triangles on the linear frequency scale, i.e. just like the figure here.
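To make the spacing concrete, here is a minimal sketch in plain Python. It assumes the commonly used conversion mel = 2595 * log10(1 + f/700) (other mel formulas exist), and the function names, the 20-filter count, and the 0 to 8000 Hz range are just illustrative choices, not something taken from the question.

```python
import math

def hz_to_mel(f_hz):
    # One common mel formula; treat it as an assumption, other variants exist.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel above.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_edges(n_filters, f_low, f_high):
    """Return n_filters + 2 frequencies in Hz, equally spaced on the mel scale.

    Filter k uses edges[k] as its left foot, edges[k + 1] as its peak,
    and edges[k + 2] as its right foot.
    """
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + i * step) for i in range(n_filters + 2)]

def triangle_weight(f_hz, left, center, right):
    """Height at f_hz of a unit-peak triangle drawn on the linear Hz axis."""
    if left < f_hz <= center:
        return (f_hz - left) / (center - left)
    if center < f_hz < right:
        return (right - f_hz) / (right - center)
    return 0.0

# Illustrative values only: 20 filters covering 0 Hz to 8000 Hz.
edges = mel_band_edges(n_filters=20, f_low=0.0, f_high=8000.0)
```

The edges are evenly spaced in mel, so in Hz the triangles get wider as frequency increases. The mel conversion is only used to place the triangles; the spectrum values themselves stay on the linear frequency axis.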

Then calculate the weighted sum using these triangles along the linear x-axis. (For this step, I think that some approaches normalize by the filter triangle's area and some don't. I'm honestly not sure about the final consequences, though I suspect it doesn't matter much except for the final interpretation, since the results are all relative comparisons anyway. One maintains total energy, and the other gives equally weighted contributions per band.) Then take the log of this (which converts the overall volume factor to an offset).

Edit: To be more clear on applying the filters... Each triangle represents a separate filter, producing a separate weighted sum. If there are twenty filters in your filter bank, there will be twenty triangles, and twenty weighted sums to calculate. To apply each filter, for each x-axis value multiply the filter value at that x-location by the function value at that x-location, and add this to the sum for that particular filter. Most x-axis values will have two filters present there, so each x-location contributes to two filters.
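As a rough sketch of that weighted sum in plain Python: the function name `apply_filterbank`, the `normalize` switch, the small `floor` used to avoid log(0), and the 7-bin toy spectrum are all made up for illustration. Real filters would be sampled at the FFT bin frequencies.

```python
import math

def apply_filterbank(power, filters, normalize=False, floor=1e-10):
    """Return one log energy per filter.

    power:   power-spectrum values, one per FFT bin.
    filters: list of filters, each a list of triangle heights, one per FFT bin.
    """
    log_energies = []
    for weights in filters:
        # Multiply filter height by spectrum value at each bin and accumulate.
        total = sum(w * p for w, p in zip(weights, power))
        if normalize:
            # Optional variant: divide by the triangle's (discrete) area.
            area = sum(weights)
            if area > 0:
                total /= area
        # Clamp before the log so an empty band does not produce log(0).
        log_energies.append(math.log(max(total, floor)))
    return log_energies

# Toy data: 7 bins and two overlapping triangles. Bin 3 feeds both filters.
power = [1.0, 2.0, 4.0, 2.0, 1.0, 2.0, 4.0]
filters = [
    [0.0, 0.5, 1.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.5, 1.0, 0.5, 0.0],
]
log_mel = apply_filterbank(power, filters)
```

Here the weighted sums are 6 and 3, so `log_mel` holds log(6) and log(3). Multiplying every value in `power` by a constant gain adds log(gain) to each output, which is the "volume factor becomes an offset" point above.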

tom10 answered Oct 12 '22 00:10