MATLAB Murphy's HMM Toolbox

I am trying to learn HMM-GMM implementation and have created a simple model to detect certain sounds (animal calls, etc.).

I am trying to train an HMM (Hidden Markov Model) with GMMs (Gaussian Mixtures) in MATLAB.

I have a few questions that I could not find any info about.

1) Should the mhmm_em() function be called in a loop for each HMM state, or is that done automatically?

Such as:

for each state
    Initialize GMMs and get parameters (use mixgauss_init.m)
end
Train HMM with EM (use mhmm_em.m)

2)

[LL, prior1, transmat1, mu1, Sigma1, mixmat1] = ...
                            mhmm_em(MFCCs, prior0, transmat0, mu0, Sigma0, mixmat0, 'max_iter', M);

The last parameter: should it be the number of Gaussians or number_of_states - 1?

3) If we are looking for the maximum likelihood, then where does Viterbi come into play?

Say I want to detect a certain type of animal/human call after training my model with the acoustic feature vectors I have extracted; do I still need the Viterbi algorithm in test mode?

This part is a little confusing to me, and I would highly appreciate an explanation.

Any comments on the code in terms of the HMM-GMM logic would also be appreciated.

Thanks

Here is my MATLAB routine:

O = 21;            % Number of coefficients per observation vector (feature dimension)
M = 10;            % Number of Gaussian mixtures
Q = 3;             % Number of states (left to right)
%  MFCC Parameters
Tw = 128;           % analysis frame duration (ms)
Ts = 64;           % analysis frame shift (ms)
alpha = 0.95;      % preemphasis coefficient
R = [ 1 1000 ];    % frequency range to consider
f_bank = 20;       % number of filterbank channels 
C = 21;            % number of cepstral coefficients
L = 22;            % cepstral sine lifter parameter(?)

%Training
[speech, fs, nbits ] = wavread('Train.wav');
[MFCCs, FBEs, frames ] = mfcc( speech, fs, Tw, Ts, alpha, @hamming, R, f_bank, C, L ); % @hamming: window function handle
cov_type = 'full'; % the covariance type, chosen as 'full' for the Gaussians
prior0 = normalise(rand(Q,1));        % random initial state distribution
transmat0 = mk_stochastic(rand(Q,Q)); % random row-stochastic transition matrix
[mu0, Sigma0] = mixgauss_init(Q*M, MFCCs, cov_type, 'kmeans'); % k-means init of all Q*M components from the training features

mu0 = reshape(mu0, [O Q M]);
Sigma0 = reshape(Sigma0, [O O Q M]);
mixmat0 = mk_stochastic(rand(Q,M));
[LL, prior1, transmat1, mu1, Sigma1, mixmat1] = ...
mhmm_em(MFCCs, prior0, transmat0, mu0, Sigma0, mixmat0, 'max_iter', M);

%Testing
loglik = zeros(1, length(filelist)); % filelist: struct array of test files (e.g. from dir)
for i = 1:length(filelist)
  fprintf('Processing %s\n', filelist(i).name);
  [speech_tst, fs, nbits ] = wavread(filelist(i).name);
  [MFCCs, FBEs, frames ] = ...
   mfcc( speech_tst, fs, Tw, Ts, alpha, @hamming, R, f_bank, C, L );
  loglik(i) = mhmm_logprob( MFCCs, prior1, transmat1, mu1, Sigma1, mixmat1);
end
[Winner, Winner_idx] = max(loglik);
asked Oct 31 '14 by bluemustang

1 Answer

1) No. EM estimates the model as a whole after you have initialized it with k-means; it does not estimate the states separately.
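For reference, here is a minimal sketch of that flow using the same variables as your code: one k-means initialization over all Q*M mixture components, then a single EM call over the whole model.

% Initialize all Q*M mixture components at once from the training features,
% then let EM re-estimate every state's parameters jointly:
[mu0, Sigma0] = mixgauss_init(Q*M, MFCCs, cov_type, 'kmeans');
mu0    = reshape(mu0,    [O Q M]);    % O x Q x M: means per state and mixture
Sigma0 = reshape(Sigma0, [O O Q M]);  % O x O x Q x M: covariances
% A single call -- no per-state loop:
[LL, prior1, transmat1, mu1, Sigma1, mixmat1] = ...
    mhmm_em(MFCCs, prior0, transmat0, mu0, Sigma0, mixmat0);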

2) Neither. The last parameter in your code is the value of 'max_iter', i.e. the number of EM iterations. Usually it is something around 6; it should not be M.
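In other words (a sketch; n_iter is just an illustrative name, not part of the toolbox):

M = 10;       % Gaussians per state -- model size, not an EM setting
n_iter = 6;   % number of EM iterations, passed as 'max_iter'
[LL, prior1, transmat1, mu1, Sigma1, mixmat1] = ...
    mhmm_em(MFCCs, prior0, transmat0, mu0, Sigma0, mixmat0, 'max_iter', n_iter);
plot(LL);     % the log-likelihood should rise and flatten out once EM has converged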

3) Yes, you need Viterbi in test mode.
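Note that mhmm_logprob, as in your test loop, scores a file by summing over all state paths (the forward algorithm); Viterbi gives you the single best state sequence, i.e. the alignment of frames to states. A sketch using the toolbox's mixgauss_prob and viterbi_path:

% Per-frame observation likelihoods B(q,t) under the trained state GMMs:
B = mixgauss_prob(MFCCs, mu1, Sigma1, mixmat1);
% Most likely state sequence through the left-to-right HMM:
path = viterbi_path(prior1, transmat1, B);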

answered Oct 05 '22 by Nikolay Shmyrev