
Confused about X in GaussianHMM.fit([X])

With this code:

X = numpy.array(range(0,5))
model = GaussianHMM(n_components=3,covariance_type='full', n_iter=1000)
model.fit([X])

I get

tuple index out of range 
self.n_features = obs[0].shape[1]

So what exactly are you supposed to pass to .fit()? The hidden states AND emissions in a tuple? If so, in what order? The documentation is less than helpful.

I noticed it likes being passed tuples as this does not give an error:

X = numpy.column_stack([range(0,5),range(0,5)])
model = GaussianHMM(n_components=3,covariance_type='full', n_iter=1000)
model.fit([X])

Edit:

Let me clarify a bit: the documentation indicates that the shape of the input must be:

List of array-like observation sequences (shape (n_i, n_features)).

This almost suggests that you pass a tuple for each sample indicating, in a binary fashion, which observations are present. However, their example indicates otherwise:

# pack diff and volume for training
X = np.column_stack([diff, volume])

Hence the confusion.

Brooks asked Jun 11 '15 19:06


2 Answers

It would appear the error is about the shape of X rather than about a missing tuple of hidden states. .fit() takes only the emissions, as a list of 2-D arrays of shape (n_i, n_features); the hidden states are inferred. A 1-D array has no second axis, so obs[0].shape[1] fails with "tuple index out of range". When the documentation refers to 'n_features', it does not mean the number of distinct values an emission can take but the number of dimensions of each emission vector. A univariate series therefore still has to be passed as a single-column array (n_features = 1), for example X.reshape(-1, 1).

Hence, "features" (the dimensions of the emission vector) are not to be confused with "symbols", which in sklearn's parlance (likely shared with the greater HMM community, for all I know) refer to the distinct discrete values the system is capable of emitting.

For emissions drawn from a finite set of discrete symbols, use MultinomialHMM.
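The shape requirement behind the error can be reproduced with NumPy alone. This is a minimal sketch, assuming only that the library reads obs[0].shape[1] as shown in the traceback; the variable names are illustrative and no HMM is fitted here:

```python
import numpy as np

# A 1-D array, as in the question: its shape tuple has a single entry.
X_flat = np.array(range(0, 5))
print(X_flat.shape)  # (5,)

# Reading shape[1], as the traceback line does, fails on a 1-D array.
try:
    n_features = X_flat.shape[1]
except IndexError as exc:
    error_message = str(exc)
    print(error_message)  # tuple index out of range

# Reshaping to a single column gives shape (n_samples, n_features)
# with n_features == 1.
X_column = X_flat.reshape(-1, 1)
print(X_column.shape)  # (5, 1)

# column_stack, as in the documentation example, gives one column per feature.
X_two = np.column_stack([range(0, 5), range(0, 5)])
print(X_two.shape)  # (5, 2)
```

Whether five samples are enough to fit three states is a separate question; the sketch only shows why the 1-D array triggers the error while the 2-D arrays do not.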

Hope that clarifies things for anyone else who wants to use this stuff without becoming the world's foremost authority on HMMs :)

Brooks answered Sep 20 '22 17:09


I realize this is an old thread, but the problem in the example code is still there. I believe the example is now at this link and still gives the same error:

tuple index out of range 
self.n_features = obs[0].shape[1]

The offending line of code is:

model = GaussianHMM(n_components=5, covariance_type="diag", n_iter=1000).fit(X)

which should be:

model = GaussianHMM(n_components=5, covariance_type="diag", n_iter=1000).fit([X])
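For the list-based call above, the input can be prepared with a small helper. This is only a sketch: as_sequence_list is a hypothetical name, not part of any library, and it assumes the list-of-2-D-arrays convention quoted from the documentation earlier in this thread. Later hmmlearn releases take a single 2-D array plus an optional lengths argument instead of a list, so check which convention your installed version expects.

```python
import numpy as np

def as_sequence_list(x):
    """Hypothetical helper: wrap one series as a list holding a single 2-D sequence."""
    arr = np.asarray(x, dtype=float)
    if arr.ndim == 1:
        # A univariate series becomes one column: shape (n_samples, 1).
        arr = arr.reshape(-1, 1)
    if arr.ndim != 2:
        raise ValueError("expected a 1-D or 2-D array of observations")
    return [arr]

obs = as_sequence_list(range(0, 5))
print(len(obs), obs[0].shape)  # 1 (5, 1)

# With the list-based API discussed in this thread, the call would then be:
# model.fit(obs)
```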

dixon1e answered Sep 22 '22 17:09