The error is:
    Traceback (most recent call last):
      File "NearestCentroid.py", line 53, in <module>
        clf.fit(X_train.todense(),y_train)
      File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.13.1-py2.7-linux-i686.egg/sklearn/neighbors/nearest_centroid.py", line 115, in fit
        variance = np.array(np.power(X - self.centroids_[y], 2))
    IndexError: arrays used as indices must be of integer (or boolean) type
The code is:
    distancemetric = ['euclidean', 'l2']
    for mtrc in distancemetric:
        for shrkthrshld in [None]:
            # shrkthrshld = 0
            # while (shrkthrshld <= 1.0):
            clf = NearestCentroid(metric=mtrc, shrink_threshold=shrkthrshld)
            clf.fit(X_train.todense(), y_train)
            y_predicted = clf.predict(X_test.todense())
I am using the scikit-learn package. X_train and y_train are loaded from LIBSVM format: X_train holds the feature:value pairs and y_train holds the targets/labels. X_train is a CSR sparse matrix, and shrink_threshold does not support CSR sparse matrices, so I added .todense() to X_train. Then I got this error. Could anyone help me fix this? Thanks a lot!
I had a similar problem using Pystruct's pystruct.learners.OneSlackSSVM.
It occurred because my training labels were floats instead of integers. In my case, it was because I initialized the labels with np.ones without specifying dtype=np.int8. Hope it helps.
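To illustrate the float-label cause described above, here is a minimal sketch. The names `centroids`, `y_float`, and `y_int` are made up for the example, and it only assumes that the labels ended up with a float dtype (as happens with `np.ones` by default); whether that is the case in the question's `y_train` would need to be checked with `y_train.dtype`.

```python
import numpy as np

# Hypothetical stand-in for per-class centroids: 2 classes, 2 features.
centroids = np.array([[0.0, 0.0], [1.0, 1.0]])

# np.ones defaults to float64, so these labels are floats.
y_float = np.ones(4)

# Indexing with a float array is rejected by current NumPy versions.
try:
    centroids[y_float]
    float_index_ok = True
except IndexError:
    float_index_ok = False

# Casting the labels to an integer dtype makes them valid indices.
y_int = y_float.astype(np.int64)
rows = centroids[y_int]  # one centroid row per sample
```

If the same applies to the question, something like `clf.fit(X_train.todense(), y_train.astype(int))` might be worth trying, provided the labels are whole numbers.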
It happens quite often that an indexing array is clearly meant to be of integer type by the way it is created, but when an empty list is passed it falls back to the default float type, a case the programmer might not have considered. For example:
    >>> np.array(xrange(1))
    array([0])  # integer type as expected
    >>> np.array(xrange(0))
    array([], dtype=float64)  # does not generalize to the empty list
Therefore, one should always explicitly define the dtype in the array constructor.
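A small sketch of that advice, written so it also runs on Python 3 (an empty list is used in place of `xrange(0)`). The names `values`, `empty_default`, and `empty_int` are only illustrative.

```python
import numpy as np

values = np.arange(6) * 10

# Without a dtype, an empty array falls back to float64.
empty_default = np.array([])

# With an explicit integer dtype, the empty array is still a valid index.
empty_int = np.array([], dtype=np.intp)
selected = values[empty_int]  # empty selection, no error

# The non-empty case is unaffected by the explicit dtype.
some_int = np.array([0, 2], dtype=np.intp)
picked = values[some_int]
```

`np.intp` is used here because it is NumPy's native indexing integer type; any integer dtype should work as an index.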