I am attempting to run the first answer to the question "Python Relating k-means cluster to instance"; however, I am getting the following error:
Traceback (most recent call last):
File "test.py", line 16, in <module>
model = sklearn.cluster.k_means(a, clust_centers)
File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.14.1-py2.7-linux-i686.egg/sklearn/cluster/k_means_.py", line 267, in k_means
x_squared_norms=x_squared_norms, random_state=random_state)
File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.14.1-py2.7-linux-i686.egg/sklearn/cluster/k_means_.py", line 386, in _kmeans_single
centers = _k_means._centers_dense(X, labels, n_clusters, distances)
File "_k_means.pyx", line 280, in sklearn.cluster._k_means._centers_dense (sklearn/cluster/_k_means.c:4268)
ValueError: Buffer dtype mismatch, expected 'DOUBLE' but got 'float'
When I ran this program the first time, it worked. But subsequent runs fail with that error.
System specs:
Python 2.7.3 (default, Sep 26 2013, 20:08:41)
[GCC 4.6.3] on linux2
numpy.__version__
'1.8.0'
sklearn.__version__
'0.14.1'
Ubuntu 12.04
The SSE is defined as the sum of the squared Euclidean distances of each point to its closest centroid. Since this is a measure of error, the objective of k-means is to try to minimize this value.
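To make that definition concrete, here is a minimal pure-Python sketch of the SSE. The function name `sse` and the sample points and centroids are made up for illustration; this is not how scikit-learn computes the value internally.

```python
def sse(points, centroids):
    """Sum of squared Euclidean distances from each point to its closest centroid."""
    total = 0.0
    for p in points:
        # Squared distance from p to every centroid; keep only the smallest.
        total += min(
            sum((pi - ci) ** 2 for pi, ci in zip(p, c))
            for c in centroids
        )
    return total


# Hypothetical toy data: two points near the first centroid, one on the second.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)]
centroids = [(0.0, 1.0), (10.0, 0.0)]

print(sse(points, centroids))  # 1 + 1 + 0 = 2.0
```

Moving a centroid closer to the points assigned to it lowers this number, which is the sense in which k-means tries to minimize it.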
Step-1: Select the value of K, the number of clusters to be formed.
Step-2: Select K random points to act as the initial centroids.
Step-3: Assign each data point to its nearest centroid, based on its distance from those points. These assignments form the K clusters.
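The three steps above can be sketched in plain Python as follows. The helper names (`squared_distance`, `init_and_assign`), the fixed seed, and the sample data are assumptions for illustration only. A full k-means would go on to recompute each centroid as the mean of its cluster and repeat the assignment until nothing changes; that part is not shown here.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_and_assign(points, k, seed=0):
    """Pick k random points as centroids, then label each point by nearest centroid."""
    rng = random.Random(seed)           # seeded so the sketch is repeatable
    centroids = rng.sample(points, k)   # Step 2: k distinct random points
    labels = [
        # Step 3: index of the closest centroid for this point
        min(range(k), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]
    return centroids, labels


# Step 1: choose K (here 2) for some hypothetical 2-D data.
data = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 8.5)]
centroids, labels = init_and_assign(data, k=2)
print(centroids, labels)
```

Because the initial centroids are random, different seeds can produce different clusters, which is why library implementations usually run several initializations.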
I ran into this issue while trying to run k-means on my own data. Creating a new array with data type 'double' solved my issue.
array_double = np.array(a, dtype=np.double)
My data was previously stored as 'float32'.
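As a minimal sketch of that fix, the snippet below checks the dtype before and after the conversion. The array contents are made up; only the dtype handling matters. In NumPy, np.double is the same type as np.float64.

```python
import numpy as np

# Hypothetical data stored as 32-bit floats, as in my case.
a = np.array([[1.0, 2.0], [1.5, 1.8], [8.0, 8.0]], dtype=np.float32)

# Convert to 64-bit floats; this makes a new array and leaves `a` unchanged.
array_double = np.array(a, dtype=np.double)

print(a.dtype)             # float32
print(array_double.dtype)  # float64
```

If initial centers are passed in as an array, as with clust_centers in the question, it may be worth converting them the same way. That is an assumption on my part, not something the traceback confirms.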