Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python k-means algorithm

I am looking for Python implementation of k-means algorithm with examples to cluster and cache my database of coordinates.

like image 534
Eeyore Avatar asked Oct 09 '09 19:10

Eeyore


People also ask

How do you evaluate K-means clustering in Python?

We need to calculate SSE to evaluate K-Means clustering using Elbow Criterion. The idea of the Elbow Criterion method is to choose the k (no of cluster) at which the SSE decreases abruptly. The SSE is defined as the sum of the squared distance between each member of the cluster and its centroid.

What is k-means algorithm with example?

K-means clustering algorithm computes the centroids and iterates until we it finds optimal centroid. It assumes that the number of clusters are already known. It is also called flat clustering algorithm. The number of clusters identified from data by algorithm is represented by 'K' in K-means.

How many clusters in k-means python?

# k is range of number of clusters. The optimal number of clusters based on Silhouette Score is 4.


1 Answers

Update: (Eleven years after this original answer, it's probably time for an update.)

First off, are you sure you want k-means? This page gives an excellent graphical summary of some different clustering algorithms. I'd suggest that beyond the graphic, look especially at the parameters that each method requires and decide whether you can provide the required parameter (eg, k-means requires the number of clusters, but maybe you don't know that before you start clustering).

Here are some resources:

  • sklearn k-means and sklearn other clustering algorithms

  • scipy k-means and scipy k-means2

Old answer:

Scipy's clustering implementations work well, and they include a k-means implementation.

There's also scipy-cluster, which does agglomerative clustering; ths has the advantage that you don't need to decide on the number of clusters ahead of time.

like image 51
tom10 Avatar answered Oct 17 '22 12:10

tom10