
How to get inertia value for each k-means cluster using scikit-learn?

I'm using scikit-learn for clustering (k-means). When I run the code with the verbose option, it prints the inertia for each iteration.

Once the algorithm finishes, I would like to get the inertia for each formed cluster (k inertia values). How can I achieve that?

iamdeit asked Oct 28 '16 21:10



1 Answer

I managed to get that information by using the fit_transform method and then taking the distance between each sample and its cluster center.

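from sklearn import cluster

model = cluster.MiniBatchKMeans(n_clusters=n)
# fit_transform returns the distance from each sample to every cluster center
distances = model.fit_transform(trainSamples)
# inertia is a sum of squared distances, so keep one running total per cluster
inertia = [0.0] * n
for i, label in enumerate(model.labels_):
    inertia[label] += distances[i][label] ** 2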
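
The same idea can be written without a Python loop and checked against the total that scikit-learn reports. This is only a sketch: `per_cluster_inertia` and `demo` are hypothetical names (not part of scikit-learn), the data is randomly generated for illustration, and it assumes the fitted model exposes `labels_`, `cluster_centers_` and `inertia_`, as `KMeans` does.

```python
import numpy as np


def per_cluster_inertia(X, labels, centers):
    """Sum of squared distances to the assigned center, one total per cluster."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    centers = np.asarray(centers, dtype=float)
    # Squared Euclidean distance from each sample to its own center.
    sq_dist = ((X - centers[labels]) ** 2).sum(axis=1)
    # Accumulate the squared distances by cluster label.
    return np.bincount(labels, weights=sq_dist, minlength=len(centers))


def demo():
    # Imported here so the helper above depends only on NumPy.
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))  # made-up data, for illustration only
    model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
    per_cluster = per_cluster_inertia(X, model.labels_, model.cluster_centers_)
    # The k values should add up to the total that scikit-learn reports.
    print(per_cluster, per_cluster.sum(), model.inertia_)
```

Calling `demo()` prints the k per-cluster values, their sum, and `model.inertia_`; the last two should agree up to floating-point error.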
Raphael Adamski answered Sep 25 '22 14:09