 

How can I get the sum of squared errors (SSE) from the k-means algorithm?

Tags:

r

I have a data frame with two columns and 450 rows. First, I have to run the k-means algorithm with different values of k (the number of clusters). For each value of k, I have to calculate the SSE. I have only been given the mathematical equation: the SSE is calculated by squaring each point's distance to its cluster's centroid and summing over all points. In the end, I should have one SSE value for each k.
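For reference, the definition described above can be written as follows. The symbols are assumptions chosen for illustration, not taken from the original equation: C_j is the set of points assigned to cluster j, and \mu_j is that cluster's centroid.

```latex
\[
\mathrm{SSE}(k) \;=\; \sum_{j=1}^{k} \; \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^{2}
\]
```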

I have gotten as far as running the k-means algorithm:

Data.kmeans <- kmeans(data, centers = 3)

How can I get the SSE (sum of squared errors) from Data.kmeans?

bezpajumtnieks asked Oct 28 '25 03:10



2 Answers

If you are using scikit-learn (Python) rather than R, the fitted model exposes the SSE through its built-in attribute .inertia_.

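from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=3)  # 3 clusters, matching the example in the question
kmeans.fit(your_data)          # your_data: array-like of shape (n_samples, n_features)
kmeans.inertia_                # SSE: sum of squared distances of samples to their closest centroid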
Ahmad Javan answered Oct 29 '25 20:10



I think this is returned by kmeans. The documentation says:

Value

kmeans returns an object of class "kmeans" which has a print and a fitted method. It is a list with at least the following components:

(...)

totss
The total sum of squares.

withinss
Vector of within-cluster sum of squares, one component per cluster.

tot.withinss
Total within-cluster sum of squares, i.e. sum(withinss).

betweenss
The between-cluster sum of squares, i.e. totss-tot.withinss.

Hence, Data.kmeans$tot.withinss should give you the SSE you are looking for. Data.kmeans$withinss gives the per-cluster contributions, which sum to that total.
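As a minimal sketch of the whole task, the following collects the SSE for several values of k by reading tot.withinss from each fit, then cross-checks one fit against the definition (squared distance of every point to its assigned centroid, summed). The synthetic data, the range of k, and the nstart value are assumptions made for illustration; substitute your own data frame.

```r
# Synthetic stand-in for the asker's data: 450 rows, 2 numeric columns (assumption).
set.seed(42)
data <- data.frame(x = rnorm(450), y = rnorm(450))

k_values <- 1:10  # hypothetical range of k to try

# For each k, fit k-means and keep the total within-cluster sum of squares (the SSE).
# nstart = 10 runs several random starts and keeps the best fit.
sse <- sapply(k_values, function(k) {
  kmeans(data, centers = k, nstart = 10)$tot.withinss
})
names(sse) <- k_values

# Cross-check against the definition for a single fit with k = 3:
# look up each point's assigned centroid, then sum the squared differences.
fit <- kmeans(data, centers = 3, nstart = 10)
assigned_centroids <- fit$centers[fit$cluster, , drop = FALSE]
manual_sse <- sum((as.matrix(data) - assigned_centroids)^2)

# Compare the manual value with the one reported by kmeans().
all.equal(manual_sse, fit$tot.withinss)
```

The named vector sse then holds one SSE value per k, which can be passed to plot(k_values, sse, type = "b") to look for an elbow.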

Karsten W. answered Oct 29 '25 19:10
