Can anybody explain what the output of K-Means clustering in WEKA actually means?
For example:
kMeans
Number of iterations: 9
Within cluster sum of squared errors: 9434.911100488926
Missing values globally replaced with mean/mode
Cluster centroids:
                            Cluster#
Attribute        Full Data         0         1
                     (400)     (310)      (90)
=================================================
competency134       0.0425    0.0548         0
competency207       0.0425    0.0548         0
competency263         0.01    0.0129         0
competency264         0.01    0.0129         0
competency282         0.01    0.0129         0
competency289         0.01    0.0129         0
What do the numbers in the columns actually mean? It says "Cluster centroids" above the table, but how can I tell what the centroids of the two clusters are?
If anybody could explain what the numbers mean, I would be most grateful.
If anybody has any ideas on how to run a silhouette evaluation of the clusters found, that would also be great.
Thanks
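The "Full Data" column gives the centroid of the whole data set (all 400 instances). The columns headed 0 and 1 give the centroids of cluster 0 (310 instances) and cluster 1 (90 instances), respectively; the numbers in parentheses are the instance counts. Each row gives the centroid's coordinate along that attribute, i.e. the mean of that attribute over the instances in the cluster.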
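As a quick sanity check on that reading, the "Full Data" value should be the size-weighted average of the per-cluster values. The sketch below uses only the numbers printed in the output for competency134; the function and variable names are made up for illustration.

```python
# Cluster sizes and centroid coordinates for competency134,
# copied from the WEKA output in the question.
sizes = {0: 310, 1: 90}
centroid_value = {0: 0.0548, 1: 0.0}

def full_data_mean(sizes, centroid_value):
    """Size-weighted average of the per-cluster centroid coordinates."""
    total = sum(sizes.values())
    return sum(sizes[c] * centroid_value[c] for c in sizes) / total

overall = full_data_mean(sizes, centroid_value)
# Agrees with the 0.0425 in the "Full Data" column, up to the
# rounding WEKA applied to the printed values.
print(round(overall, 4))
```

The same check works for the other rows, e.g. (310 * 0.0129 + 90 * 0) / 400 is roughly 0.01.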
I believe you need to brush up on your K-means. Finding the centroids is an essential part of the algorithm. The centroids are the result of one specific run of the algorithm and are not unique: a run that starts from a different random initialization may produce a different set of centroids.
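To illustrate why the result depends on the run, here is a minimal sketch of the standard Lloyd iteration in plain Python. It is not WEKA's implementation; the toy points, the function names, and the choice to pass the starting centroids in explicitly (instead of drawing them from a random seed) are all assumptions made for the example.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]

def lloyd(points, init, max_iter=100):
    """Alternate assignment and mean-update steps until the centroids stop moving."""
    centroids = [list(c) for c in init]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids

def wcss(points, centroids):
    """Within-cluster sum of squared errors for the given centroids."""
    labels = assign(points, centroids)
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))

# Four corners of a 4 x 1 rectangle (toy data, not from the question).
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

run_a = lloyd(points, init=[(0.0, 0.0), (0.0, 1.0)])  # converges to a top/bottom split
run_b = lloyd(points, init=[(0.0, 0.0), (4.0, 0.0)])  # converges to a left/right split
print(run_a, wcss(points, run_a))
print(run_b, wcss(points, run_b))
```

Both runs have converged, yet they return different centroids and different within-cluster sums of squared errors, which is why it is common to try several seeds and keep the run with the lowest error.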
Please see Michael Abernethy's description of Weka clustering for more details.
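On the silhouette question: one option is to export the cluster assignments and compute the coefficient outside WEKA. The following is a minimal sketch of the usual definition, s(i) = (b - a) / max(a, b), where a is the mean distance from instance i to the other members of its cluster and b is the smallest mean distance to the members of any other cluster. It assumes Euclidean distance, at least two clusters, and small data (it is quadratic in the number of instances); the function name and the toy data are hypothetical.

```python
import math

def silhouette_scores(points, labels):
    """Per-instance silhouette coefficients using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Convention: an instance alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores

# Toy data: four corners of a 4 x 1 rectangle, clustered two ways.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
left_right = silhouette_scores(points, [0, 0, 1, 1])
top_bottom = silhouette_scores(points, [0, 1, 0, 1])

mean_left_right = sum(left_right) / len(left_right)
mean_top_bottom = sum(top_bottom) / len(top_bottom)
print(mean_left_right, mean_top_bottom)
```

Values near 1 indicate compact, well-separated clusters, while values near 0 or below suggest instances sitting between clusters or assigned to the wrong one. In the toy data the left/right split scores clearly higher than the top/bottom split.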