I am trying to make a silhouette plot for a k-means clustering, but the bars are almost invisible. How can I make this chart legible?
Example code:
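library(cluster)
X <- EuStockMarkets
set.seed(1)  # k-means starts from random centers; fix the seed so the result is reproducible
kmm <- kmeans(X, 8)
D <- daisy(X)  # pairwise dissimilarities between observations (Euclidean by default)
plot(silhouette(kmm$cluster, D), col=1:8)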
Example output:
The silhouette score is the average of the silhouette coefficient over all samples. For each sample, the coefficient is the mean nearest-cluster distance minus the mean intra-cluster distance, divided by the larger of the two.
The silhouette coefficient lies in the range [-1, 1]. A value of 1 is the best: the data point i sits close to the other members of its own cluster and far away from the other clusters. The worst value is -1. Values near 0 denote overlapping clusters.
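As a rough check of the definition above, the coefficient can be computed by hand and compared with `cluster::silhouette`. This is only a sketch: the six-point data set, the cluster labels, and the helper `sil_width` are made up for illustration and are not part of the question.

```r
library(cluster)

# Made-up data: two well-separated groups on a line (illustrative only)
x  <- c(1, 2, 3, 10, 11, 12)
cl <- c(1, 1, 1, 2, 2, 2)
d  <- as.matrix(dist(x))

# Silhouette width of observation i, written directly from the definition
# (hypothetical helper, not a function from the cluster package)
sil_width <- function(i) {
  same   <- setdiff(which(cl == cl[i]), i)   # own cluster, excluding i itself
  a      <- mean(d[i, same])                 # mean intra-cluster distance
  others <- setdiff(unique(cl), cl[i])
  # mean distance to each other cluster; keep the nearest one
  b <- min(sapply(others, function(k) mean(d[i, cl == k])))
  (b - a) / max(a, b)
}

manual <- sapply(seq_along(x), sil_width)
pkg    <- silhouette(cl, dist(x))[, "sil_width"]

# TRUE if the hand computation agrees with the package
all.equal(unname(manual), unname(pkg))
```

Because the two groups are far apart, every width here comes out close to 1; moving the groups closer together would push the values toward 0.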
In the elbow method, we pick a range of candidate values of k, then apply k-means clustering using each value. For each k, find the average distance of each point in a cluster to its centroid and plot it against k. Pick the value of k where the average distance drops sharply and then levels off.
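A minimal sketch of that procedure on the same data might look like the following. It assumes the total within-cluster sum of squares reported by `kmeans` (`tot.withinss`) as a stand-in for the average point-to-centroid distance; the range 1 to 10, the seed, and the `nstart` and `iter.max` values are arbitrary choices for illustration.

```r
X  <- EuStockMarkets
ks <- 1:10   # candidate numbers of clusters (arbitrary range)

set.seed(1)  # k-means is randomly initialised
# Total within-cluster sum of squares for each candidate k
wss <- sapply(ks, function(k) {
  kmeans(X, centers = k, nstart = 10, iter.max = 50)$tot.withinss
})

# Look for the k where the curve bends and starts to flatten
plot(ks, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

The curve always decreases overall as k grows, so the interesting feature is the bend, not the minimum.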
The silhouette value is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation). The silhouette ranges from −1 to +1, where a high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters.
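The plot draws one bar per observation, and with 1,860 observations the bar borders cover the fill colours. To fix this, suppress the borders by setting border to NA: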
plot(silhouette(kmm$cluster, D), col=1:8, border=NA)
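If the plot is still cramped on screen, one option is to draw it to a taller file device. The following is a sketch only: the output path, the page size, and the seed are assumptions chosen for illustration.

```r
library(cluster)

X <- EuStockMarkets
set.seed(1)  # arbitrary seed so the clustering is reproducible
kmm <- kmeans(X, 8)
sil <- silhouette(kmm$cluster, daisy(X))

# Hypothetical output location; a tall page leaves more room per bar
out <- file.path(tempdir(), "silhouette.pdf")
pdf(out, width = 7, height = 10)
plot(sil, col = 1:8, border = NA)  # border = NA suppresses the bar outlines
dev.off()

# Average silhouette width over all observations
summary(sil)$avg.width
```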