Cutting a dendrogram in R

I am trying to cut this dendrogram into 3 groups: (T24, T1, T17, etc.), (T12, T15, T6, etc.) and (T2, T8, T3, T9)

enter image description here

I have tried using cutree(hc, k=3, h=400), but it keeps producing the same groups. Any help is greatly appreciated. Here is my code:

#temps must have date/time as column headers, not row headers
load(temps)
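distMatrix <- dist(temps)
# hc is used below but never created in the post; assumed to come from hclust()
# with its default linkage (the method actually used is not stated)
hc <- hclust(distMatrix)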
#create label colors
labelColors = c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00", "#FFFF33")
# cut dendrogram in 3 clusters
clusMember = cutree(hc, k=3, h=400)
colLab <- function(n) {
  if (is.leaf(n)) {
    a <- attributes(n)
    labCol <- labelColors[clusMember[which(names(clusMember) == a$label)]]
    attr(n, "nodePar") <- c(a$nodePar, lab.col = labCol)
  }
  n
}
hcd = as.dendrogram(hc)
clusDendro = dendrapply(hcd, colLab)
plot(clusDendro, main = "Cluster Analysis")
derp4herps Avatar asked Oct 19 '22 17:10

1 Answer

Your example is not reproducible, since we don't have access to the data. What I can say is that you should look at the dendextend R package. It offers functions for cutting a dendrogram, as well as for coloring labels and branches. The Quick Introduction manual shows the basic use of functions such as labels_colors and color_branches for producing plots such as this:

enter image description here
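One detail about the call in the question: stats::cutree documents that k overrides h when both are given, so adding h=400 to k=3 changes nothing. Here is a minimal base R sketch of that behaviour. The toy vector x and the name hc_toy are made up for illustration and are not the temps data:

```r
# Toy data (hypothetical); the uneven gaps avoid tied merge heights.
x <- c(a = 1, b = 2.5, c = 3, d = 10, e = 11.5, f = 30)
hc_toy <- hclust(dist(x), method = "average")

by_k       <- cutree(hc_toy, k = 3)           # cut into 3 groups
by_k_and_h <- cutree(hc_toy, k = 3, h = 400)  # h is ignored because k is supplied
by_h       <- cutree(hc_toy, h = 400)         # h above every merge height: one group

print(identical(by_k, by_k_and_h))  # same membership with or without h
print(length(unique(by_h)))         # number of groups when cutting by h alone
```

So the only way to get different groups out of cutree is to change k, or to drop k and pick an h that lies below the tallest merge.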

In your case, since your branches seem to be at the same height, you are not likely to have direct control over where they are cut. What you could do is use branches_attr_by_clusters to control the colors of the sub-clusters you care about. Here is an example:

x <- c(1:3, 6:8)
dend <- as.dendrogram(hclust(dist(x), method = "ave"))
library(dendextend)
labels(dend) <- x[order.dendrogram(dend)]

# due to the ties, there is no specific reason for cutree to pick these 3 clusters:
cutree(dend, k = 3)[order.dendrogram(dend)]

par(mfrow = c(1,2))
dend1 <- color_branches(dend, k = 3)
dend1 <- color_labels(dend1, k = 3)
plot(dend1, main = "default cuts by cutree")
# let's force it to be another 3 clusters:
dend2 <- branches_attr_by_clusters(dend, c(1, 2,2, 3,3,3), c("gold", "darkgreen", "blue"))
# coloring the labels is actually the easiest part:
labels_colors(dend2) <- c("gold", "darkgreen", "blue")[c(1, 2,2, 3,3,3)]
plot(dend2, main = "Manual cuts")

enter image description here
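If you would rather stay in base R, the same idea of hand-picked groups can be sketched with dendrapply, much like the code in the question. The labels T1 to T6, the vector manual_groups and the helper color_leaf are hypothetical stand-ins, not taken from the original data:

```r
# Toy data with hypothetical labels standing in for the real series.
x <- c(T1 = 1, T2 = 2, T3 = 3, T4 = 6, T5 = 7, T6 = 8)
hc <- hclust(dist(x), method = "average")

# Hand-picked groups, keyed by leaf label so the leaf order in the tree does not matter.
manual_groups <- c(T1 = 1, T2 = 2, T3 = 2, T4 = 3, T5 = 3, T6 = 3)
group_colors  <- c("gold", "darkgreen", "blue")

# Set the label color of each leaf from its manually assigned group.
color_leaf <- function(node) {
  if (is.leaf(node)) {
    label <- attr(node, "label")
    attr(node, "nodePar") <- c(
      attr(node, "nodePar"),
      list(lab.col = group_colors[manual_groups[[label]]], pch = NA)  # pch = NA: no leaf points
    )
  }
  node
}

dend_manual <- dendrapply(as.dendrogram(hc), color_leaf)
plot(dend_manual, main = "Manual groups (base R)")

# The same groups reordered to follow the leaves from left to right, which is the
# order a per-leaf vector generally needs when handed to a dendrogram function.
groups_in_leaf_order <- manual_groups[labels(dend_manual)]
```

Keying the groups by label rather than by position means the assignment does not silently shift if the clustering reorders the leaves.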

Tal Galili Avatar answered Nov 16 '22 10:11
