
ggplot2 and ggdendro - plotting color bars under the node leaves

I'm currently using ggplot2 and ggdendro to plot dendrograms. Now I need to plot a discrete variable under the leaves, along with the labels.

For instance, in a publication (Zhang et al., 2006) I saw a dendrogram like this (notice the color bar under the leaf labels):

Example dendrogram

I'm interested in doing the same with ggdendro + ggplot2, using data which I have already binned. Is this possible?

Einar asked Nov 12 '13 10:11


1 Answer

First, you need to make a data frame for the color bar. As an example I used the USArrests data: I clustered it with hclust() and saved the result as hc. I then split the tree into clusters with cutree() and saved the result as the column cluster. The column states holds the labels of the clustering object hc as a factor whose levels follow hc$order, so they are in the same order as the leaves of the dendrogram.

library(ggdendro)
library(ggplot2)

# Cluster the states, then cut the tree into 6 groups
hc <- hclust(dist(USArrests), "ave")
df2 <- data.frame(cluster = cutree(hc, 6),
                  states = factor(hc$labels, levels = hc$labels[hc$order]))
head(df2)
#            cluster     states
# Alabama          1    Alabama
# Alaska           1     Alaska
# Arizona          1    Arizona
# Arkansas         2   Arkansas
# California       1 California
# Colorado         2   Colorado
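
Since the question mentions data that is already binned, here is a minimal sketch of the same data frame built from an existing discrete variable instead of cutree(). The names my_bins and df_bins, and the choice to bin the Murder column into three classes, are assumptions made only for illustration; they are not part of the original answer.

```r
# Sketch only: `my_bins` stands in for a variable you have already binned.
hc <- hclust(dist(USArrests), "ave")

# Hypothetical binning: murder rate split into three classes,
# named by state so values can be matched to leaves by name.
my_bins <- cut(USArrests$Murder, breaks = 3, labels = c("low", "mid", "high"))
names(my_bins) <- rownames(USArrests)

# Look the bins up by hc$labels so each row pairs a state with its own bin,
# and order the factor levels as the leaves appear in the dendrogram.
df_bins <- data.frame(
  bin    = unname(my_bins[hc$labels]),
  states = factor(hc$labels, levels = hc$labels[hc$order])
)
```

A data frame like df_bins could then take the place of df2, with fill=bin instead of fill=factor(cluster).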

Now save two plots as objects: the dendrogram, and a color bar made with geom_tile() that uses states as the x values and the cluster number as the fill color. The theme settings remove the axis titles, ticks, text and the legend from the color bar.

p1 <- ggdendrogram(hc, rotate = FALSE)

p2 <- ggplot(df2, aes(states, y = 1, fill = factor(cluster))) +
  geom_tile() +
  scale_y_continuous(expand = c(0, 0)) +
  theme(axis.title = element_blank(),
        axis.ticks = element_blank(),
        axis.text = element_blank(),
        legend.position = "none")
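
The color bar only lines up with the dendrogram if the factor levels of states follow the leaf order. A small sanity check, sketched with base R only, is shown here; it assumes that ggdendrogram() draws the leaves in the order given by as.dendrogram(hc), and the names leaf_order and df_check are introduced just for this example.

```r
# Sketch only: confirm the tile order matches the dendrogram leaf order.
hc <- hclust(dist(USArrests), "ave")
df_check <- data.frame(cluster = cutree(hc, 6),
                       states = factor(hc$labels, levels = hc$labels[hc$order]))

# Leaf labels, left to right, as a dendrogram of hc would draw them.
leaf_order <- labels(as.dendrogram(hc))

# The x positions of the tiles follow the factor levels,
# so these two vectors should agree element by element.
levels_match <- identical(levels(df_check$states), leaf_order)

# cutree() returns its result in the order of hc$labels,
# so each cluster number sits in the row of its own state.
rows_match <- identical(names(cutree(hc, 6)), hc$labels)
```

If either check is FALSE for your own data, the colors would end up under the wrong leaves.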

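Now you can use @Baptiste's answer to this question to align both plots: convert each plot to a grob and give both the same column widths, so that the tiles line up under the leaves.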

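library(gridExtra)

# Convert both plots to grobs
gp1 <- ggplotGrob(p1)
gp2 <- ggplotGrob(p2)

# Use the larger of the two widths for each layout column
maxWidth <- grid::unit.pmax(gp1$widths[2:5], gp2$widths[2:5])
gp1$widths[2:5] <- as.list(maxWidth)
gp2$widths[2:5] <- as.list(maxWidth)

# Stack the dendrogram above the color bar
grid.arrange(gp1, gp2, ncol = 1, heights = c(4/5, 1/5))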


Didzis Elferts answered Oct 22 '22 02:10