
R: How do I display clustered matrix heatmap (similar color patterns are grouped)

Tags: r, ggplot2, heatmap

I have read many questions about heatmaps on this site and tried several packages, but I still have a problem.
I have clustered data (kmeans/EM/DBSCAN, etc.), and I want to create a heatmap in which rows from the same cluster are grouped together. Similar color patterns should sit next to each other, so that the heatmap looks roughly block-diagonal.
I tried ordering the data by cluster number and displaying it:

k = kmeans(data, 3)
d = data.frame(data)
d = data.frame(d, k$cluster)
d = d[order(d$k.cluster),]
heatmap(as.matrix(d))
but the rows are still not sorted by cluster, and the result looks like this: (image)
I want it to be sorted by cluster number and look like this: (image)
Can I do this in R?
I searched lots of packages and tried many ways, but I still have a problem.
Thanks a lot.
asked Apr 16 '11 16:04


2 Answers

You can do this using reshape2 and ggplot2 as follows:

library(reshape2)
library(ggplot2)

# Create dummy data
set.seed(123)
df <- data.frame(
        a = sample(1:5, 1000, replace=TRUE),
        b = sample(1:5, 1000, replace=TRUE),
        c = sample(1:5, 1000, replace=TRUE)
)

# Perform clustering
k <- kmeans(df, 3)

# Append id and cluster
dfc <- cbind(df, id=seq(nrow(df)), cluster=k$cluster)

# Add idsort, the id number ordered by cluster 
dfc$idsort <- dfc$id[order(dfc$cluster)]
dfc$idsort <- order(dfc$idsort)

# use reshape2::melt to create data.frame in long format
dfm <- melt(dfc, id.vars=c("id", "idsort"))

ggplot(dfm, aes(x=variable, y=idsort)) + geom_tile(aes(fill=value))
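
The two-step idsort computation above can be hard to read at first. The following is a minimal sketch on a toy vector (the six cluster labels are made up for illustration, not taken from the answer) suggesting that idsort is simply each row's position once the rows are grouped by cluster:

```r
# Toy cluster labels for six rows (hypothetical values)
cluster <- c(2, 1, 3, 1, 2, 3)
id <- seq_along(cluster)

# Step 1: the ids listed in cluster order
ids_by_cluster <- id[order(cluster)]

# Step 2: the position of each original row within that listing
idsort <- order(ids_by_cluster)

# Equivalent to ranking the labels, with ties broken by original order
all(idsort == rank(cluster, ties.method = "first"))

# Reading the rows back in idsort order groups equal labels together
grouped <- cluster[order(idsort)]
grouped
```

Because idsort is used as the y aesthetic, rows from the same cluster end up in adjacent tiles.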

answered Sep 23 '22 07:09 by Andrie


You should set Rowv and Colv to NA if you don't want the dendrograms and the subsequent reordering. You should also turn off the scaling. Using the df from Andrie's answer:

heatmap(as.matrix(df)[order(k$cluster),],Rowv=NA,Colv=NA,scale="none",labRow=NA)
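
As a possible extension, here is a sketch that also marks the clusters with a side bar and checks that heatmap() left the rows alone. It recreates df and k from Andrie's answer so that it runs on its own; the grey palette and the names ord, m, cl, side and hv are arbitrary choices for this illustration.

```r
# Recreate df and k from Andrie's answer
set.seed(123)
df <- data.frame(
  a = sample(1:5, 1000, replace = TRUE),
  b = sample(1:5, 1000, replace = TRUE),
  c = sample(1:5, 1000, replace = TRUE)
)
k <- kmeans(df, 3)

ord <- order(k$cluster)      # row indices grouped by cluster
m   <- as.matrix(df)[ord, ]  # matrix with rows reordered
cl  <- k$cluster[ord]        # cluster label of each reordered row

# One colour per cluster for the side bar (assumed palette)
side <- c("grey20", "grey55", "grey85")[cl]

# heatmap() invisibly returns the row order it actually drew
hv <- heatmap(m, Rowv = NA, Colv = NA, scale = "none",
              labRow = NA, RowSideColors = side)

# With Rowv = NA the rows stay in the order they were supplied
all(hv$rowInd == seq_len(nrow(m)))
```

Leaving Rowv at its default would instead let heatmap() run its own hierarchical clustering and reorder the rows, which is why the plot in the question did not look sorted.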


In fact, heatmap() is built on image(). You can use image() directly to construct a plot exactly the way you want it. heatmap() uses layout() internally, so it is difficult to set the margins. With image() you could do, e.g.:

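# Draw x as a heatmap with its rows in the order given by ord
myHeatmap <- function(x,ord,xlab="",ylab="",main="My Heatmap",
                      col=heat.colors(5), ...){
    # Shrink the side margins; restore the previous settings on exit
    op <- par(mar=c(3,0,2,0)+0.1)
    on.exit(par(op))
    nc <- NCOL(x)
    nr <- NROW(x)
    labCol <- colnames(x)   # works for data frames and matrices

    # image() draws z[i, j] at (x[i], y[j]), so transpose to put
    # the variables on the horizontal axis
    x <- t(as.matrix(x)[ord,])
    image(1L:nc, 1L:nr, x, xlim = 0.5 + c(0, nc), ylim = 0.5 +
        c(0, nr), axes = FALSE, xlab=xlab, ylab=ylab, main=main,
        col=col,...)

    # Column labels below the plot; no row labels
    axis(1, 1L:nc, labels = labCol, las = 2, line = -0.5, tick = 0)
    axis(2, 1L:nr, labels = NA, las = 2, line = -0.5, tick = 0)
}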

library(RColorBrewer)
myHeatmap(df,order(k$cluster),col=brewer.pal(5,"BuGn"))

This produces a plot with smaller margins on the sides. You can also adjust the axes, colors, and so on. The RColorBrewer package is well worth a look.

(This custom function is based on the internal plotting used by heatmap btw, simplified for the illustration and to get rid of all the dendrogram stuff)
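
Since image() gives direct access to the plot region, the same idea could be taken one step further by drawing lines at the cluster boundaries. This is only a sketch in base R: df and k are recreated from Andrie's answer, heat.colors() stands in for an RColorBrewer palette, and the names m, sizes and bounds are made up for this example.

```r
# Recreate df and k from Andrie's answer
set.seed(123)
df <- data.frame(
  a = sample(1:5, 1000, replace = TRUE),
  b = sample(1:5, 1000, replace = TRUE),
  c = sample(1:5, 1000, replace = TRUE)
)
k <- kmeans(df, 3)

# Rows grouped by cluster, cluster 1 at the bottom of the plot
m  <- as.matrix(df)[order(k$cluster), ]
nc <- ncol(m)
nr <- nrow(m)

# Transpose so that variables run along the horizontal axis
image(1:nc, 1:nr, t(m), axes = FALSE, xlab = "", ylab = "",
      col = heat.colors(5))
axis(1, 1:nc, labels = colnames(m), tick = FALSE)

# A boundary sits half a cell above the last row of each cluster
# (the last cluster needs no line above it)
sizes  <- table(k$cluster)
bounds <- cumsum(sizes)[-length(sizes)] + 0.5
abline(h = bounds, lwd = 2)
```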

answered Sep 22 '22 07:09 by Joris Meys