Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ggplot2: More complex faceting

I have a heatmap that continues to become more and more complex. An example of the melted data:

head(df2)
  Class     Subclass         Family               variable value
1     A chemosensory family_1005117 caenorhabditis_elegans    10
2     A chemosensory family_1011230 caenorhabditis_elegans     4
3     A chemosensory family_1022539 caenorhabditis_elegans    10
4     A        other family_1025293 caenorhabditis_elegans    NA
5     A chemosensory family_1031345 caenorhabditis_elegans    10
6     A chemosensory family_1033309 caenorhabditis_elegans    10
tail(df2)
     Class Subclass        Family        variable value
6496     C  class c family_455391 trichuris_muris     1
6497     C  class c family_812893 trichuris_muris    NA
6498     F  class f family_225491 trichuris_muris     1
6499     F  class f family_236822 trichuris_muris     1
6500     F  class f family_276074 trichuris_muris     1
6501     F  class f family_768194 trichuris_muris    NA

Using ggplot2 and geom_tile, I was able to produce a beautiful heatmap of the data. I am proud of the code (this is my first experience in R), so have posted it below:

df2[df2 == 0] <- NA
df2[df2 > 11] <- 10
df2.t <- data.table(df2)
df2.t[, clade := ifelse(variable %in% c("pristionchus_pacificus", "caenorhabditis_elegans", "ancylostoma_ceylanicum", "necator_americanus", "nippostrongylus_brasiliensis", "angiostrongylus_costaricensis", "dictyocaulus_viviparus", "haemonchus_contortus"), "Clade V",
                 ifelse(variable %in% c("meloidogyne_hapla","panagrellus_redivivus", "rhabditophanes_kr3021", "strongyloides_ratti"), "Clade IV",
                 ifelse(variable %in% c("toxocara_canis", "dracunculus_medinensis", "loa_loa", "onchocerca_volvulus", "ascaris_suum", "brugia_malayi", "litomosoides_sigmodontis", "syphacia_muris", "thelazia_callipaeda"), "Clade III",
                 ifelse(variable %in% c("romanomermis_culicivorax", "trichinella_spiralis", "trichuris_muris"), "Clade I",
                 ifelse(variable %in% c("echinococcus_multilocularis", "hymenolepis_microstoma", "mesocestoides_corti", "taenia_solium", "schistocephalus_solidus"), "Cestoda",
                 ifelse(variable %in% c("clonorchis_sinensis", "fasciola_hepatica", "schistosoma_japonicum", "schistosoma_mansoni"), "Trematoda", NA))))))]
df2.t$clade <- factor(df2.t$clade, levels = c("Clade I", "Clade III", "Clade IV", "Clade V", "Cestoda", "Trematoda"))
plot2 <- ggplot(df2.t, aes(variable, Family))
tile2 <- plot2 + geom_tile(aes(fill = value)) + facet_grid(Class ~ clade, scales = "free", space = "free")
tile2 <- tile2 + scale_x_discrete(expand = c(0,0)) + scale_y_discrete(expand = c(0,0))
tile2 <- tile2 + theme(axis.text.y = element_blank(), axis.ticks.y = element_blank(), legend.position = "right", axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.55), axis.text.y = element_text(size = rel(0.35)), panel.border = element_rect(fill=NA,color="grey", size=0.5, linetype="solid"))
tile2 <- tile2 + xlab(NULL)
tile2 <- tile2 + scale_fill_gradientn(breaks = c(1,2,3,4,5,6,7,8,9,10),labels = c("1", "2", "3", "4", "5", "6", "7", "8", "9", ">10"), limits = c(1, 10), colours = palette(11), na.value = "white", name = "Members")'

As you can see, there is quite a bit of manual reordering involved, otherwise the code is pretty simple. Here's the image output:

heatmap

However, you may notice that a whole column of information, "Subclass" is not utilized. Basically, each Subclass fits within a Class. It would be perfect if I was able to facet these Subclasses within the Class facet already displayed. As far as I know, this is impossible. To be precise, only Class A has varying Subclasses. The other Classes simply have their class name mirrored (F = class f). Is there another way to organize this heatmap so that I can display all of the relevant information? The missing Subclasses contain some of the most crucial data and will be the most necessary for drawing inferences from the data.

An alternative approach would be to facet the Subclases instead of the Classes, manually reorder them so that the Classes are clustered together, and then draw some sort of box around them to demarcate each Class. I have no idea how this would be done.

Any help would be very useful. Please let me know if you need any additional information.

like image 427
Nic Avatar asked Mar 27 '15 23:03

Nic


2 Answers

This will put a new strip to the right of the orignal strip, and to the left of the legend.

library(ggplot2)
library(gtable)
library(grid)

p <- ggplot(mtcars, aes(mpg, wt, colour = factor(vs))) + geom_point()
p <- p + facet_grid(cyl ~ gear)

# Convert the plot to a grob
gt <- ggplotGrob(p)

# Get the positions of the right strips in the layout: t = top, l = left, ...
strip <-c(subset(gt$layout, grepl("strip-r", gt$layout$name), select = t:r))

#  New column to the right of current strip
gt <- gtable_add_cols(gt, gt$widths[max(strip$r)], max(strip$r))  

# Add grob, the new strip, into new column
gt <- gtable_add_grob(gt, 
  list(rectGrob(gp = gpar(col = NA, fill = "grey85", size = .5)),
  textGrob("Number of Cylinders", rot = -90, vjust = .27, 
        gp = gpar(cex = .75, fontface = "bold", col = "black"))), 
        t = min(strip$t), l = max(strip$r) + 1, b = max(strip$b), name = c("a", "b"))

# Add small gap between strips
gt <- gtable_add_cols(gt, unit(1/5, "line"), max(strip$r))

# Draw it
grid.newpage()
grid.draw(gt)

enter image description here

like image 133
Sandy Muspratt Avatar answered Nov 04 '22 21:11

Sandy Muspratt


Turning my comment into an answer with some simple demo data:

This isn't hard (there's even examples in ?facet_grid, though they're towards the bottom).

# generate some nested data
dat = data.frame(x = rnorm(12), y = rnorm(12), class = rep(LETTERS[1:2], each = 6),
                 subclass = rep(letters[1:6], each = 2))

# plot it
ggplot(dat, aes(x, y)) + geom_point() +
    facet_grid(subclass + class ~ .)

You can do this with arbitrarily many factors on either side of the ~!

like image 37
Gregor Thomas Avatar answered Nov 04 '22 22:11

Gregor Thomas