
How to add in-between space in nested boxplots in ggplot2

Tags: r, ggplot2, boxplot

I would like to add a marginal space between groups of box plots that are drawn with stat_summary.

Here is a small example of my problem

library(ggplot2)
library(reshape2)

# Five 5x3 matrices of t-distributed noise (df = 1), one per letter a-e
data1 <- lapply(letters[1:5], function(l1) {
  matrix(rt(5 * 3, 1), nrow = 5, ncol = 3,
         dimnames = list(cat2 = letters[6:10], cat3 = letters[11:13]))
})
names(data1) <- letters[1:5]
data2 <- melt(data1)  # long format with columns cat2, cat3, value, L1

# Custom summary: the box runs from the mean of the values below the
# overall mean to the mean of the values above it
customstats <- function(x) {
  xs <- sort(x)
  c(ymin = min(x), lower = mean(xs[xs < mean(x)]), middle = mean(x),
    upper = mean(xs[xs > mean(x)]), ymax = max(x))
}

ggplot(data2, aes(x = cat2, y = value, fill = cat3)) +
  stat_summary(fun.data = customstats, geom = "boxplot",
    alpha = 0.5, position = position_dodge(1))

The result is the following picture. [image: boxplots]

I would like to achieve a visual separation for each "cat2" group by adding a "space" between the groups of boxplots. (I'm restricted to using stat_summary since I have a custom statistic.) How can I do it?

Drey asked Sep 28 '22 14:09



1 Answer

I have fixed a similar problem in an ugly (but effective for me) way: create a data frame with the same plotting variables as the original data, but with x (or y) values named so that they fall between the two groups I want to separate, and with missing values for y (or x). For your problem, I added the following code and got an image with spatial separation of the clusters.

library(plyr)

# One spacer level per group, except the last: no extra space is needed
# between the last cluster and the edge of the plot
empties <- data.frame(cat2_orig = unique(data2$cat2)[-length(unique(data2$cat2))])
# Doubled names ("ff", "gg", ...) sort directly after the group they follow
empties$cat2 <- paste0(empties$cat2_orig, empties$cat2_orig)
empties$value <- NA

# rbind.fill fills columns that exist in only one data frame with NA
data2_space <- rbind.fill(data2, empties)

ggplot(data2_space, aes(x = cat2, y = value, fill = cat3)) +
  stat_summary(fun.data = customstats, geom = "boxplot",
    alpha = 0.5, position = position_dodge(1)) +
  # show tick marks only for the real groups, not for the spacers
  scale_x_discrete(breaks = unique(data2$cat2))
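
The doubled names act as spacers because of where they sort on the x-axis. Here is a minimal base-R sketch of just that ordering step. It assumes the combined cat2 column ends up as a character vector (which I'd expect when a factor column and a character column are bound together) and that the discrete scale then orders those values alphabetically. The names groups, spacers and axis_order are only for illustration and are not part of the code above.

```r
# Group names as used for cat2 in the question
groups <- letters[6:10]                        # "f" "g" "h" "i" "j"

# One spacer per group except the last, named by doubling the group name
spacers <- paste0(head(groups, -1), head(groups, -1))

# factor() sorts the unique values; this is the axis order assumed here
axis_order <- levels(factor(c(groups, spacers)))
print(axis_order)
```

This should print "f" "ff" "g" "gg" "h" "hh" "i" "ii" "j", so every spacer sits directly after the group it follows. If your real group names do not sort this way, an explicit factor with hand-ordered levels would be the safer choice.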

Before & after


Heroka answered Nov 14 '22 04:11
