Possible to combine position_jitter with position_dodge?

Tags: r, ggplot2
I've become quite fond of boxplots with jittered points overlaid on them to show the actual data, as below:

library(ggplot2)

set.seed(7)
l1 <- gl(3, 1, length=102, labels=letters[1:3])
l2 <- gl(2, 51, length=102, labels=LETTERS[1:2]) # Will use this later
y <- runif(102)
d <- data.frame(l1, l2, y)

ggplot(d, aes(x=l1, y=y)) + 
  geom_point(position=position_jitter(width=0.2), alpha=0.5) +
  geom_boxplot(fill=NA) 

(These are particularly helpful when there are very different numbers of data points in each box.)

I'd like to use this technique when I am also (implicitly) using position_dodge to separate boxplots by a second variable, e.g.

ggplot(d, aes(x=l1, y=y, colour=l2)) + 
  geom_point(position=position_jitter(width=0.2), alpha=0.5) +
  geom_boxplot(fill=NA)

However, I can't figure out how to dodge the points by the colour variable (here, l2) and also jitter them.

Drew Steen asked Sep 26 '13 02:09


2 Answers

Here is an approach that manually performs the jittering and dodging.

# a plot with no dodging or jittering of the points 
dp <- ggplot(d, aes(x=l1, y=y, colour=l2)) + 
  geom_point(alpha=0.5) +
  geom_boxplot(fill=NA)

# build the plot for rendering
foo <- ggplot_build(dp)
# now replace the 'x' values in the data for layer 1 (unjittered and un-dodged points)
# with the dodged box positions from layer 2 (one row per group), looked up by each
# point's group and then jittered
foo$data[[1]][['x']] <- jitter(foo$data[[2]][['x']][foo$data[[1]][['group']]], amount = 0.2)
# now draw the plot (grid.draw() needs the grid package loaded explicitly)
library(grid)
grid.draw(ggplot_gtable(foo))
# alternatively, plot() draws the same gtable without explicitly loading grid
plot(ggplot_gtable(foo))
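
The lookup used in the replacement step can be checked in isolation. The following is a minimal sketch, assuming a ggplot2 version that exports layer_data() (which returns the built data for a single layer). The names pts, box, centre and x_new are illustrative and not part of the code above, and match() on the group column is used here so the lookup does not rely on row order.

```r
library(ggplot2)

# same data as in the question
set.seed(7)
d <- data.frame(
  l1 = gl(3, 1, length = 102, labels = letters[1:3]),
  l2 = gl(2, 51, length = 102, labels = LETTERS[1:2]),
  y  = runif(102)
)

p <- ggplot(d, aes(x = l1, y = y, colour = l2)) +
  geom_point(alpha = 0.5) +
  geom_boxplot(fill = NA)

pts <- layer_data(p, 1)  # points: one row per observation, not dodged
box <- layer_data(p, 2)  # boxplots: one row per group, x already dodged

# dodged x position of the box each point belongs to
centre <- as.numeric(box$x[match(pts$group, box$group)])

# jitter around that position, as in the answer
x_new <- jitter(centre, amount = 0.2)
```

Each entry of centre is the dodged position of the box that the point belongs to, so every jittered value stays within 0.2 of its own box.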

mnel answered Oct 05 '22 06:10


I don't think you'll like it, but I've never found a way around this except to produce your own x values for the points. In this case:

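# numeric position of each l1 level (1, 2, 3)
d$l1.num <- as.numeric(d$l1)
# offset for each l2 level: -1/6 for A, +1/6 for B
d$l2.num <- (as.numeric(d$l2)/3)-(1/3 + 1/6)
# x position of each point: box position plus offset
d$x <- d$l1.num + d$l2.num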

ggplot(d, aes(l1, y, colour = l2)) + geom_boxplot(fill = NA) +
  geom_point(aes(x = x), position = position_jitter(width = 0.15), alpha = 0.5) + theme_bw()
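
The hard-coded offsets above come to -1/6 and +1/6 for the two levels of l2. As a rough sketch of how they could be computed for any number of groups, the hypothetical helper below (dodge_offsets is not a ggplot2 function) spreads n group centres evenly across a dodge width. The default of 0.75 is an assumption about the width geom_boxplot dodges over, so treat the result as an approximation to check against the plot.

```r
# Hypothetical helper, not part of ggplot2: centre offsets for n dodged groups
# spread evenly across dodge_width (0.75 is an assumed default)
dodge_offsets <- function(n, dodge_width = 0.75) {
  (seq_len(n) - (n + 1) / 2) * dodge_width / n
}

# offsets used in the answer above, for comparison
answer_offsets <- c(1, 2) / 3 - (1/3 + 1/6)

# offsets for two and three groups
two_groups   <- dodge_offsets(2)
three_groups <- dodge_offsets(3)
```

For two groups this gives offsets slightly wider than the ±1/6 used above, which is close enough once jitter is applied.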

It's certainly a long way from ideal, but becomes routine pretty quickly. If anyone has an alternative solution, I'd be very happy!
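
For readers on a newer ggplot2: releases after this question was asked added position_jitterdodge(), which dodges and jitters in a single step. The following is a minimal sketch, assuming an installed version that provides it and that accepts colour as the dodging aesthetic; the widths chosen are illustrative only.

```r
library(ggplot2)

# same data as in the question
set.seed(7)
d <- data.frame(
  l1 = gl(3, 1, length = 102, labels = letters[1:3]),
  l2 = gl(2, 51, length = 102, labels = LETTERS[1:2]),
  y  = runif(102)
)

p <- ggplot(d, aes(x = l1, y = y, colour = l2)) +
  # hide boxplot outliers so they are not drawn twice
  geom_boxplot(fill = NA, outlier.shape = NA) +
  # dodge points by the colour grouping, then jitter within each box
  geom_point(
    position = position_jitterdodge(jitter.width = 0.1, dodge.width = 0.75),
    alpha = 0.5
  )

print(p)
```

Setting dodge.width to the same width the boxplots are dodged over keeps the points centred on their boxes.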

alexwhan answered Oct 05 '22 07:10