
Can one offset jitter points in a ggplot boxplot?

Tags: r, ggplot2, boxplot

In a ggplot boxplot, it is easy to use jitter to add the raw data points with varying degrees of jitter. With zero jitter the following code

library(ggplot2)

dat <- data.frame(group = c('a', 'b', 'c'), values = runif(90))

ggplot(dat, aes(group, values)) +
  geom_boxplot(outlier.size = 0) +
  # width = 0 turns off horizontal jitter, so the points sit on each box's centre line
  geom_jitter(position = position_jitter(width = 0), aes(colour = group), alpha = 0.7) +
  # red plus sign marks each group's mean
  ylim(0, 1) + stat_summary(fun = mean, shape = 3, col = 'red', geom = 'point') +
  theme(legend.position = "right") + ylab("values") + xlab("group")

produces the plot below.

Is it possible to use zero jitter but add an offset such that the points are in a line but shifted left by 25% of the box width? I tried geom_point with dodge, but this generated a jitter.

asked Dec 14 '11 at 18:12 by user441706



1 Answer

If you convert group to numeric and then add an offset, you seem to get your desired output. There is probably a more efficient way, but give this a whirl:

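ggplot(dat, aes(group, values)) +
  geom_boxplot(outlier.size = 0) +
  # factor() first: as.numeric() on a plain character column would return NA
  # + .25 moves the points right of centre; a negative offset moves them left
  geom_point(aes(x = as.numeric(factor(group)) + .25, colour = group), alpha = 0.7) +
  ylim(0, 1) + stat_summary(fun = mean, shape = 3, col = 'red', geom = 'point') +
  theme(legend.position = "right") + ylab("values") + xlab("group")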
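A possible alternative, assuming a ggplot2 version that provides position_nudge() (2.0.0 or later): nudging shifts every point by a fixed amount without converting group to numeric. This is only a sketch. The seed, the use of rep() to build the groups, and the -0.25 offset are illustrative choices, and the offset is in x-axis units (one unit per group), not a fraction of the drawn box width.

```r
library(ggplot2)

set.seed(1)  # illustrative seed so the example data is reproducible
dat <- data.frame(group = rep(c("a", "b", "c"), times = 30),
                  values = runif(90))

p <- ggplot(dat, aes(group, values)) +
  # outlier.shape = NA hides the boxplot's own outlier points
  geom_boxplot(outlier.shape = NA) +
  # position_nudge() applies a constant shift; negative x moves points left
  geom_point(aes(colour = group), alpha = 0.7,
             position = position_nudge(x = -0.25)) +
  ylim(0, 1) + ylab("values") + xlab("group")

print(p)
```

Because the shift is constant, the points stay in a single vertical line per group. Adjust the offset to taste if the boxes are drawn with a non-default width.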

answered Oct 14 '22 at 00:10 by Chase