
How to spatially separate rug plots from different series

Tags: r, ggplot2

I'm trying to graphically evaluate the distributions (bimodal vs. unimodal) of datasets in which the number of data points per dataset can vary widely. I want to indicate the number of data points using something like rug plots, but without letting a series with many data points overwhelm a series with only a few.

Currently I'm working in ggplot2, combining geom_density and geom_rug like so:

# Set up data: 1000 bimodal "b" points; 20 unimodal "a" points
library(ggplot2)
set.seed(0)
x <- c(rnorm(500, mean=10, sd=1), rnorm(500, mean=5, sd=1), rnorm(20, mean=7, sd=1))
l <- c(rep("b", 1000), rep("a", 20))
d <- data.frame(x=x, l=l)

ggplot(d, aes(x=x, colour=l)) + geom_density() + geom_rug()


This almost does what I want - but the "a" points get overwhelmed by the "b" points.

I've hacked a solution using geom_point instead of geom_rug:

d$ypos <- NA
d$ypos[d$l=="b"] <- 0
d$ypos[d$l=="a"] <- 0.01

ggplot() + 
  geom_density(data=d, aes(x=x, colour=l)) +
  geom_point(data=d, aes(x=x, y=ypos, colour=l), alpha=0.5)


However, this is unsatisfying because the y positions must be adjusted manually. Is there a more automatic way to separate rug plots from different series, for instance using a position adjustment?

Asked May 06 '13 16:05 by Drew Steen


1 Answer

One way would be to use two geom_rug() calls, one for the "b" series and one for the "a" series. In one of the calls, set sides="t" so that its rug is drawn along the top of the plot instead of the bottom.

ggplot(d, aes(x=x, colour=l)) + geom_density() + 
  geom_rug(data=subset(d,l=="b"),aes(x=x)) +
  geom_rug(data=subset(d,l=="a"),aes(x=x),sides="t")
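
If there are more than two series, top and bottom rugs run out of room. A minimal sketch of one possible alternative is to draw the ticks yourself with geom_segment() and derive each series' vertical band from its factor level, so nothing is positioned by hand. The column names band, ymin and ymax and the rug_height value are illustrative assumptions, not part of ggplot2, and the band height would need tuning to the density scale of your data.

```r
library(ggplot2)

# Same simulated data as in the question
set.seed(0)
d <- data.frame(
  x = c(rnorm(500, mean = 10, sd = 1),
        rnorm(500, mean = 5, sd = 1),
        rnorm(20, mean = 7, sd = 1)),
  l = c(rep("b", 1000), rep("a", 20))
)

# Assumed band height in density units; adjust to taste
rug_height <- 0.01

# Give each series its own band below y = 0, indexed by factor level
# ("a" -> band 1, "b" -> band 2 with the default alphabetical levels)
d$band <- as.integer(factor(d$l))
d$ymax <- -(d$band - 1) * rug_height
d$ymin <- d$ymax - 0.9 * rug_height  # leave a small gap between bands

# One vertical tick per data point, stacked by series
p <- ggplot(d, aes(x = x, colour = l)) +
  geom_density() +
  geom_segment(aes(xend = x, y = ymin, yend = ymax), alpha = 0.5)
p
```

Because the band index comes from factor(d$l), adding a third or fourth series should just add another row of ticks below the existing ones.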


Answered Nov 03 '22 06:11 by Didzis Elferts