dodge or jitter only at overlaps in ggplot2

I would like to make a plot in ggplot2 with the values aligned to the center but dodged where they overlap, like this one (made in GraphPad Prism):

[Figure: plot made in GraphPad Prism]

What I can do in ggplot2 with jitter looks like this:

df<-data.frame(response=c(-0.3502294,0.4207441,0.1001638,-0.2401336,-0.2604142,0.4574286,
       0.755964,0.9241669,0.8212376,2.037581,0.6440635,0.2714898,1.433149,0.4627742,
       0.5639637,0.1610219,0.1516505,-1.322015,-0.2134711,0.8554756,0.400872,1.344739,
       0.3743637,0.6329151,0.1467015,0.6313575,0.3989693,0.1940468,-0.06594919,-0.1951204),
    group=c(rep("A",10),rep("B",10),rep("C",10)))

library(ggplot2)

set.seed(1234)
ggplot(df,aes(group,response,fill=group))+
  geom_point(size=5,pch=21,position=position_jitter(width=.2))+
  scale_y_continuous(limits=c(-2,3))

[Figure: plot made in R with ggplot2]

How can I keep the points aligned to the center and dodge only at the overlaps in ggplot2?

RBA asked Mar 15 '16 07:03

1 Answer

My solution for your problem is to:

  1. Divide the data into overlapping and non-overlapping points. Sort the responses within each group and compute the difference between each point and the previous one. If the difference is less than some threshold, the point overlaps its neighbour and is jittered when plotted. Otherwise the point is plotted as is, at the center.
  2. Plot the two subsets separately using geom_point.

The code below implements this logic.

library(data.table)
library(ggplot2)

threshold <- 0.1

df<-data.frame(response=c(-0.3502294,0.4207441,0.1001638,-0.2401336,-0.2604142,0.4574286,
       0.755964,0.9241669,0.8212376,2.037581,0.6440635,0.2714898,1.433149,0.4627742,
       0.5639637,0.1610219,0.1516505,-1.322015,-0.2134711,0.8554756,0.400872,1.344739,
       0.3743637,0.6329151,0.1467015,0.6313575,0.3989693,0.1940468,-0.06594919,-0.1951204),
    group=c(rep("A",10),rep("B",10),rep("C",10)))

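# Sort within each group so that neighbouring values sit on adjacent rows
df1 <- data.table(df[order(df$group, df$response),])
# Gap to the previous (next smaller) response in the same group; NA for the first point
df1[, diffFromLast:=response - shift(response, n=1, type="lag"), by=group]
# Points at least `threshold` away from the previous one stay at the center
nonJitterPoints <- df1[ is.na(diffFromLast) | (diffFromLast >= threshold),]
# Points closer than `threshold` to the previous one get jittered
jitterPoints <- df1[ which(diffFromLast < threshold),]

# Layer 1: non-overlapping points, plotted at the group center
g1 <- ggplot()+
       geom_point(data=nonJitterPoints,aes(group,response,fill=group), size=5,pch=21)+
       scale_y_continuous(limits=c(-2,3))
g1

# Layer 2: overlapping points, jittered horizontally only
# (height=0 keeps the response values unchanged)
g2 <- g1 + geom_point(data=jitterPoints,aes(group,response,fill=group), size=5,pch=21,
                      position=position_jitter(width=.2, height=0))

g2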
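
The approach above still moves the flagged points by a random amount. As a rough sketch of a deterministic variant (not part of the original answer), the same threshold idea can be used to compute symmetric offsets, so that isolated points stay exactly on the center line and close neighbours are spread evenly around it. The names dodge_offsets, step and x_offset are hypothetical and introduced here only for illustration; the plotting part assumes ggplot2 is installed.

```r
# Same data as in the question
df <- data.frame(response=c(-0.3502294,0.4207441,0.1001638,-0.2401336,-0.2604142,0.4574286,
       0.755964,0.9241669,0.8212376,2.037581,0.6440635,0.2714898,1.433149,0.4627742,
       0.5639637,0.1610219,0.1516505,-1.322015,-0.2134711,0.8554756,0.400872,1.344739,
       0.3743637,0.6329151,0.1467015,0.6313575,0.3989693,0.1940468,-0.06594919,-0.1951204),
    group=factor(c(rep("A",10),rep("B",10),rep("C",10))))

# Hypothetical helper: horizontal offsets for the values of one group.
# Sorted points closer than `threshold` to their predecessor join the same run;
# the n points of a run are spaced `step` apart and centered on 0.
dodge_offsets <- function(y, threshold = 0.1, step = 0.1) {
  if (length(y) == 0) return(numeric(0))
  ord <- order(y)
  ys <- y[ord]
  # A new run starts whenever the gap to the previous point is >= threshold
  run_id <- cumsum(c(TRUE, diff(ys) >= threshold))
  # Centered position within each run, e.g. n = 3 gives -1, 0, 1
  centred <- ave(as.numeric(seq_along(ys)), run_id,
                 FUN = function(i) seq_along(i) - (length(i) + 1) / 2)
  offsets <- numeric(length(y))
  offsets[ord] <- centred * step   # map back to the original row order
  offsets
}

# Compute the offsets separately for each group
df$x_offset <- ave(df$response, df$group, FUN = dodge_offsets)

# Plot on a numeric x axis (group index + offset) and relabel the ticks
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(as.numeric(group) + x_offset, response, fill = group)) +
    geom_point(size = 5, pch = 21) +
    scale_x_continuous(breaks = seq_along(levels(df$group)),
                       labels = levels(df$group), name = "group") +
    scale_y_continuous(limits = c(-2, 3))
  print(p)
}
```

Because runs are built by chaining neighbours, a long chain of closely spaced points ends up in one wide row; lowering threshold or step tightens the layout.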
Kumar Manglam answered Sep 20 '22 17:09