
selective jitter of geom_points

Tags:

r

ggplot2

jitter

I have a ggplot in which some points overlap exactly. Is there a way to offset the overlapping points so that one sits slightly above the other? In my case, at most two points overlap at any position.

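library(ggplot2)

x <- c(1, 1, 2, 3, 4, 4)
y <- c('a1', 'a1', 'a2', 'a3', 'a4', 'a4')
type <- c('A', 'B', 'C', 'A', 'B', 'C')

# data.frame() keeps x numeric; as.data.frame(cbind(...)) would coerce every column to character
data <- data.frame(x, y, type)

ggplot() + geom_point(data = data, aes(x=x,y=y, color = type, fill = type), size = 2, shape = 25)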


In the resulting plot, the type A point at x=1, y=a1 is hidden beneath the type B point. Ideally, I want type B to be shifted vertically by a small amount.

If I use jitter, everything gets displaced, including the points that don't overlap.

let_there_be_light asked Jan 27 '23 06:01


1 Answer

We can use duplicated() (or any similar function) to detect the overlapping points, then use logical indexing to apply jitter() only to those points.

I wrote it as a function:

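selective_jitter <- function(x, # x = x co-ordinate
                             y, # y = y co-ordinate
                             g  # g = group
                             ){
  x <- as.numeric(as.character(x)) # works for numeric, character or factor input
  y <- as.numeric(as.factor(y))    # labels such as 'a1' become integer codes 1, 2, ...
  a <- cbind(x, y)
  dup <- duplicated(a)             # TRUE for rows that repeat an earlier (x, y) pair
  # jitter both co-ordinates of the duplicated rows only
  a[dup, ] <- jitter(a[dup, ], amount = .15) # amount could be made a parameter

  # data.frame() keeps x and y numeric; cbind(a, g) would turn them into character
  data.frame(a, g = g)
}


data <- selective_jitter(data$x, data$y, data$type)

ggplot() + geom_point(data = data, aes(x=x,y=y, color = g, fill = g), size = 2, shape = 25)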


There are many ways to write this differently or to tweak it. For instance, a useful tweak would be to expose the amount argument of jitter() as an optional argument of the function.
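A minimal sketch of that tweak, assuming you only want a vertical shift (as in the question). The name selective_jitter_amt and the default amount = 0.15 are illustrative choices, not part of the answer above:

```r
# Hypothetical variant of selective_jitter() with a tunable `amount`.
# Only y is jittered, so overlapping points move vertically and keep their x.
selective_jitter_amt <- function(x, y, g, amount = 0.15) {
  x <- as.numeric(as.character(x))
  y <- as.numeric(as.factor(y))        # 'a1', 'a2', ... -> 1, 2, ...
  dup <- duplicated(cbind(x, y))       # rows repeating an earlier (x, y) pair
  y[dup] <- jitter(y[dup], amount = amount)
  data.frame(x = x, y = y, g = g)
}

set.seed(1)                            # jitter() is random; fix the seed to reproduce
x <- c(1, 1, 2, 3, 4, 4)
y <- c("a1", "a1", "a2", "a3", "a4", "a4")
type <- c("A", "B", "C", "A", "B", "C")

jittered <- selective_jitter_amt(x, y, type, amount = 0.3)
print(jittered)
```

The result can be passed to the same geom_point() call as before. Because y is now numeric, the original labels can be put back on the axis with the breaks and labels arguments of scale_y_continuous().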

Another potential improvement would be to use a caliper (a distance tolerance) to catch near-duplicates as well as exact duplicates, since duplicated() only finds exact matches.
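One way the caliper idea might look, as a rough sketch: near_duplicated() is a hypothetical helper (not a base R function), and the default tolerance of 0.05 is an arbitrary assumption. It flags a point when an earlier point lies within the caliper in both x and y:

```r
# Hypothetical helper: TRUE for each point that lies within `caliper`
# (in both x and y) of any earlier point; the first point is never flagged.
near_duplicated <- function(x, y, caliper = 0.05) {
  n <- length(x)
  out <- logical(n)
  for (i in seq_len(n)[-1]) {
    earlier <- seq_len(i - 1)
    out[i] <- any(abs(x[earlier] - x[i]) <= caliper &
                  abs(y[earlier] - y[i]) <= caliper)
  }
  out
}

x <- c(1, 1.02, 2, 3, 4, 4)
y <- c(1, 1.01, 2, 3, 4, 4)

exact <- duplicated(cbind(x, y))   # misses the near-overlap in row 2
near  <- near_duplicated(x, y)     # catches rows 2 and 6
print(data.frame(x, y, exact, near))
```

The logical vector it returns could stand in for duplicated(a) inside the function above. The pairwise loop is quadratic in the number of points, which should be fine for small plots but may be slow for large data.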

Final note - sometimes I prefer semi-transparent colors to jitter. This variation works well only if the number of series (type) is small, so that, for example, one series is yellow, another is blue, and their overlap appears green. There are existing solutions on Stack Overflow that demonstrate this if you're interested.
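A small sketch of the transparency approach using the alpha argument of geom_point(). The palette and alpha = 0.5 are illustrative assumptions; the overlap color is whatever the graphics device's alpha blending produces, so it may look muddier than a pure green:

```r
library(ggplot2)

data <- data.frame(
  x    = c(1, 1, 2, 3, 4, 4),
  y    = c("a1", "a1", "a2", "a3", "a4", "a4"),
  type = c("A", "B", "C", "A", "B", "C")
)

# Illustrative palette (an assumption, not from the original post)
pal <- c(A = "yellow", B = "blue", C = "red")

p <- ggplot(data, aes(x = x, y = y, colour = type, fill = type)) +
  geom_point(size = 4, shape = 25, alpha = 0.5) +  # alpha < 1 lets overlaps show through
  scale_colour_manual(values = pal) +
  scale_fill_manual(values = pal)

print(p)
```

No points are moved here, so the y axis keeps its original labels.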

Hack-R answered Feb 03 '23 07:02