library(tidyverse)
# all possible points
df <- expand.grid(
  y_factor = paste0('factor_', 1:5),
  x = 1:100
) %>% as_tibble()
# randomly missing and overlapping points
# every green point has a pink point underneath, and every blue point
# has a green point underneath it.
set.seed(1)
df_with_overlap <- df %>%
  sample_frac(0.5, replace = TRUE) %>%
  group_by(y_factor, x) %>%
  mutate(n = factor(1:n()))
p <- ggplot(data = df_with_overlap, aes(x = x, y = y_factor, col = n))
p + geom_point()
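To see how much overplotting the sampling step actually produces, the stack size at each location can be tabulated. This is a minimal sketch that rebuilds the same data under an assumed seed; the names `grid`, `pts`, and `stack_sizes` are illustrative and not part of the original code.

```r
library(dplyr)

set.seed(1)  # assumed seed, used only to make the check reproducible

# Same construction as above: sample grid cells with replacement,
# then number the duplicates within each (y_factor, x) cell.
grid <- expand.grid(y_factor = paste0("factor_", 1:5), x = 1:100)
pts <- grid %>%
  sample_frac(0.5, replace = TRUE) %>%
  group_by(y_factor, x) %>%
  mutate(n = row_number()) %>%
  ungroup()

# Number of points stacked at each occupied location
stack_sizes <- pts %>% count(y_factor, x, name = "stack_size")

# How many locations hold 1, 2, 3, ... overlapping points
table(stack_sizes$stack_size)
```

Any location with a stack size above 1 is a spot where the plain `geom_point()` call hides points behind each other.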
Dodging horizontally using position_dodge doesn't work because the data is too crowded on that axis, so some points still overlap and the visualization isn't clear.
p + geom_point(position = position_dodge(width = 1)) +
  ggtitle("position_dodge isn't what I'm looking for.\nx-axis too crowded and points still overlap")
position_jitter kind of works because I can limit the x jitter to 0 and control the degree of y jitter. But the randomness of the jitter makes it less appealing. I can kind of make out the 3 colours when they exist.
p + geom_point(position = position_jitter(width = 0, height = 0.05)) +
  ggtitle("Jitter kind of works.\nIt would work better if it wasn't random\nlike position_dodge, but vertical dodging")
Thanks to @aosmith for suggesting ggstance::position_dodgev(). It's exactly what I was looking for. I increased the oversampling so the effect is more obvious.
df <- expand.grid(
  y_factor = paste0('factor_', 1:5),
  x = 1:100
) %>% as_tibble()
set.seed(1)
df_with_overlap <- df %>%
  sample_frac(1.5, replace = TRUE) %>%
  group_by(y_factor, x) %>%
  mutate(n = factor(1:n()))
ggplot(data = df_with_overlap, aes(x = x, y = y_factor, col = n)) +
  geom_point(position = ggstance::position_dodgev(height = 0.3))
I would transform y_factor to numeric and use a continuous y-axis. The trick is to add a small offset ("noise") to the numeric y values, depending on the n group.
df_with_overlap <- df_with_overlap %>%
  # Transform y factors to numbers
  mutate(y_num = as.numeric(y_factor)) %>%
  # Add a vertical offset by n group
  # (any n above 3 matches no case, becomes NA, and is dropped from the plot)
  mutate(y_num = y_num + case_when(n == 1 ~ 0,
                                   n == 2 ~ -0.1,
                                   n == 3 ~ 0.1))
# Plot y numeric values
ggplot(df_with_overlap, aes(x, y_num, color = n)) +
  geom_point() +
  # On y-axis put original labels and no one will notice that it's actually a continuous scale
  scale_y_continuous(breaks = 1:5,
                     labels = levels(df_with_overlap$y_factor)) +
  labs(y = "y_factor")
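The hard-coded `case_when()` above only covers up to three overlapping points. As a sketch of one possible generalization (not the answer's original method), the offset can be computed from each point's rank within its stack, so any stack size is spread symmetrically around the category position. The helper `centered_offset()`, the `step` parameter, and the `toy` data are hypothetical names introduced here for illustration. Note that this layout differs slightly from the one above, which keeps the first point exactly on the category line.

```r
library(dplyr)

# Hypothetical helper: place point i of k stacked points symmetrically
# around zero, with neighbouring points `step` apart.
centered_offset <- function(i, k, step = 0.1) {
  (i - (k + 1) / 2) * step
}

# Small made-up data set: a stack of 4, a stack of 2, and a single point
toy <- tibble(
  y_factor = factor(c("a", "a", "a", "a", "b", "b", "c")),
  x = c(1, 1, 1, 1, 2, 2, 3)
)

toy_offset <- toy %>%
  # Numeric position of each category on the continuous axis
  mutate(y_base = as.numeric(y_factor)) %>%
  group_by(y_factor, x) %>%
  # i = rank within the stack, k = stack size
  mutate(i = row_number(), k = n(),
         y_num = y_base + centered_offset(i, k)) %>%
  ungroup()

toy_offset
```

The resulting `y_num` column can be passed to `ggplot()` with the same `scale_y_continuous()` relabelling shown above. `step` should stay small enough that the largest stack does not reach into the neighbouring category.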