How can I make geom_boxplot outliers overlay perfectly with jittered geom_points?
For example, I want the outliers from geom_boxplot to be displayed as "cross hairs" over their actual points from geom_point after jittering.
library(ggplot2)
p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
geom_boxplot(outlier.shape=10, outlier.size=8) +
geom_point(aes(factor(cyl), mpg, color=mpg), position="jitter", size=4)
p
Any assistance would be greatly appreciated.
I agree with Didzis that a solution that does exactly what you aim for is going to be fairly involved. To literally do what you suggest would require (I think) that you do both the jittering and the outlier calculation outside of ggplot. If you're flexible about how you highlight the outliers, this is a potentially shorter solution:
id_outliers <- function(x){
q <- quantile(x,c(0.25,0.75))
iqr <- abs(diff(q))
ifelse((x < q[1] - 1.5*iqr) | (x > q[2] + 1.5*iqr),'Outlier','NotOutlier')
}
library(plyr)  # provides ddply() and .()
mtcars <- ddply(mtcars,
.(cyl),
transform,
out = id_outliers(mpg))
p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
geom_boxplot(outlier.colour = NA) +
geom_point(aes(colour = mpg,shape = out),position = "jitter")
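For what it's worth, here is a rough sketch of the "do both outside of ggplot" route, in case someone wants the literal cross-hair overlay. The names df, x_jit and is_outlier, the jitter width of 0.2 and the seed are my own assumptions, not anything ggplot2 prescribes. The idea is to jitter x by hand once so that both point layers reuse exactly the same coordinates.

```r
library(ggplot2)

set.seed(42)  # assumption: any seed works, it only makes the jitter reproducible

# Same 1.5 * IQR rule as id_outliers() above, returning TRUE/FALSE
is_outlier <- function(x) {
  q <- unname(quantile(x, c(0.25, 0.75)))
  iqr <- diff(q)
  x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr
}

df <- mtcars
# ave() applies the function within each cyl group; it returns 0/1, hence as.logical()
df$out <- as.logical(ave(df$mpg, df$cyl, FUN = is_outlier))

# Jitter by hand: integer position of each cyl level plus uniform noise
df$x_jit <- as.numeric(factor(df$cyl)) + runif(nrow(df), -0.2, 0.2)

p <- ggplot(df, aes(factor(cyl), mpg)) +
  geom_boxplot(outlier.shape = NA) +                      # hide the built-in outliers
  geom_point(aes(x = x_jit, colour = mpg), size = 4) +    # all points, pre-jittered
  geom_point(data = df[df$out, ], aes(x = x_jit),
             shape = 10, size = 8)                        # cross hairs on the same x
p
```

Because the boxplot layer comes first, the x scale stays discrete and the numeric x_jit values are placed relative to the level positions 1, 2, 3.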
This solution is fairly long. The problem is that with position="jitter" you can't get the exact coordinates of the points, so a workaround is needed.
Take your original plot and build it with ggplot_build(). The first element of data contains the information about the boxplots. We are interested in the columns group and outliers, since they show which values ggplot treats as outliers. Save them as a separate object.
p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
geom_boxplot(outlier.shape=10, outlier.size=8) +
geom_point(aes(color=mpg), position="jitter", size=4)
gg<-ggplot_build(p)
gg$data[[1]]
  ymin lower middle upper ymax         outliers notchupper notchlower x PANEL group weight ymin_final
1 21.4 22.80   26.0 30.40 33.9                    29.62055   22.37945 1     1     1      1       21.4
2 17.8 18.65   19.7 21.00 21.4                    21.10338   18.29662 2     1     2      1       17.8
3 13.3 14.40   15.2 16.25 18.7 10.4, 10.4, 19.2   15.98120   14.41880 3     1     3      1       10.4
  ymax_final  xmin  xmax
1       33.9 0.625 1.375
2       21.4 1.625 2.375
3       19.2 2.625 3.375
xx<-gg$data[[1]][c("group","outliers")]
xx
  group         outliers
1     1
2     2
3     3 10.4, 10.4, 19.2
Now change the group values to 4, 6 and 8 so that they match the cyl values.
xx$group<-c(4,6,8)
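If you would rather not hard-code 4, 6 and 8, a possible sketch is to look the values up from the factor levels. This assumes (as is the case in this plot) that group simply counts the levels of factor(cyl) in order; cyl_levels is just a helper name I made up.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot()
xx <- ggplot_build(p)$data[[1]][c("group", "outliers")]

# Assumption: group i corresponds to the i-th level of factor(cyl)
cyl_levels <- levels(factor(mtcars$cyl))
xx$group <- as.numeric(cyl_levels[xx$group])
xx$group
```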
Now merge this new data frame with the original mtcars and save the result as a new data frame. Then apply a function that checks whether each particular mpg value is listed in outliers for that cyl level. The resulting values (TRUE and FALSE) are saved in the column out.
mtcars.new<-merge(mtcars,xx,by.x="cyl",by.y="group")
mtcars.new$out<-apply(mtcars.new,1,function(x) x$mpg %in% x$outliers)
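The apply() call relies on the data frame being turned into a list matrix, which is easy to trip over. As a hedged alternative, the same membership check can be sketched with match() and mapply() in base R. To keep the snippet self-contained, xx is rebuilt by hand here from the values printed above, and idx and out are names of my own choosing.

```r
# Outliers per cyl level, copied from the ggplot_build() output shown earlier
xx <- data.frame(group = c(4, 6, 8))
xx$outliers <- list(numeric(0), numeric(0), c(10.4, 10.4, 19.2))

# For every car, find the row of xx that belongs to its cyl level
idx <- match(mtcars$cyl, xx$group)

# Check each mpg value against the outlier vector of its own group
out <- mapply(function(value, outl) value %in% outl,
              mtcars$mpg, xx$outliers[idx])
table(out)
```

This skips the merge entirely; the out vector could then be attached to mtcars as a column and used exactly like the out column in the plot.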
Use the new data frame to plot the data. Remove the outliers from geom_boxplot(), and use the column out to determine the shape and size of the points. Adjust the appearance with scale_shape_manual() and scale_size_manual().
ggplot(mtcars.new, aes(factor(cyl), mpg)) +
geom_boxplot(outlier.shape = NA) +
geom_point(aes(color=mpg,shape=out,size=out), position="jitter")+
scale_shape_manual(values=c(16,10),guide="none")+
scale_size_manual(values=c(4,8),guide="none")
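One caveat: position="jitter" by default moves the points vertically as well as horizontally, so the plotted mpg values end up slightly off. A small sketch of how that could be avoided, using position_jitter() with height = 0 and a fixed seed. The width of 0.2, the seed and the names df and p2 are arbitrary choices of mine, and the out column is hard-coded from the outliers listed above just to keep the snippet self-contained.

```r
library(ggplot2)

df <- mtcars
# Outliers taken from the printed output above: cyl 8 with mpg 10.4, 10.4, 19.2
df$out <- df$cyl == 8 & df$mpg %in% c(10.4, 19.2)

p2 <- ggplot(df, aes(factor(cyl), mpg)) +
  geom_boxplot(outlier.shape = NA) +
  geom_point(aes(color = mpg, shape = out, size = out),
             # jitter only along x; the seed makes the layout repeatable
             position = position_jitter(width = 0.2, height = 0, seed = 1)) +
  scale_shape_manual(values = c(16, 10), guide = "none") +
  scale_size_manual(values = c(4, 8), guide = "none")
p2
```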