
Ignore outliers in ggplot2 boxplot + faceting + "free" options

How can I adjust my Y axis in order to ignore outliers, like in this post, but in a more challenging case where I have 4 boxplots and a "free faceting" layout?

p <- ggplot(molten.DF,aes(x=class,y=SOC,fill=class)) + geom_boxplot() + facet_grid(layer~.,scales="free",space="free")

As you can see in my figure, including the outliers in the Y-axis range makes the boxes harder to read. It doesn't matter if some outliers are still visible in the result, but I would really like to focus on the boxes!

[Figure: boxplot with free scales; all outliers are (unfortunately) visible]

fstevens asked May 16 '12 12:05



1 Answer

It obviously depends on what you consider an outlier. If you can calculate a cutoff, you can set your ylim to that value and leave the points beyond it out of the chart.

For example, if you take the upper and lower limits to be Q3 + 1.5 IQR and Q1 - 1.5 IQR, which is how boxplots usually determine their outlier limits, you would have:

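# x is the numeric vector plotted on the y axis, e.g. molten.DF$SOC
upper.fence <- quantile(x)[4] + 1.5*IQR(x)  # Q3 + 1.5 * IQR
lower.fence <- quantile(x)[2] - 1.5*IQR(x)  # Q1 - 1.5 * IQR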
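As a quick check of the formula above, here is a minimal worked example on a made-up vector (the values in x are purely illustrative, not the asker's data). It uses R's default quantile definition; note that geom_boxplot draws its whiskers to the most extreme data points inside the fences, so the whisker ends need not sit exactly on these values.

```r
# Made-up data: nine ordinary values and one obvious outlier
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

# First and third quartiles (default quantile type), without the "25%" names
q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)  # 3.25 and 7.75
iqr <- IQR(x)                                           # 7.75 - 3.25 = 4.5

lower.fence <- q[1] - 1.5 * iqr  # 3.25 - 6.75 = -3.5
upper.fence <- q[2] + 1.5 * iqr  # 7.75 + 6.75 = 14.5

# Values falling outside the fences
outliers <- x[x < lower.fence | x > upper.fence]  # 100
```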

Then you can use these limits as the y range of values:

my.ggplot + coord_cartesian(ylim=c(lower.fence, upper.fence))

The procedure you use to get your upper and lower limits can change, but the use of the limits is the same.
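For the faceted case in the question, one possible way to put this together is sketched below. It assumes ggplot2 is installed; demo.DF is hypothetical data standing in for molten.DF, and fences() is a helper made up for this illustration. Because coord_cartesian takes a single ylim that, as far as I know, applies to every panel, the sketch uses the widest range across all boxes, which means the panels no longer get independent y ranges.

```r
library(ggplot2)

# Hypothetical stand-in for molten.DF (not the asker's data)
set.seed(1)
demo.DF <- data.frame(
  class = rep(c("a", "b"), each = 50),
  layer = rep(c("top", "deep"), times = 50),
  SOC   = c(rnorm(50, mean = 10), rnorm(50, mean = 20))
)
# Inject two extreme values so the outlier problem shows up
demo.DF$SOC[c(1, 51)] <- c(80, 120)

# Illustrative helper: Q1 - 1.5*IQR and Q3 + 1.5*IQR for one vector
fences <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  c(lower = q[1] - 1.5 * iqr, upper = q[2] + 1.5 * iqr)
}

# One pair of fences per box (each class within each layer)
groups  <- split(demo.DF$SOC, list(demo.DF$class, demo.DF$layer), drop = TRUE)
per.box <- sapply(groups, fences)  # 2 x n matrix with rows "lower" and "upper"

# Widest range that still covers every box and its fences
ylim.range <- c(min(per.box["lower", ]), max(per.box["upper", ]))

p <- ggplot(demo.DF, aes(x = class, y = SOC, fill = class)) +
  geom_boxplot() +
  facet_grid(layer ~ ., scales = "free", space = "free")

# Zoom without dropping data, so the box statistics are unchanged
p.zoomed <- p + coord_cartesian(ylim = ylim.range)
```

Printing p.zoomed should show the boxes at a readable size, with the injected extreme values falling outside the visible range.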

João Daniel answered Oct 20 '22 03:10
