I have constructed some box-plots in R and have several outliers. I know that the default criteria to set outlier limits are:
However, I would like outliers classified as values that fall outside of the boundaries:
Is it possible to set this in R?
In a box plot, an outlier is a data point that lies beyond the whiskers. Under the default rule, that means any point more than 1.5 times the interquartile range (IQR) below the lower quartile or above the upper quartile, i.e. below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR.
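The fence rule above can be sketched in a few lines of R. This is only an illustration: the data vector and the helper name `fence_outliers` are made up for the example, and it uses `quantile()` for the quartiles, whereas `boxplot()` uses hinges from `fivenum()`, so the two can differ slightly on small samples.

```r
# Hypothetical sample with two large values
x <- c(2, 4, 5, 5, 6, 7, 8, 9, 16, 40)

# Hypothetical helper: return the values outside Q1 - k * IQR and Q3 + k * IQR
fence_outliers <- function(x, k = 1.5) {
  q <- unname(quantile(x, c(0.25, 0.75)))  # Q1 and Q3
  iqr <- q[2] - q[1]
  lower <- q[1] - k * iqr
  upper <- q[2] + k * iqr
  x[x < lower | x > upper]
}

fence_outliers(x)         # 16 40 (default multiplier of 1.5)
fence_outliers(x, k = 3)  # 40    (wider fences flag fewer points)
```

Changing `k` is all it takes to move the limits, which is the same idea as the `range` argument of `boxplot()`.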
Outliers can strongly affect the mean, while the median and other percentiles are much less sensitive to them. Because extreme points are highlighted in a box plot, you can easily identify the data points that need investigation. You may find that the outliers are errors in your data, or that they are unusual for some other reason.
In ggplot2, outliers can be hidden from a box plot by setting the outlier.shape argument of geom_boxplot() to NA. This only hides the points; it does not remove them from the data. ggplot2 also does not automatically rescale the y-axis afterwards, so coord_cartesian() can be used to zoom the y-axis to the range of interest.
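A minimal sketch of the ggplot2 approach, assuming the ggplot2 package is installed. The data frame `dat` is simulated for the example, and the y-limits are taken from the whisker ends that `boxplot.stats()` reports, which is only an approximation of what `geom_boxplot()` draws because the two compute quartiles slightly differently.

```r
library(ggplot2)

# Simulated data: three groups, the last one heavy-tailed
set.seed(1)
dat <- data.frame(
  group = rep(c("a", "b", "c"), each = 50),
  value = c(rnorm(50), rnorm(50, mean = 1), rt(50, df = 2))
)

# Whisker ends per group (rows 1 and 5 of $stats), pooled into one y-range
stats <- sapply(split(dat$value, dat$group), function(v) boxplot.stats(v)$stats)
lims <- range(stats[c(1, 5), ])

p <- ggplot(dat, aes(x = group, y = value)) +
  geom_boxplot(outlier.shape = NA) +  # hide the outlier points
  coord_cartesian(ylim = lims)        # zoom without dropping any data
print(p)
```

`coord_cartesian()` only zooms the view. Setting limits with `ylim()` or `scale_y_continuous(limits = ...)` would instead drop the out-of-range values before the box statistics are computed. To change how outliers are classified rather than hide them, `geom_boxplot()` also accepts a `coef` argument (default 1.5) that plays the same role as `range` in base `boxplot()`.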
If there are numerous outliers to one side or the other of the box, or the median line does not evenly divide the box, then the population distribution from which the data were sampled may be skewed.
From ?boxplot
range: this determines how far the plot whiskers extend out from the box. If ‘range’ is positive, the whiskers extend to the most extreme data point which is no more than ‘range’ times the interquartile range from the box. A value of zero causes the whiskers to extend to the data extremes.
So set range=3
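As a quick check of what this does, here is a small sketch with made-up data, assuming the desired limits are 3 times the IQR from the box (which is what range=3 gives). With plot = FALSE, boxplot() returns the computed statistics instead of drawing, and the out component lists the points classified as outliers.

```r
# Hypothetical sample with two large values
x <- c(2, 4, 5, 5, 6, 7, 8, 9, 16, 40)

b_default <- boxplot(x, plot = FALSE)             # default range = 1.5
b_wide    <- boxplot(x, range = 3, plot = FALSE)  # whiskers up to 3 x IQR

b_default$out  # 16 40
b_wide$out     # 40

# Draw the plot with the wider limits
boxplot(x, range = 3, main = "Whiskers at 3 x IQR")
```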