I have two categorical factors ('Habitat' and 'Locality') and one continuous variable (T). 'Habitat' has two levels and 'Locality' has eight levels. I want to change the default whiskers so that they represent the SE, and replace the median with the mean in each boxplot. Is there a way to do this while taking both categorical factors into account when plotting? Many thanks in advance.
This is what I have done so far with the default settings of geom_boxplot in ggplot2, which show the median together with the first and third quartiles.
ggplot(data, aes(x = Locality, y = T)) +
  geom_boxplot(aes(fill = interaction(Habitat, Locality),
                   group = interaction(factor(Habitat), Locality)),
               outlier.shape = 1, outlier.size = 3) +
  theme_bw() +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    axis.line = element_line(colour = 'black'),
    legend.position = 'none',
    axis.text.x = element_text(angle = 90, hjust = 1, size = 12)) +
  scale_y_continuous('T') +
  xlab('Locality')
Adding error bars (whiskers) with stat_boxplot: the default box plot in ggplot2 does not draw horizontal end caps on its whiskers, but you can add them with stat_boxplot() by setting geom = "errorbar". The width argument controls how wide the caps are.
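A minimal sketch of that approach, using the built-in mtcars data purely for illustration (the object name p_caps is arbitrary). The caps are drawn first so that the box sits on top of them:

```r
library(ggplot2)

# stat_boxplot() computes the usual boxplot statistics; drawing them with
# geom = "errorbar" adds horizontal caps at the ends of the whiskers.
p_caps <- ggplot(mtcars, aes(factor(am), mpg)) +
  stat_boxplot(geom = "errorbar", width = 0.25) +  # caps, drawn behind the box
  geom_boxplot()

p_caps
```

Note that these caps still mark the standard whisker ends, not the SE.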
If the full data set is not available but the mean and standard deviation are, a boxplot-style figure can still be drawn by deriving each of the box and whisker limits from those two values, with the mean as the measure of central tendency.
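One way to sketch this is to pass precomputed limits to geom_boxplot() with stat = "identity". The summary table below is made up for illustration, and the choice of mean ± 1 SD for the box and mean ± 2 SD for the whiskers is an assumption, not a standard definition:

```r
library(ggplot2)

# Hypothetical summary table: these values are invented for illustration
summ <- data.frame(group = c("A", "B"), mean = c(10, 14), sd = c(2, 3))

# stat = "identity" tells geom_boxplot() to use the supplied limits as-is
# instead of computing quartiles from raw data.
p_summ <- ggplot(summ, aes(x = group)) +
  geom_boxplot(
    aes(ymin   = mean - 2 * sd,  # lower whisker (assumed: mean - 2 SD)
        lower  = mean - sd,      # bottom of box (assumed: mean - 1 SD)
        middle = mean,           # centre line is the mean, not the median
        upper  = mean + sd,      # top of box (assumed: mean + 1 SD)
        ymax   = mean + 2 * sd), # upper whisker (assumed: mean + 2 SD)
    stat = "identity"
  )

p_summ
```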
To show mean values on a boxplot in ggplot2, use the stat_summary() function, which computes new summary statistics and adds them to the plot as an extra layer.
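For example, the following sketch overlays the group mean as a red diamond on a standard boxplot. It uses mtcars for illustration and assumes ggplot2 3.3.0 or later, where the argument is called fun (older versions used fun.y):

```r
library(ggplot2)

p_mean <- ggplot(mtcars, aes(factor(am), mpg)) +
  geom_boxplot() +
  # One point per group, placed at the group mean
  stat_summary(fun = mean, geom = "point",
               shape = 18, size = 4, colour = "red")

p_mean
```

The box itself still shows the median and quartiles; the mean is only an added marker.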
In base R, error bars can be added to a plot with the arrows() function by setting angle = 90 so that the arrow heads become flat caps. You can add vertical and horizontal error bars to any plot type. Simply provide the x and y coordinates and whatever you are using as your error (e.g. standard deviation or standard error).
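A short base-R sketch of this, plotting group means of mpg from mtcars with ± 1 SE bars (the variable names are arbitrary):

```r
# Group means and standard errors of mpg by transmission type
means <- tapply(mtcars$mpg, mtcars$am, mean)
ses   <- tapply(mtcars$mpg, mtcars$am, function(x) sd(x) / sqrt(length(x)))

x_pos <- seq_along(means)
plot(x_pos, means, pch = 19, xaxt = "n",
     xlim = c(0.5, length(means) + 0.5),
     ylim = range(means - ses, means + ses),
     xlab = "am", ylab = "mpg (mean +/- SE)")
axis(1, at = x_pos, labels = names(means))

# code = 3 draws a head at both ends; angle = 90 flattens the heads into caps
arrows(x_pos, means - ses, x_pos, means + ses,
       angle = 90, code = 3, length = 0.08)
```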
First write a function that computes the min, mean - 1 SEM, mean, mean + 1 SEM, and max. Then map these five values onto a boxplot using stat_summary.
library(gridExtra)
library(ggplot2)
# Returns the five values the boxplot geom expects:
# min, mean - SEM, mean, mean + SEM, max
MinMeanSEMMax <- function(x) {
  sem <- sd(x) / sqrt(length(x))  # standard error of the mean
  v <- c(min(x), mean(x) - sem, mean(x), mean(x) + sem, max(x))
  names(v) <- c("ymin", "lower", "middle", "upper", "ymax")
  v
}
# Default boxplot for comparison
g1 <- ggplot(mtcars, aes(factor(am), mpg)) + geom_boxplot() +
  ggtitle("Regular Boxplot")
# Same data, with the box and whiskers redefined by MinMeanSEMMax
g2 <- ggplot(mtcars, aes(factor(am), mpg)) +
  stat_summary(fun.data = MinMeanSEMMax, geom = "boxplot", colour = "red") +
  ggtitle("Boxplot: Min, Mean-1SEM, Mean, Mean+1SEM, Max")
# Show both plots side by side
grid.arrange(g1, g2, ncol = 2)
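To take both factors from the question into account, the same idea can be combined with a fill aesthetic and position_dodge(). The sketch below is an assumption about how the asker's data is laid out: the data frame sim is simulated, the column is named Temp rather than T (T is also R's shorthand for TRUE), and min_mean_sem_max is a hypothetical data-frame-returning variant of the summary function:

```r
library(ggplot2)

# Summary function returning one row with the five boxplot limits
min_mean_sem_max <- function(x) {
  m   <- mean(x)
  sem <- sd(x) / sqrt(length(x))
  data.frame(ymin = min(x), lower = m - sem, middle = m,
             upper = m + sem, ymax = max(x))
}

# Simulated stand-in data: 2 habitats x 8 localities x 10 replicates
set.seed(1)
sim <- expand.grid(Habitat  = c("H1", "H2"),
                   Locality = paste0("Loc", 1:8),
                   rep      = 1:10)
sim$Temp <- rnorm(nrow(sim), mean = 20, sd = 2)

# fill = Habitat splits each locality into two groups;
# position_dodge() places the two boxes side by side.
p_two <- ggplot(sim, aes(x = Locality, y = Temp, fill = Habitat)) +
  stat_summary(fun.data = min_mean_sem_max, geom = "boxplot",
               position = position_dodge(width = 0.8)) +
  theme_bw()

p_two
```

Each of the 16 Habitat-by-Locality combinations gets its own box, with the centre line at the mean and the box edges at mean ± 1 SEM.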
I expect that it is possible, but it is also possible to put up a traffic sign that is a red octagon and says "Increased speed limit ahead". I expect that both would be more confusing than helpful. The boxplot has a standard definition of what its parts represent. When a user sees a boxplot, they should not have to go through extra mental gymnastics to rethink what the different parts mean. Why not use a different representation if you don't want to show these standard summaries? The geom_crossbar or geom_errorbar geoms may be more appropriate for your display (and are probably easier to use than trying to modify the boxplot geom).
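As a sketch of that alternative, ggplot2's mean_se() helper returns the mean and mean ± 1 SE, which stat_summary() can draw as a crossbar or as error bars. The example uses mtcars for illustration, and the object names are arbitrary:

```r
library(ggplot2)

# Crossbar: centre line at the mean, box edges at mean +/- 1 SE
p_crossbar <- ggplot(mtcars, aes(factor(am), mpg)) +
  stat_summary(fun.data = mean_se, geom = "crossbar", width = 0.5)

# Error bars at mean +/- 1 SE, with a point marking the mean
p_errorbar <- ggplot(mtcars, aes(factor(am), mpg)) +
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.2) +
  stat_summary(fun = mean, geom = "point", size = 3)

p_crossbar
p_errorbar
```

Neither plot looks like a boxplot, so readers are less likely to mistake the limits for quartiles.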