
Standard error bars using stat_summary





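The following code produces bar plots with standard error bars using Hmisc, ddply (from plyr) and ggplot:

library(plyr)     # ddply
library(Hmisc)    # smean.sdl
library(ggplot2)

# smean.sdl returns mean +/- mult * SD, so mult = 1/sqrt(n) gives mean +/- 1 SE
means_se <- ddply(mtcars, .(cyl),
                  function(df) smean.sdl(df$qsec, mult = sqrt(length(df$qsec))^-1))
colnames(means_se) <- c("cyl", "mean", "lower", "upper")

ggplot(means_se, aes(cyl, mean, ymax = upper, ymin = lower, group = 1)) +
  geom_bar(stat = "identity") +
  geom_errorbar()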

However, implementing the above with a helper function such as mean_sdl seems much cleaner. For example, the following code produces a plot with error bars at the mean plus or minus two standard deviations (mean_sdl's default multiplier is 2):

ggplot(mtcars, aes(cyl, qsec)) +
  stat_summary(fun.y = mean, geom = "bar") +
  stat_summary(fun.data = mean_sdl, geom = "errorbar")

My question is how to use the stat_summary implementation for standard error bars. The problem is that calculating the SE requires the number of observations per condition, and that count would have to be available when setting mean_sdl's multiplier.

How do I access this information within ggplot? Is there a neat non-hacky solution for this?

aleph4 asked Oct 08 '13 21:10



1 Answer

Well, I can't tell you how to get a multiplier by group into stat_summary.

However, it looks like your goal is to plot means and error bars that represent one standard error from the mean in ggplot without summarizing the dataset before plotting.

There is a mean_se function in ggplot2 that we can use instead of mean_cl_normal (ggplot2's wrapper around smean.cl.normal from Hmisc). The mean_se function has a default multiplier of 1, so we don't need to pass any extra arguments if we want standard error bars.

ggplot(mtcars, aes(cyl, qsec)) +
  stat_summary(fun.y = mean, geom = "bar") +
  stat_summary(fun.data = mean_se, geom = "errorbar")
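As a quick sanity check, here is a minimal sketch that compares mean_se against a by-hand calculation of mean plus or minus sd/sqrt(n) for each cyl group. The object names manual and via_mean_se are only illustrative, and the sketch assumes a ggplot2 version in which mean_se returns a one-row data frame with columns y, ymin and ymax.

```r
library(ggplot2)

groups <- split(mtcars$qsec, mtcars$cyl)

# Standard error of the mean computed by hand for each cyl group
manual <- do.call(rbind, lapply(groups, function(x) {
  se <- sd(x) / sqrt(length(x))
  data.frame(y = mean(x), ymin = mean(x) - se, ymax = mean(x) + se)
}))

# The same summary via ggplot2's mean_se(), applied to each group
via_mean_se <- do.call(rbind, lapply(groups, mean_se))

# Compare the numeric values only, ignoring row and column names
all.equal(unname(as.matrix(manual)), unname(as.matrix(via_mean_se)))
```

If the two agree, the error bars drawn by stat_summary(fun.data = mean_se, ...) are one standard error on each side of the group mean.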

If you want to use the mean_cl_normal function (which relies on Hmisc), you have to change the multiplier to 1 so you get one standard error from the mean. The mult argument belongs to mean_cl_normal, and any arguments you want to pass to the summary function must be given as a list to the fun.args argument:

ggplot(mtcars, aes(cyl, qsec)) +
  stat_summary(fun.y = mean, geom = "bar") +
  stat_summary(fun.data = mean_cl_normal, geom = "errorbar", fun.args = list(mult = 1))

In pre-2.0 versions of ggplot2, the argument could be passed directly:

ggplot(mtcars, aes(cyl, qsec)) +
  stat_summary(fun.y = mean, geom = "bar") +
  stat_summary(fun.data = mean_cl_normal, geom = "errorbar", mult = 1)

aosmith answered Sep 20 '22 10:09
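On the original question of getting at the group size: stat_summary calls fun.data once per group with that group's y values, so a custom summary function can use length(x) directly. Below is a rough sketch under that assumption. mean_se_custom is a hypothetical helper name, not something provided by ggplot2 or Hmisc, and the sketch uses the fun argument, which replaces fun.y in ggplot2 3.3.0 and later.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2 or Hmisc). It receives the y values
# of one group at a time, so length(x) is the per-group sample size.
mean_se_custom <- function(x, mult = 1) {
  x <- x[!is.na(x)]
  se <- mult * sd(x) / sqrt(length(x))
  data.frame(y = mean(x), ymin = mean(x) - se, ymax = mean(x) + se)
}

# Bars at the group means, error bars at mean +/- 1 SE
p <- ggplot(mtcars, aes(cyl, qsec)) +
  stat_summary(fun = mean, geom = "bar") +
  stat_summary(fun.data = mean_se_custom, geom = "errorbar")
```

Printing p should give the same kind of plot as the mean_se version; passing fun.args = list(mult = 2) to the second stat_summary would widen the bars to two standard errors.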
