
Creating barplot with standard errors plotted in R

I am trying to find the best way to create barplots in R with standard errors displayed. I have seen other articles, but I cannot work out how to adapt their code to my own data: I have not used ggplot before, which seems to be the most common approach, and barplot does not cooperate with data frames. I need this for two cases, for which I have created two example data frames:

Plot df1 so that the x-axis shows sites A-C and the y-axis shows the mean value of V1, with the standard errors highlighted, similar to this example with a grey colour. Here, plant biomass should be the mean V1 value and treatments should be each of my sites.

Plot df2 in the same way, but with before and after located next to each other, similar to this, so that pre-test and post-test correspond to before and after in my example.

x <- factor(LETTERS[1:3])
site <- rep(x, each = 8)
values <- as.data.frame(matrix(sample(0:10, 3*8, replace=TRUE), ncol=1))
df1 <- cbind(site,values)
z <- factor(c("Before","After"))
when <- rep(z, each = 4)
df2 <- data.frame(when,df1)

Apologies for the simplicity of this question to more experienced R users, particularly those who use ggplot, but I cannot apply the snippets of code I have found elsewhere to my data. I cannot even put together enough code to make a start on a graph, so I hope my descriptions are sufficient. Thank you in advance.

James White asked Feb 10 '23 02:02



1 Answer

Something like this?

library(ggplot2)

# Return mean -/+ 1 standard error, named as geom="errorbar" expects
get.se <- function(y) {
  se <- sd(y)/sqrt(length(y))
  mu <- mean(y)
  c(ymin=mu-se, ymax=mu+se)
}

# df1: one bar per site
# (fun= replaces fun.y=, which is deprecated since ggplot2 3.3.0)
ggplot(df1, aes(x=site, y=V1)) +
  stat_summary(fun=mean, geom="bar", fill="lightgreen", color="grey70") +
  stat_summary(fun.data=get.se, geom="errorbar", width=0.1)

# df2: Before/After bars side by side within each site
ggplot(df2, aes(x=site, y=V1, fill=when)) +
  stat_summary(fun=mean, geom="bar", position="dodge", color="grey70") +
  stat_summary(fun.data=get.se, geom="errorbar", width=0.1, position=position_dodge(width=0.9))

So this takes advantage of the stat_summary(...) function in ggplot, first to summarize y for each x using mean(...) (for the bars), and then to summarize y for each x using the get.se(...) function (for the error bars). Another option would be to summarize your data prior to using ggplot, and then use geom_bar(...) and geom_errorbar(...).
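A minimal sketch of that second option, in case it helps. The names std_err, summ, mean_V1 and se_V1 are made up for illustration, and the set.seed(1) call and the rebuilt df1 are assumptions added only so the snippet runs on its own; with your real data you would skip that part.

```r
library(ggplot2)

# Example data shaped like df1 from the question (seed added for reproducibility)
set.seed(1)
df1 <- data.frame(site = rep(factor(LETTERS[1:3]), each = 8),
                  V1   = sample(0:10, 24, replace = TRUE))

# Hypothetical helper: standard error of the mean
std_err <- function(y) sd(y) / sqrt(length(y))

# One row per site, holding the mean and standard error of V1
summ <- data.frame(
  site    = levels(df1$site),
  mean_V1 = as.vector(tapply(df1$V1, df1$site, mean)),
  se_V1   = as.vector(tapply(df1$V1, df1$site, std_err))
)

# stat="identity" tells geom_bar to draw the supplied values rather than counts
p <- ggplot(summ, aes(x = site, y = mean_V1)) +
  geom_bar(stat = "identity", fill = "grey70") +
  geom_errorbar(aes(ymin = mean_V1 - se_V1, ymax = mean_V1 + se_V1), width = 0.1)
print(p)
```

The advantage of this route is that summ is an ordinary data frame, so you can inspect the numbers being plotted before drawing anything.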

Also, plotting +/- 1 se is not a great practice (although it's used often enough). You'd be better served plotting legitimate confidence limits, which you could do, for instance, using ggplot2's mean_cl_normal function instead of the contrived get.se(...). Note that mean_cl_normal needs the Hmisc package to be installed. It returns the 95% confidence limits based on the assumption that the data is normally distributed (or you can set the CL to something else; read the documentation).
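A rough sketch of the confidence-limit version for df1, assuming Hmisc is installed. As before, the set.seed(1) call and the rebuilt df1 are assumptions added only to make the snippet self-contained, and p_ci is just an illustrative name.

```r
library(ggplot2)  # mean_cl_normal additionally requires the Hmisc package

# Example data shaped like df1 from the question (seed added for reproducibility)
set.seed(1)
df1 <- data.frame(site = rep(factor(LETTERS[1:3]), each = 8),
                  V1   = sample(0:10, 24, replace = TRUE))

# Bars show the per-site mean; error bars show the 95% confidence limits
p_ci <- ggplot(df1, aes(x = site, y = V1)) +
  stat_summary(fun = mean, geom = "bar", fill = "grey70") +
  stat_summary(fun.data = mean_cl_normal, geom = "errorbar", width = 0.1)
print(p_ci)

# For a different confidence level (e.g. 99%), pass it through fun.args:
#   stat_summary(fun.data = mean_cl_normal, fun.args = list(conf.int = 0.99),
#                geom = "errorbar", width = 0.1)
```

For the df2 case, the same swap of get.se for mean_cl_normal should work, keeping position=position_dodge(width=0.9) on the error bars.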

jlhoward answered Feb 11 '23 15:02
