I have a three-column data frame containing Trial, an independent Variable, and an Observation. Something like:
df1 <- data.frame(Trial=rep(1:10,5), Variable=rep(1:5, each=10), Observation=rnorm(50))
I am trying to plot a 95% confidence interval around the mean at each value of Variable, using a rather inefficient method as follows:
library(ggplot2)

b <- NULL
# Per-Variable mean and standard deviation (one value per group, 5 in total)
b$mean <- aggregate(Observation ~ Variable, data = df1, mean)[, 2]
b$sd <- aggregate(Observation ~ Variable, data = df1, sd)[, 2]
b$Variable <- df1$Variable
b$Observation <- df1$Observation
# Repeat each group's limits 10 times so they line up with the 50 observations
b$ucl <- rep(qnorm(.975, mean = b$mean, sd = b$sd), each = 10)
b$lcl <- rep(qnorm(.025, mean = b$mean, sd = b$sd), each = 10)
b <- as.data.frame(b)
c <- ggplot(b, aes(Variable, Observation))
c + geom_point(color = "red") +
  geom_smooth(aes(ymin = lcl, ymax = ucl), data = b, stat = "summary", fun.y = "mean")
This is inefficient because it duplicates the ymin and ymax values for every observation. I've seen the geom_ribbon methods, but I would still need to duplicate the values. However, if I were using any kind of smoothing, such as glm, the code would be much simpler, with no duplication. Is there a better way of doing this?
References: 1. R Plotting confidence bands with ggplot 2. Shading confidence intervals manually with ggplot2 3. http://docs.ggplot2.org/current/geom_smooth.html
With this method, I get the same output as with your method. It was inspired by the ggplot2 docs. Note that it is only meaningful as long as each x value has multiple points.
library(ggplot2)

set.seed(1)
df1 <- data.frame(Trial=rep(1:10,5), Variable=rep(1:5, each=10), Observation=rnorm(50))
# Summary function: the mean, plus a band two standard deviations either side
my_ci <- function(x) data.frame(
  y=mean(x),
  ymin=mean(x) - 2 * sd(x),
  ymax=mean(x) + 2 * sd(x)
)
# stat_summary applies my_ci once per x value, so nothing has to be duplicated
ggplot(df1, aes(Variable, Observation)) + geom_point(color="red") +
  stat_summary(fun.data="my_ci", geom="smooth")
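If you would rather draw the band explicitly, here is a rough sketch of the same idea with geom_ribbon. It still needs no duplicated ymin/ymax columns, because stat_summary computes one row per x value. The helper name norm_band and its level argument are made up for illustration. It takes the multiplier from qnorm (about 1.96) instead of the rounded 2. Like my_ci, the band is the mean plus or minus a multiple of the standard deviation of the observations, which matches the qnorm limits in the question.

```r
library(ggplot2)

set.seed(1)
df1 <- data.frame(
  Trial = rep(1:10, 5),
  Variable = rep(1:5, each = 10),
  Observation = rnorm(50)
)

# Hypothetical helper (not part of ggplot2): mean and a normal-quantile band.
# `level` is an assumed argument name; 0.95 gives a multiplier of about 1.96.
norm_band <- function(x, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - z * s, ymax = m + z * s)
}

p <- ggplot(df1, aes(Variable, Observation)) +
  geom_point(color = "red") +
  # Shaded band: uses ymin and ymax from norm_band, one row per Variable
  stat_summary(fun.data = norm_band, geom = "ribbon", alpha = 0.3) +
  # Mean line: uses y from the same summary
  stat_summary(fun.data = norm_band, geom = "line")

print(p)
```

Calling the summary twice is a little redundant, but it keeps the ribbon and the line as separate layers that can be styled independently.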