I have a file containing time-series data for several variables, a to k.
I would like to plot the average of variables a to k over time, with a smoothed band above and below the average line showing the maximum and minimum values on each day.
So something like confidence intervals, but in a smoothed version.
Here's the dataset: https://dl.dropbox.com/u/22681355/co.csv
and here's the code I have so far:
library(ggplot2)
library(reshape2)
# df is assumed to already hold the dataset linked above, with a Year column
meltdf <- melt(df, id = "Year")
ggplot(meltdf, aes(x = Year, y = value, colour = variable, group = variable)) + geom_line()
This depicts bootstrapped 95% confidence intervals (mean_cl_boot requires the Hmisc package to be installed):
ggplot(meltdf, aes(x = Year, y = value, colour = variable, group = variable)) +
  stat_summary(fun.data = "mean_cl_boot", geom = "smooth")
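This depicts the mean of all values of all variables ± 1 SD:
ggplot(meltdf, aes(x = Year, y = value)) +
  # in current ggplot2 versions, arguments for the summary function go in fun.args
  stat_summary(fun.data = "mean_sdl", fun.args = list(mult = 1), geom = "smooth")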
You might want to calculate the year means before calculating the means and SD over the variables, but I leave that to you.
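A minimal sketch of that two-step calculation, in case it helps: first one mean per variable and year, then the mean and SD across variables within each year. The data here are simulated stand-ins (not the real co.csv values), and the names `yearly` and `summ` as well as the `rep` column are only illustrative. It uses geom_ribbon plus geom_line rather than geom = "smooth", purely to keep the band explicit.

```r
library(ggplot2)

# Simulated stand-in for meltdf (hypothetical values): 11 variables a..k,
# 5 years, 3 observations per variable and year.
set.seed(1)
meltdf <- expand.grid(Year = 2000:2004, rep = 1:3, variable = letters[1:11])
meltdf$value <- rexp(nrow(meltdf))

# Step 1: one mean per variable and year
yearly <- aggregate(value ~ Year + variable, data = meltdf, FUN = mean)

# Step 2: mean and SD across the variables within each year
avg <- aggregate(value ~ Year, data = yearly, FUN = mean)
sdv <- aggregate(value ~ Year, data = yearly, FUN = sd)
summ <- data.frame(Year = avg$Year, mean = avg$value, sd = sdv$value)

# Mean line with a +/- 1 SD band
p <- ggplot(summ, aes(x = Year, y = mean)) +
  geom_ribbon(aes(ymin = mean - sd, ymax = mean + sd), alpha = 0.3) +
  geom_line()
print(p)
```

Swapping `sd` for `min` and `max` in step 2 would give a minimum/maximum band instead of mean ± SD.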
However, I believe a bootstrap confidence interval would be more sensible, since the distribution is clearly not symmetric. It would also be narrower. ;)
And of course you could log-transform your values.
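One way the log transformation could look, as a rough sketch on simulated data (hypothetical values, assuming all values are strictly positive). It uses ggplot2's built-in mean_se instead of mean_cl_boot only so that the example runs without Hmisc; any fun.data summary could be substituted. Note that scale_y_log10() transforms the data before the summary is computed, so the line is the mean of the log10 values, i.e. a geometric mean on the original scale.

```r
library(ggplot2)

# Simulated stand-in for meltdf (hypothetical, strictly positive values)
set.seed(1)
meltdf <- expand.grid(Year = 2000:2004, variable = letters[1:11])
meltdf$value <- rlnorm(nrow(meltdf))

# The summaries are computed on log10(value) because the scale
# transformation is applied before the stat.
p_log <- ggplot(meltdf, aes(x = Year, y = value)) +
  stat_summary(fun.data = mean_se, geom = "ribbon", alpha = 0.3) +
  stat_summary(fun.data = mean_se, geom = "line") +
  scale_y_log10()
print(p_log)
```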