I need to plot a time series with ggplot2. For each point of the time series I also have some quantiles, say 0.05, 0.25, 0.5, 0.75 and 0.95, i.e. I have five values for each point. For example:
time quantile=0.05 quantile=0.25 quantile=0.5 quantile=0.75 quantile=0.95
00:01 623.0725 630.4353 903.8870 959.1407 1327.721
00:02 623.0944 631.3707 911.9967 1337.4564 1518.539
00:03 623.0725 630.4353 903.8870 1170.8316 1431.893
00:04 623.0725 630.4353 903.8870 1336.3212 1431.893
00:05 623.0835 631.3557 905.4220 1079.6623 1452.260
00:06 623.0835 631.3557 905.4220 1079.6623 1452.260
00:07 623.0835 631.3557 905.4220 1079.6623 1452.260
00:08 623.0780 631.3483 905.3496 1056.3719 1375.610
00:09 623.0671 630.4275 903.8839 1170.8196 1356.963
00:10 623.0507 630.0261 741.8475 1006.1208 1462.271
Ideally, I would like to have the 0.5 quantile as a black line and the others as shaded color intervals surrounding the black line. What's the best way to do this? I've been looking around with no luck: I can't find examples of this, let alone with ggplot2.
Any help would be appreciated.
Salud!
Does this do what you want? The trick to ggplot is understanding that it expects data in long format. This often means that we have to transform the data before it is ready to plot, usually with melt().
After reading your data in with textConnection() and creating an object named dat, here are the steps you'd take:
library(reshape2)  # provides melt()
library(ggplot2)

# Melt into long format
dat.m <- melt(dat, id.vars = "time")
# Not necessary, but if you want different line types depending on quantile, here's how I'd do it.
# lty is wrapped in factor() because ggplot2 cannot map a continuous variable to linetype.
dat.m <- within(dat.m,
  lty <- factor(ifelse(variable == "quantile.0.5", 1,
    ifelse(variable %in% c("quantile.0.25", "quantile.0.75"), 2, 3)))
)
# Plot it
ggplot(dat.m, aes(time, value, group = variable, colour = variable, linetype = lty)) +
  geom_line() +
  scale_colour_manual(name = "", values = c("red", "blue", "black", "blue", "red"))
After reading your question again, maybe you want shaded ribbons outside the median estimate instead of lines? If so, give this a whirl. The only real trick here is that we pass group = 1 as an aesthetic so that geom_line() will behave properly with factor / character data. Previously, we grouped by the variable, which had the same effect. Also note that we are no longer using the melted data.frame, as the wide data.frame will suit us just fine in this case.
ggplot(dat, aes(x = time, group = 1)) +
geom_ribbon(aes(ymin = quantile.0.05, ymax = quantile.0.95, fill = "05%-95%"), alpha = .25) +
geom_ribbon(aes(ymin = quantile.0.25, ymax = quantile.0.75, fill = "25%-75%"), alpha = .25) +
geom_line(aes(y = quantile.0.5)) +
scale_fill_manual(name = "", values = c("25%-75%" = "red", "05%-95%" = "blue"))
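If you would rather have a real time axis than treat time as character data, one option is to parse the time column first. This is only a sketch under some assumptions: it guesses that values like 00:01 are hours:minutes, attaches an arbitrary date (1970-01-01) purely so the parsing works, and uses a made-up column name, time_parsed. With a continuous x axis, group = 1 should no longer be needed.

```r
library(ggplot2)

# A subset of the example data (first three rows only)
dat <- read.table(text = "time quantile=0.05 quantile=0.25 quantile=0.5 quantile=0.75 quantile=0.95
00:01 623.0725 630.4353 903.8870 959.1407 1327.721
00:02 623.0944 631.3707 911.9967 1337.4564 1518.539
00:03 623.0725 630.4353 903.8870 1170.8316 1431.893", header = TRUE)

# Assumption: "00:01" is hours:minutes; the date part is an arbitrary placeholder
dat$time_parsed <- as.POSIXct(paste("1970-01-01", dat$time),
                              format = "%Y-%m-%d %H:%M", tz = "UTC")

# Same ribbons and median line as above, but on a date-time x axis
p <- ggplot(dat, aes(x = time_parsed)) +
  geom_ribbon(aes(ymin = quantile.0.05, ymax = quantile.0.95, fill = "05%-95%"), alpha = .25) +
  geom_ribbon(aes(ymin = quantile.0.25, ymax = quantile.0.75, fill = "25%-75%"), alpha = .25) +
  geom_line(aes(y = quantile.0.5)) +
  scale_fill_manual(name = "", values = c("25%-75%" = "red", "05%-95%" = "blue")) +
  scale_x_datetime(date_labels = "%H:%M")
print(p)
```

If the values are actually minutes:seconds, the format string would need to be "%Y-%m-%d %M:%S" and the axis labels adjusted to match.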
Edit: To force a legend for the predicted value
We can use the same approach we used for the geom_ribbon() layers. We'll add an aesthetic to geom_line() and then set the values of that aesthetic with scale_colour_manual():
ggplot(dat, aes(x = time, group = 1)) +
geom_ribbon(aes(ymin = quantile.0.05, ymax = quantile.0.95, fill = "05%-95%"), alpha = .25) +
geom_ribbon(aes(ymin = quantile.0.25, ymax = quantile.0.75, fill = "25%-75%"), alpha = .25) +
geom_line(aes(y = quantile.0.5, colour = "Predicted")) +
scale_fill_manual(name = "", values = c("25%-75%" = "red", "05%-95%" = "blue")) +
scale_colour_manual(name = "", values = c("Predicted" = "black"))
There may be more efficient ways to do this, but it's the approach I've always used, and I've had pretty good success with it. YMMV.
Assuming your data.frame is called df:
The easiest ggplot solution is to use the boxplot geom. This gives a black central line at the 0.5 quantile, a filled box spanning the 0.25 to 0.75 quantiles, and whiskers extending to the 0.05 and 0.95 quantiles.
Since you have pre-summarised your data, it is important to specify the stat="identity" parameter:
ggplot(df, aes(x=time)) +
geom_boxplot(
aes(
lower=quantile.0.25,
upper=quantile.0.75,
middle=quantile.0.5,
ymin=quantile.0.05,
ymax=quantile.0.95
),
stat="identity",
fill = "cyan"
)
PS. I recreated your data as follows:
df <- "time quantile=0.05 quantile=0.25 quantile=0.5 quantile=0.75 quantile=0.95
00:01 623.0725 630.4353 903.8870 959.1407 1327.721
00:02 623.0944 631.3707 911.9967 1337.4564 1518.539
00:03 623.0725 630.4353 903.8870 1170.8316 1431.893
00:04 623.0725 630.4353 903.8870 1336.3212 1431.893
00:05 623.0835 631.3557 905.4220 1079.6623 1452.260
00:06 623.0835 631.3557 905.4220 1079.6623 1452.260
00:07 623.0835 631.3557 905.4220 1079.6623 1452.260
00:08 623.0780 631.3483 905.3496 1056.3719 1375.610
00:09 623.0671 630.4275 903.8839 1170.8196 1356.963
00:10 623.0507 630.0261 741.8475 1006.1208 1462.271"
df <- read.table(textConnection(df), header=TRUE)
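A note on the column names, in case it helps: read.table() uses check.names = TRUE by default, which passes the header through make.names(), so a header such as quantile=0.05 becomes quantile.0.05. That is why the plotting code refers to quantile.0.05, quantile.0.5 and so on. A quick check, as a minimal base R sketch (raw_names and clean_names are just illustrative variable names):

```r
# Header fields as they appear in the example data
raw_names <- c("time", "quantile=0.05", "quantile=0.25", "quantile=0.5",
               "quantile=0.75", "quantile=0.95")

# make.names() replaces characters that are invalid in R names (here "=") with "."
clean_names <- make.names(raw_names)
print(clean_names)
```

Passing check.names = FALSE to read.table() would keep the original headers, but they would then need backticks inside aes().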