I want to produce seasonal boxplots for many different time series. I hope the code below shows clearly what I am trying to do.
My question is how to do this elegantly, in as few lines of code as possible. I can create a new object for each month with the "subset" function and then plot them, but that is not very elegant. I tried the "split" function, but I don't know how to proceed from there.
Please tell me if my question is not clearly stated or edit it to make it clearer.
Any direct help or links to other websites/posts are greatly appreciated. Thanks for your time.
Here is the code:
## Create Data
Time <- seq(as.Date("2003/8/6"), as.Date("2011/8/5"), by = "2 weeks")
data <- rnorm(209, mean = 15, sd = 1)
DF <- data.frame(Time = Time, Data = data)
DF$Month <- as.numeric(format(DF$Time, "%m"))
## Create subsets
Jan <- subset(DF, Month == 1)
Feb <- subset(DF, Month == 2)
Mar <- subset(DF, Month == 3)
Apr <- subset(DF, Month == 4)
## Create boxplot
months <- c("Jan", "Feb", "Mar", "Apr")
boxplot(Jan$Data, Feb$Data, Mar$Data, Apr$Data, ylab = "Data", xlab = "Months", names = months)
## Try with "split" function
DF.split <- split(DF, DF$Month)
head(DF.split)
Using 'ggplot2' (and @James' month names, thanks!):
library(ggplot2)
## Month abbreviations as a factor, with levels in calendar order
DF$month <- factor(strftime(DF$Time, "%b"), levels = month.abb)
ggplot(DF, aes(x = month, y = Data)) +
  geom_boxplot()
(BTW: note that in 'ggplot2', "the upper and lower 'hinges' correspond to the first and third quartiles (the 25th and 75th percentiles). This differs slightly from the method used by the boxplot function, and may be apparent with small samples." See the documentation.)
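To make the hinge/quartile difference concrete, here is a small sketch with a made-up ten-value sample (the vector x is purely illustrative, not from the question). Base R's boxplot() takes its box edges from fivenum(), while quantile() with its default type 7 gives the usual 25th and 75th percentiles. Whether a given 'ggplot2' version reproduces the second pair exactly is an assumption worth checking against its documentation.

```r
# Illustrative sample, not taken from the question's data
x <- 1:10

# Hinges: the box edges that base boxplot() draws (2nd and 4th of Tukey's five numbers)
hinges <- fivenum(x)[c(2, 4)]

# Quartiles: 25th and 75th percentiles with quantile()'s default method (type 7)
quartiles <- unname(quantile(x, c(0.25, 0.75)))

print(hinges)
print(quartiles)
```

For this sample the hinges are 3 and 8, while the quartiles are 3.25 and 7.75. The two definitions tend to converge as the sample grows, which is why the note only matters for small samples.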
You are better off picking out the month names directly with the "%b" format, turning them into a factor whose levels are in calendar order, and using the formula interface for boxplot:
DF$month <- factor(strftime(DF$Time,"%b"),levels=month.abb)
boxplot(Data~month,DF)
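Two loose ends from the question can be sketched in the same spirit. First, the "split" attempt can be finished by splitting only the data column, because boxplot() also accepts a list of numeric vectors. Second, for many series the formula call can simply be looped over column names. The columns SeriesA and SeriesB are hypothetical stand-ins for the real series, and building the labels from the month number instead of "%b" is a swapped-in choice so that they match month.abb regardless of locale.

```r
set.seed(42)  # reproducible fake data

Time <- seq(as.Date("2003/8/6"), as.Date("2011/8/5"), by = "2 weeks")
DF <- data.frame(
  Time    = Time,
  SeriesA = rnorm(length(Time), mean = 15, sd = 1),  # hypothetical series
  SeriesB = rnorm(length(Time), mean = 20, sd = 2)   # hypothetical series
)

# Month labels derived from the month number, so they always match month.abb
DF$month <- factor(month.abb[as.integer(format(DF$Time, "%m"))],
                   levels = month.abb)

# Finishing the split() idea: a list with one numeric vector per month
by_month <- split(DF$SeriesA, DF$month)
boxplot(by_month, xlab = "Months", ylab = "SeriesA")

# Many series: one panel per column, using the formula interface
series <- c("SeriesA", "SeriesB")
op <- par(mfrow = c(1, length(series)))
for (s in series) {
  boxplot(DF[[s]] ~ DF$month, xlab = "Months", ylab = s, main = s)
}
par(op)  # restore the previous plot layout
```

The loop keeps one line of plotting code no matter how many series there are; only the series vector needs to change.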