What is the most elegant way to split data and produce seasonal boxplots?

I want to produce seasonal boxplots for a lot of different time series. I hope that the code below clearly illustrates what I want to do.

My question is how to do this elegantly, in as few lines of code as possible. I can create a new object for each month with the function "subset" and then plot it, but that does not seem very elegant. I tried the "split" function, but I don't know how to proceed from there.

Please tell me if my question is not clearly stated or edit it to make it clearer.

Any direct help or links to other websites/posts are greatly appreciated. Thanks for your time.

Here is the code:

## Create Data
Time <- seq(as.Date("2003/8/6"), as.Date("2011/8/5"), by = "2 weeks")
data <- rnorm(209, mean = 15, sd = 1)
DF <- data.frame(Time = Time, Data = data)
DF$Month <- as.numeric(format(DF$Time, "%m"))

## Create subsets
Jan <- subset(DF, Month == 1)
Feb <- subset(DF, Month == 2)
Mar <- subset(DF, Month == 3)
Apr <- subset(DF, Month == 4)

## Create boxplot
months <- c("Jan", "Feb", "Mar", "Apr")
boxplot(Jan$Data, Feb$Data, Mar$Data, Apr$Data, ylab = "Data", xlab = "Months", names = months)

## Try with "split" function
DF.split <- split(DF, DF$Month)
head(DF.split)

Strohmi asked Aug 21 '12 09:08



2 Answers

Using 'ggplot2' (and @James' month names, thanks!):

library(ggplot2)

# Month abbreviations as a factor with levels in calendar order
DF$month <- factor(strftime(DF$Time, "%b"), levels = month.abb)
ggplot(DF, aes(x = month, y = Data)) +
    geom_boxplot()


(BTW: note that in 'ggplot2', "the upper and lower 'hinges' correspond to the first and third quartiles (the 25th and 75th percentiles). This differs slightly from the method used by the boxplot function, and may be apparent with small samples." See the documentation.)
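
To make the hinge remark concrete, here is a minimal base-R sketch. It assumes that the quartiles 'ggplot2' draws match the default of quantile(), which is what the quoted documentation suggests; the sample vector x is made up purely for illustration.

```r
# Small even-sized sample, where the two definitions disagree
x <- c(1, 2, 3, 4, 5, 6)

# Hinges as used by base boxplot() (boxplot.stats() relies on fivenum())
hinges <- fivenum(x)[c(2, 4)]

# First and third quartiles from quantile() with its default method (type 7)
quartiles <- unname(quantile(x, c(0.25, 0.75)))

print(hinges)     # 2 5
print(quartiles)  # 2.25 4.75

# boxplot.stats() reports the same hinges that boxplot() would draw
print(boxplot.stats(x)$stats[c(2, 4)])  # 2 5
```

With roughly 17 observations per month, as in the question's data, the two definitions should differ only slightly.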

ROLO answered Sep 20 '22 10:09



You are better off picking out the month names directly with the "%b" format, storing them as a factor whose levels are in calendar order (month.abb), and using the formula interface for boxplot:

DF$month <- factor(strftime(DF$Time, "%b"), levels = month.abb)
boxplot(Data ~ month, data = DF)
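
As for the "split" attempt in the question, one way to proceed from there might be the following sketch. It rebuilds the question's data (the set.seed call and the name by_month are additions for illustration) and splits only the Data column rather than the whole data frame, because boxplot also accepts a list of numeric vectors.

```r
# Rebuild the example data from the question (seed added for reproducibility)
set.seed(1)
Time <- seq(as.Date("2003/8/6"), as.Date("2011/8/5"), by = "2 weeks")
DF <- data.frame(Time = Time, Data = rnorm(length(Time), mean = 15, sd = 1))
DF$Month <- as.numeric(format(DF$Time, "%m"))

# Split only the values by month: a named list, one numeric vector per month
by_month <- split(DF$Data, DF$Month)

# boxplot() draws one box per list element; month.abb supplies the labels
boxplot(by_month, names = month.abb[as.integer(names(by_month))],
        xlab = "Months", ylab = "Data")
```

Because the labels are looked up in month.abb by month number, they do not depend on the session locale, whereas strftime with "%b" returns locale-specific abbreviations.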


James answered Sep 21 '22 10:09
