I am interested in calculating averages over specific time periods in a time series data set.
Given a time series like this:
dtm=as.POSIXct("2007-03-27 05:00", tz="GMT")+3600*(1:240)
Count<-c(1:240)
DF<-data.frame(dtm,Count)
In the past I have been able to calculate daily averages with
DF$Day<-cut(DF$dtm,breaks="day")
Day_Avg<-aggregate(DF$Count~Day,DF,mean)
But now I am trying to cut up the day into specific time periods and I'm not sure how to set my "breaks".
Instead of a daily average from 0:00 to 24:00, how could I get, for example, a noon-to-noon average?
Or, fancier still, how could I set up a noon-to-noon average that excludes the night hours from 7 PM to 6 AM (or, conversely, includes only the daylight hours from 6 AM to 7 PM)?
xts is a great package for time series analysis.
library(xts)
originalTZ <- Sys.getenv("TZ")
Sys.setenv(TZ = "GMT")
data.xts <- as.xts(1:240, as.POSIXct("2007-03-27 05:00", tz = "GMT") + 3600 * (1:240))
head(data.xts)
## [,1]
## 2007-03-27 06:00:00 1
## 2007-03-27 07:00:00 2
## 2007-03-27 08:00:00 3
## 2007-03-27 09:00:00 4
## 2007-03-27 10:00:00 5
## 2007-03-27 11:00:00 6
# You can filter data using ISO-style subsetting
data.xts.filtered <- data.xts["T06:00/T19:00"]
# You can use builtin functions to apply any function FUN on daily data.
apply.daily(data.xts.filtered, mean)
## [,1]
## 2007-03-27 18:00:00 7.5
## 2007-03-28 18:00:00 31.5
## 2007-03-29 18:00:00 55.5
## 2007-03-30 18:00:00 79.5
## 2007-03-31 18:00:00 103.5
## 2007-04-01 18:00:00 127.5
## 2007-04-02 18:00:00 151.5
## 2007-04-03 18:00:00 175.5
## 2007-04-04 18:00:00 199.5
## 2007-04-05 18:00:00 223.5
# OR
# now let's say you want to find noon to noon average.
period.apply(data.xts, c(0, which(.indexhour(data.xts) == 11)), FUN = mean)
## [,1]
## 2007-03-27 11:00:00 3.5
## 2007-03-28 11:00:00 18.5
## 2007-03-29 11:00:00 42.5
## 2007-03-30 11:00:00 66.5
## 2007-03-31 11:00:00 90.5
## 2007-04-01 11:00:00 114.5
## 2007-04-02 11:00:00 138.5
## 2007-04-03 11:00:00 162.5
## 2007-04-04 11:00:00 186.5
## 2007-04-05 11:00:00 210.5
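To make the endpoint vector less mysterious, here is a minimal base R sketch of the grouping it describes. It assumes the second argument is read as row positions at which each period ends, so period i spans rows `ep[i] + 1` through `ep[i + 1]`. The names `hours`, `ep`, and `noon_means` are illustrative only and are not part of xts.

```r
# Same sample data as in the question (hourly readings, GMT).
dtm <- as.POSIXct("2007-03-27 05:00", tz = "GMT") + 3600 * (1:240)
Count <- 1:240

# Row positions where a period ends: every 11:00 reading, preceded by 0.
hours <- as.POSIXlt(dtm)$hour
ep <- c(0, which(hours == 11))

# Period i covers rows (ep[i] + 1) through ep[i + 1].
noon_means <- sapply(seq_len(length(ep) - 1), function(i) {
  mean(Count[(ep[i] + 1):ep[i + 1]])
})

# Label each mean with the timestamp of the last row in its period.
names(noon_means) <- format(dtm[ep[-1]], "%Y-%m-%d %H:%M")
noon_means
```

With these endpoints, the readings after the last 11:00 (rows 223 to 240) fall in no period. Appending `length(Count)` to `ep` in this sketch would add them as a trailing partial period.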
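# now if you want to exclude the night hours (readings from 8 PM through 5 AM)
# match on the time index, not on the values (matching on values only works when every value is unique)
data.xts.filtered <- data.xts[!(index(data.xts) %in% index(data.xts["T20:00/T05:00"]))]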
head(data.xts.filtered, 20)
## [,1]
## 2007-03-27 06:00:00 1
## 2007-03-27 07:00:00 2
## 2007-03-27 08:00:00 3
## 2007-03-27 09:00:00 4
## 2007-03-27 10:00:00 5
## 2007-03-27 11:00:00 6
## 2007-03-27 12:00:00 7
## 2007-03-27 13:00:00 8
## 2007-03-27 14:00:00 9
## 2007-03-27 15:00:00 10
## 2007-03-27 16:00:00 11
## 2007-03-27 17:00:00 12
## 2007-03-27 18:00:00 13
## 2007-03-27 19:00:00 14
## 2007-03-28 06:00:00 25
## 2007-03-28 07:00:00 26
## 2007-03-28 08:00:00 27
## 2007-03-28 09:00:00 28
## 2007-03-28 10:00:00 29
## 2007-03-28 11:00:00 30
period.apply(data.xts.filtered, c(0, which(.indexhour(data.xts.filtered) == 11)), FUN = mean)
## [,1]
## 2007-03-27 11:00:00 3.50000
## 2007-03-28 11:00:00 17.78571
## 2007-03-29 11:00:00 41.78571
## 2007-03-30 11:00:00 65.78571
## 2007-03-31 11:00:00 89.78571
## 2007-04-01 11:00:00 113.78571
## 2007-04-02 11:00:00 137.78571
## 2007-04-03 11:00:00 161.78571
## 2007-04-04 11:00:00 185.78571
## 2007-04-05 11:00:00 209.78571
Sys.setenv(TZ = originalTZ)
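For comparison, the same grouping can be sketched in base R, staying with the `cut()` and `aggregate()` approach from the question. This is not part of the xts answer above. The idea is to shift each timestamp back 12 hours so that noon lands on midnight and ordinary "day" breaks then fall at noon. The column and variable names (`NoonDay`, `noon_avg`, `daylight`, `noon_day_avg`) are made up for illustration.

```r
# Same sample data as in the question (hourly readings, GMT).
dtm <- as.POSIXct("2007-03-27 05:00", tz = "GMT") + 3600 * (1:240)
DF <- data.frame(dtm, Count = 1:240)

# Shift back 12 hours: 12:00 becomes 00:00, so calendar-day breaks fall at noon.
# Each label is the date on which that noon-to-noon window starts.
DF$NoonDay <- cut(DF$dtm - 12 * 3600, breaks = "day")
noon_avg <- aggregate(Count ~ NoonDay, DF, mean)

# Daylight only: keep the readings stamped 06:00 through 19:00.
hr <- as.integer(format(DF$dtm, "%H"))
daylight <- DF[hr >= 6 & hr <= 19, ]
noon_day_avg <- aggregate(Count ~ NoonDay, daylight, mean)

head(noon_avg)
head(noon_day_avg)
```

Unlike the endpoint-based call, this version also reports the partial window at the end of the series, and it labels each window by its start date rather than by its last timestamp.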