Rolling window over irregular time series

I have an irregular time series of events (posts) stored as an xts object, and I want to count the events that occur within a rolling weekly window (or biweekly, or 3-day, etc.). The data looks like this:

                    postid
2010-08-04 22:28:07    867
2010-08-04 23:31:12    891
2010-08-04 23:58:05    901
2010-08-05 08:35:50    991
2010-08-05 13:28:02   1085
2010-08-05 14:14:47   1114
2010-08-05 14:21:46   1117
2010-08-05 15:46:24   1151
2010-08-05 16:25:29   1174
2010-08-05 23:19:29   1268
2010-08-06 12:15:42   1384
2010-08-06 15:22:06   1403
2010-08-07 10:25:49   1550
2010-08-07 18:58:16   1596
2010-08-07 21:15:44   1608

which should produce something like

                    nposts
2010-08-05 00:00:00     10
2010-08-06 00:00:00      9
2010-08-07 00:00:00      5

for a 2-day window. I have looked into rollapply, apply.rolling from PerformanceAnalytics, etc., and they all assume regular time series data. I tried truncating each timestamp to the day the post occurred and using something like ddply to group by day, which gets me close. However, a user might not post every day, so the time series will still be irregular. I could fill in the gaps with 0s, but that might inflate my data a lot, and it's already quite large.

What should I do?

538

asked May 11 '12 18:05

Eric W.

1 Answer

Here's a solution using xts:

library(xts)

# sample data from the question (the epoch values match the printed times in UTC)
x <- structure(c(867L, 891L, 901L, 991L, 1085L, 1114L, 1117L, 1151L, 
  1174L, 1268L, 1384L, 1403L, 1550L, 1596L, 1608L), .Dim = c(15L, 1L),
  index = structure(c(1280960887, 1280964672, 1280966285, 
  1280997350, 1281014882, 1281017687, 1281018106, 1281023184, 1281025529, 
  1281050369, 1281096942, 1281108126, 1281176749, 1281207496, 1281215744),
  tzone = "", tclass = c("POSIXct", "POSIXt")), class = c("xts", "zoo"),
  .indexCLASS = c("POSIXct", "POSIXt"), tclass = c("POSIXct", "POSIXt"),
  .indexTZ = "", tzone = "")
# first count the number of observations each day
xd <- apply.daily(x, length)
# now sum the counts over a 2-day rolling window
x2d <- rollapply(xd, 2, sum)
# align times at the end of the period (if you want)
y <- align.time(x2d, n=60*60*24)  # n is in seconds
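Note that the rolling sum above works on rows, so a day with no posts would make a 2-row window span more than two calendar days. As a rough sketch of one way around that (not part of the original answer), the count can be taken over a true time window with base R's findInterval, without filling gaps with zeros. The helper name count_in_window, the evaluation grid, and the choice of a half-open window (t - width, t] are all assumptions made for illustration:

```r
# Hypothetical helper: number of events in the window (t - width_secs, t]
# for each evaluation time t. Uses base R only.
count_in_window <- function(event_times, eval_times, width_secs) {
  ev <- sort(as.numeric(event_times))   # event times as seconds since epoch
  t  <- as.numeric(eval_times)
  # findInterval(t, ev) is the number of events with time <= t
  findInterval(t, ev) - findInterval(t - width_secs, ev)
}

# Event times from the question, interpreted as UTC
events <- as.POSIXct(
  c(1280960887, 1280964672, 1280966285, 1280997350, 1281014882,
    1281017687, 1281018106, 1281023184, 1281025529, 1281050369,
    1281096942, 1281108126, 1281176749, 1281207496, 1281215744),
  origin = "1970-01-01", tz = "UTC")

# Assumed evaluation grid: midnight at the end of each day
grid <- seq(as.POSIXct("2010-08-05", tz = "UTC"),
            as.POSIXct("2010-08-08", tz = "UTC"), by = "day")

nposts <- count_in_window(events, grid, width_secs = 2 * 24 * 60 * 60)
result <- data.frame(time = grid, nposts = nposts)
print(result)
```

Each count here is labelled with the end of its window, so the labels may be shifted relative to the output sketched in the question. Because only the event times and the evaluation grid are stored, days without posts simply yield smaller counts rather than extra rows in the input.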

118

answered Sep 20 '22 04:09

Joshua Ulrich

Tags: r, time-series, zoo, xts
