
Rolling window over irregular time series

I have an irregular time series of events (posts) stored as an xts object, and I want to calculate the number of events that occur over a rolling weekly window (or biweekly, or 3-day, etc.). The data looks like this:

                    postid
2010-08-04 22:28:07    867
2010-08-04 23:31:12    891
2010-08-04 23:58:05    901
2010-08-05 08:35:50    991
2010-08-05 13:28:02   1085
2010-08-05 14:14:47   1114
2010-08-05 14:21:46   1117
2010-08-05 15:46:24   1151
2010-08-05 16:25:29   1174
2010-08-05 23:19:29   1268
2010-08-06 12:15:42   1384
2010-08-06 15:22:06   1403
2010-08-07 10:25:49   1550
2010-08-07 18:58:16   1596
2010-08-07 21:15:44   1608

which should produce something like

                    nposts
2010-08-05 00:00:00     10
2010-08-06 00:00:00      9
2010-08-07 00:00:00      5

for a 2-day window. I have looked into rollapply, apply.rolling from PerformanceAnalytics, etc., and they all assume regular time series data. I tried truncating every timestamp to the day the post occurred and using something like ddply to group by day, which gets me close. However, a user might not post every day, so the time series will still be irregular. I could fill in the gaps with 0s, but that might inflate my data a lot, and it's already quite large.

What should I do?

Eric W. asked May 11 '12 18:05




1 Answer

Here's a solution using xts:

library(xts)

x <- structure(c(867L, 891L, 901L, 991L, 1085L, 1114L, 1117L, 1151L, 
  1174L, 1268L, 1384L, 1403L, 1550L, 1596L, 1608L), .Dim = c(15L, 1L),
  index = structure(c(1280960887, 1280964672, 1280966285, 
  1280997350, 1281014882, 1281017687, 1281018106, 1281023184, 1281025529, 
  1281050369, 1281096942, 1281108126, 1281176749, 1281207496, 1281215744),
  tzone = "", tclass = c("POSIXct", "POSIXt")), class = c("xts", "zoo"),
  .indexCLASS = c("POSIXct", "POSIXt"), tclass = c("POSIXct", "POSIXt"),
  .indexTZ = "", tzone = "")
# first count the number of observations each day
xd <- apply.daily(x, length)
# now sum the counts over a 2-day rolling window
x2d <- rollapply(xd, 2, sum)
# align times at the end of the period (if you want)
y <- align.time(x2d, n=60*60*24)  # n is in seconds
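
One caveat with the approach above: apply.daily only returns a row for days that contain at least one post, so a width-2 rollapply covers two observed days, which is not the same as two calendar days when a day has no posts. A minimal base-R sketch of a purely time-based count, without padding the series with zeros, might look like the following. The helper name count_in_window, the choice of midnight window ends, and the UTC time zone are assumptions made for illustration; they are not part of the original answer.

```r
# Event timestamps from the question (UTC is an assumption for reproducibility)
times <- as.POSIXct(c(
  "2010-08-04 22:28:07", "2010-08-04 23:31:12", "2010-08-04 23:58:05",
  "2010-08-05 08:35:50", "2010-08-05 13:28:02", "2010-08-05 14:14:47",
  "2010-08-05 14:21:46", "2010-08-05 15:46:24", "2010-08-05 16:25:29",
  "2010-08-05 23:19:29", "2010-08-06 12:15:42", "2010-08-06 15:22:06",
  "2010-08-07 10:25:49", "2010-08-07 18:58:16", "2010-08-07 21:15:44"
), tz = "UTC")

# Hypothetical helper: number of events in the half-open window
# (end - width_secs, end] for each requested window end.
count_in_window <- function(event_times, window_ends, width_secs) {
  ev   <- sort(as.numeric(event_times))
  ends <- as.numeric(window_ends)
  # findInterval(x, ev) is the number of events with timestamp <= x
  findInterval(ends, ev) - findInterval(ends - width_secs, ev)
}

# Evaluate a 2-day window at the end of each calendar day (midnight)
window_ends <- seq(as.POSIXct("2010-08-06", tz = "UTC"),
                   as.POSIXct("2010-08-08", tz = "UTC"),
                   by = "day")
nposts <- count_in_window(times, window_ends, width_secs = 2 * 24 * 60 * 60)

print(data.frame(window_end = window_ends, nposts = nposts))
```

For the sample data this gives counts of 10, 9 and 5, the same values as the expected output in the question (here each row is labelled with the end of its window). Because the windows are defined by time rather than by row position, a day without posts simply contributes zero, and the window width or the evaluation times can be changed (weekly, 3-day, hourly ends) without regularizing the series first.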
Joshua Ulrich answered Sep 20 '22 04:09
