
How to analyse irregular time-series in R

Tags: r, time-series, zoo

I have a zoo time series in R:

d <- structure(c(50912, 50912, 50912, 50912, 50913, 50913, 50914, 
50914, 50914, 50915, 50915, 50915, 50916, 50916, 50916, 50917, 
50917, 50917, 50918, 50918, 2293.8, 2302.64, 2310.5, 2324.02, 
2312.25, 2323.93, 2323.83, 2338.67, 2323.1, 2320.77, 2329.73, 
2319.63, 2330.86, 2323.38, 2322.92, 2317.71, 2322.76, 2286.64, 
2294.83, 2305.06, 55.9, 62.8, 66.4, 71.9, 59.8, 65.7, 61.9, 67.9, 
38.5, 36.7, 43.2, 30.3, 42.4, 33.5, 48.8, 52.7, 61.2, 30, 41.7, 
50, 8.6, 9.7, 10.3, 11.1, 9.2, 10.1, 9.6, 10.4, 5.9, 5.6, 6.6, 
4.7, 6.5, 5.2, 7.5, 8.1, 9.5, 4.6, 6.4, 7.7, 9.29591864400155, 
10.6585128174944, 10.4386464748912, 11.5738448647708, 10.9486074772952, 
10.9546547052814, 10.3733963771546, 9.15627378048238, 8.22993822910891, 
5.69045896511178, 6.95269658370746, 7.78781665368086, 7.20089569039135, 
4.9759716583555, 8.99378907920762, 10.0924594632635, 10.3909638115674, 
6.28203685114275, 9.16021859457356, 7.56829801052175, 0.695918644001553, 
0.9585128174944, 0.138646474891241, 0.473844864770827, 1.74860747729523, 
0.854654705281426, 0.773396377154565, -1.24372621951762, 2.32993822910891, 
0.0904589651117833, 0.352696583707458, 3.08781665368086, 0.700895690391349, 
-0.224028341644497, 1.49378907920762, 1.99245946326349, 0.890963811567351, 
1.68203685114275, 2.76021859457356, -0.131701989478247), .Dim = c(20L, 
6L), .Dimnames = list(NULL, c("station_id", "ztd", "zwd", "iwv", 
"radiosonde", "error")), index = structure(c(892094400, 892116000, 
892137600, 892159200, 892180800, 892245600, 892267200, 892288800, 
892332000, 892353600, 892375200, 892418400, 892440000, 892461600, 
892504800, 892526400, 892548000, 892591200, 892612800, 892634400
), class = c("POSIXct", "POSIXt")), class = "zoo")

I want to perform some of the analyses that are available for ts objects, such as decomposing the time series into trend and seasonal components and looking at the autocorrelation function. However, every attempt fails with the error: Error in na.fail.default(as.ts(x)) : missing values in object.

Looking into this in more depth, it seems that all of these functions work on ts objects, which by definition have regularly spaced observations. My observations are not regularly spaced, so the conversion to ts introduces a lot of NAs and everything fails.
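The irregular spacing can be checked directly from the index of the zoo object. The sketch below is illustrative only: it uses a small made-up series `z` (not the `d` object above) with a hypothetical 6-hourly sampling scheme in which some observations are missing, and it assumes the zoo package is installed.

```r
library(zoo)

# Hypothetical toy series: nominally 6-hourly, with a gap after the fifth value
tt <- as.POSIXct("1998-04-09", tz = "UTC") + 3600 * c(0, 6, 12, 18, 24, 42, 48, 54)
z  <- zoo(c(8.6, 9.7, 10.3, 11.1, 9.2, 10.1, 9.6, 10.4), order.by = tt)

# Spacing between consecutive observations, in hours
gaps <- diff(as.numeric(index(z))) / 3600
table(gaps)

# strict = TRUE asks whether every gap is identical;
# the default only asks whether all gaps are multiples of the smallest one
strictly_regular   <- is.regular(z, strict = TRUE)
underlying_regular <- is.regular(z)
```

A series that is regular only in the second, weaker sense is padded with NAs at the missing time points when it is converted to ts, which is consistent with the na.fail error quoted above.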

Is there a way to analyse the irregular time-series in R? Or do I need to convert them to be regular somehow? If so, is there a simple way to do this?

asked Sep 27 '12 13:09 by robintw



1 Answer

I have analysed such irregular data in the past using an additive model to "decompose" the series into seasonal and trend components. As this is a regression-based approach, you need to model the residuals as a time series process to account for their lack of independence.

I used the mgcv package for these analyses. Essentially the model fitted is:

library(mgcv)  # gamm()
library(nlme)  # corCAR1()
# foo is a data frame with columns response, dayOfYear and timeOfSampling
mod <- gamm(response ~ s(dayOfYear, bs = "cc") + s(timeOfSampling), data = foo,
            correlation = corCAR1(form = ~ timeOfSampling))

This fits a cyclic cubic spline in the day-of-year variable dayOfYear for the seasonal term, and a smooth of timeOfSampling, which is a numeric variable, for the trend. The residuals are modelled here as a continuous-time AR(1) process, using timeOfSampling as the time covariate of the CAR(1). This assumes that the correlation between residuals drops off exponentially with increasing temporal separation.
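To make the model above concrete, here is a minimal sketch on simulated data. Everything about the data is an assumption for illustration: the sampling times, the shapes of the seasonal cycle and trend, and the residual process are all made up, and the `knots` argument (which places the ends of the cyclic spline just outside days 1 and 366) is an optional choice that is not part of the model shown above. As with any gamm fit, lme may struggle to converge on other data.

```r
library(mgcv)  # gamm()
library(nlme)  # corCAR1()

set.seed(42)

# Simulated, hypothetical data: 300 observations at irregular times over ~3 years
n     <- 300
times <- as.POSIXct("2010-01-01", tz = "UTC") + sort(runif(n, 0, 3 * 365 * 86400))

foo <- data.frame(
  dayOfYear      = as.POSIXlt(times)$yday + 1,                            # 1..366
  timeOfSampling = as.numeric(difftime(times, times[1], units = "days"))  # days since first sample
)

# Residuals whose correlation decays exponentially with the gap between samples
dt  <- c(0, diff(foo$timeOfSampling))
eps <- numeric(n)
eps[1] <- rnorm(1, sd = 0.5)
for (i in 2:n) {
  rho    <- 0.8^dt[i]
  eps[i] <- rho * eps[i - 1] + rnorm(1, sd = 0.5 * sqrt(1 - rho^2))
}

# Made-up seasonal cycle plus a smooth, nonlinear trend
foo$response <- 10 + 3 * sin(2 * pi * foo$dayOfYear / 365.25) +
  2 * sin(pi * foo$timeOfSampling / 1095) + eps

mod <- gamm(response ~ s(dayOfYear, bs = "cc") + s(timeOfSampling),
            data = foo,
            correlation = corCAR1(form = ~ timeOfSampling),
            knots = list(dayOfYear = c(0.5, 366.5)))

summary(mod$gam)

# Per-observation contributions of the seasonal and trend smooths
comps <- predict(mod$gam, type = "terms")
head(comps)

# Estimated CAR(1) parameter: correlation between residuals one time unit apart
phi <- coef(mod$lme$modelStruct$corStruct, unconstrained = FALSE)
```

The two columns of `comps` play the role of the seasonal and trend components from a classical decomposition, and `plot(mod$gam, pages = 1)` draws both smooths. For real data such as a zoo object, the two covariates would be derived from the series index in the same way as they are derived from `times` here.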

I have written some blog posts on some of these ideas:

  1. Smoothing temporally correlated data
  2. Additive modelling and the HadCRUT3v global mean temperature series

which contain additional R code for you to follow.

answered Oct 05 '22 20:10 by Gavin Simpson