
Interpolating timeseries

I have two sets of data with different time stamps. One set of data contains calibration data, the other contains sample data. The calibration is much less frequent than the samples.

What I would like to do is interpolate the calibration data (low freq) onto the sample time series (high freq).

sam <- textConnection("time, value
01:00:52, 256
01:03:02, 254
01:05:23, 255
01:07:42, 257
01:10:12, 256")

cal <- textConnection("time, value
01:01:02, 252.3
01:05:15, 249.8
01:10:02, 255.6")

sample <- read.csv(sam)

sample$time <- as.POSIXct(sample$time, format="%H:%M:%S")

calib <- read.csv(cal)

calib$time <- as.POSIXct(calib$time, format="%H:%M:%S")

The main difficulty, as I see it, is that both series are irregularly sampled: the interval between consecutive time stamps is not constant in either data set.

Has anyone had to do something similar? Is there a chron or zoo function that would do what I want, i.e. interpolate low-frequency data onto a higher-frequency time series when both series are irregularly spaced?

Alex Archibald asked Oct 25 '12 17:10



2 Answers

I would use zoo (or xts) and do it like this:

library(zoo)
# Create zoo objects
zc <- zoo(calib$value, calib$time)    # low freq
zs <- zoo(sample$value, sample$time)  # high freq
# Merge series into one object
z <- merge(zs,zc)
# Interpolate calibration data (na.spline could also be used)
z$zc <- na.approx(z$zc, rule=2)
# Only keep index values from sample data
Z <- z[index(zs),]
Z
#                      zs       zc
# 2012-10-25 01:00:52 256 252.3000
# 2012-10-25 01:03:02 254 251.1142
# 2012-10-25 01:05:23 255 249.9617
# 2012-10-25 01:07:42 257 252.7707
# 2012-10-25 01:10:12 256 255.6000
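The interpolated values above are consistent with time-weighted linear interpolation between the two calibration points that bracket each sample. As a sanity check, here is a minimal base R sketch that reproduces the second row by hand. The date 2012-10-25 is an assumption taken from the printed output (the original data carry times only), and the variable names are illustrative.

```r
# Calibration points bracketing the sample taken at 01:03:02.
# The date is assumed; only the time of day matters for the result.
t0 <- as.POSIXct("2012-10-25 01:01:02", tz = "UTC")
t1 <- as.POSIXct("2012-10-25 01:05:15", tz = "UTC")
ts <- as.POSIXct("2012-10-25 01:03:02", tz = "UTC")
c0 <- 252.3
c1 <- 249.8

# Fraction of the calibration interval that has elapsed at the sample time
w <- as.numeric(difftime(ts, t0, units = "secs")) /
     as.numeric(difftime(t1, t0, units = "secs"))

# Time-weighted linear interpolation between the two calibration values
interp <- c0 + w * (c1 - c0)
round(interp, 4)
# [1] 251.1142
```

The first and last rows fall outside the calibrated time range, so with rule=2 they simply take the nearest calibration value (252.3 and 255.6).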
Joshua Ulrich answered Oct 12 '22 09:10


You can also use the approx function from base R, which needs no extra packages. Make sure you are working with data frames, and make sure the time columns in the calibration and sample data sets have the same class by converting both with as.POSIXct.

calib <- data.frame(calib)
sample <- data.frame(sample)

IPcal <- data.frame(approx(calib$time, calib$value, xout = sample$time,
                           rule = 2, method = "linear", ties = mean))

 head(IPcal)

#                x        y
#1 2017-03-22 01:00:52 252.3000
#2 2017-03-22 01:03:02 251.1142
#3 2017-03-22 01:05:23 249.9617
#4 2017-03-22 01:07:42 252.7707
#5 2017-03-22 01:10:12 255.6000

See the approxfun documentation (?approx) for more details.
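If the calibration has to be evaluated at more than one set of time stamps, approxfun may be more convenient, since it returns a reusable function instead of a list of values. The sketch below is one possible way to do this, not part of the original answer. It rebuilds the question's data so that it runs on its own; the date 2012-10-25, the names sam, cal and cal_at, and the calib column are assumptions made for illustration (sam is used instead of sample to avoid masking base::sample).

```r
# Same values as in the question; the date is assumed, only times were given
sam <- data.frame(
  time  = as.POSIXct(paste("2012-10-25",
                           c("01:00:52", "01:03:02", "01:05:23",
                             "01:07:42", "01:10:12")), tz = "UTC"),
  value = c(256, 254, 255, 257, 256)
)
cal <- data.frame(
  time  = as.POSIXct(paste("2012-10-25",
                           c("01:01:02", "01:05:15", "01:10:02")), tz = "UTC"),
  value = c(252.3, 249.8, 255.6)
)

# approxfun() returns a function of time (here in seconds since the epoch).
# rule = 2 holds the nearest calibration value outside the calibrated range.
cal_at <- approxfun(as.numeric(cal$time), cal$value,
                    method = "linear", rule = 2)

# Evaluate the calibration at the sample times and keep it next to the samples
sam$calib <- cal_at(as.numeric(sam$time))
round(sam$calib, 4)
# [1] 252.3000 251.1142 249.9617 252.7707 255.6000
```

Converting both time columns with as.numeric makes the units explicit, and it also makes any mismatch in class or time zone between the two data sets easier to spot.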

M-- answered Oct 12 '22 11:10