I have measurements that have been recorded approximately every 5 minutes:
2012-07-09T05:30:01+02:00 1906.1 1069.2 1093.2 3 1071.0 1905.7
2012-07-09T05:35:02+02:00 1905.7 1069.2 1093.0 0 1071.5 1905.7
2012-07-09T05:40:02+02:00 1906.1 1068.7 1093.2 0 1069.4 1905.7
2012-07-09T05:45:02+02:00 1905.7 1068.4 1093.0 1 1069.6 1905.7
2012-07-09T05:50:02+02:00 1905.7 1068.2 1093.0 4 1073.3 1905.7
The first column is the data's timestamp. The remaining columns are the recorded data.
I need to resample my data so that I have one row every 15 minutes, e.g. something like:
2012-07-09T05:15:00 XX XX XX XX XX XX
2012-07-09T05:30:00 XX XX XX XX XX XX
....
(In addition, there may be gaps in the recorded data, and I would like gaps of more than, say, one hour to be replaced with a row of NA values.)
I can think of several ways to program this by hand, but is there built-in support for doing that kind of stuff in R? I've looked at the different libraries for dealing with time series data (zoo, chron, etc.) but couldn't find anything satisfactory.
Linear interpolation works best when there are many data points.
Interpolation is mostly used with time-series data, where a missing value is best filled from the values nearest to it in time. For example, a missing temperature reading for today is better filled with the mean of the last two days than with the mean of the whole month.
Two common interpolation methods are linear interpolation, which fits a separate linear polynomial between each pair of data points for curves (or within sets of three points for surfaces), and nearest-neighbour interpolation, which gives an interpolated point the value of the nearest data point.
You can use approx or the related approxfun. If t is the vector of time points at which your data was sampled and y is the vector with the data, then f <- approxfun(t,y) creates a function f that linearly interpolates the data points in between the time points.
Example:
# irregular time points at which data was sampled
t <- c(5,10,15,25,30,40,50)
# measurements
y <- c(4.3,1.2,5.4,7.6,3.2,1.2,3.7)
f <- approxfun(t,y)
# get interpolated values for time points 5, 20, 35, 50
f(seq(from=5,to=50,by=15))
[1] 4.3 6.5 2.2 3.7
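For the gap requirement in the question, one possible approach is to interpolate with approx and then blank out every grid point that falls strictly inside a gap longer than the threshold. The following is only a minimal sketch under assumptions: resample_with_gaps is a hypothetical helper name, the times are plain numbers in minutes (POSIXct timestamps could be converted with as.numeric first), t is sorted in increasing order, and the sample values are made up for illustration.

```r
# Hypothetical helper (not part of base R): interpolate onto a regular grid
# and set grid points inside long gaps to NA. Assumes t is sorted increasingly.
resample_with_gaps <- function(t, y, step, max_gap) {
  # Regular grid covering the sampled range
  grid <- seq(from = ceiling(min(t) / step) * step, to = max(t), by = step)
  # Linear interpolation at the grid points
  value <- approx(t, y, xout = grid)$y
  # Index of the sample at or before each grid point, clamped to 1..(n - 1)
  idx <- findInterval(grid, t, all.inside = TRUE)
  left <- t[idx]
  right <- t[idx + 1]
  # A grid point is in a gap if it lies strictly between two samples
  # that are more than max_gap apart
  in_gap <- (right - left) > max_gap & grid > left & grid < right
  value[in_gap] <- NA
  data.frame(time = grid, value = value)
}

# Made-up data: roughly every 5 minutes, with a 78-minute gap after minute 32
t <- c(0, 5, 10, 14, 20, 25, 32, 110, 116, 122)
y <- c(1.0, 2.0, 3.0, 4.0, 7.0, 6.0, 5.0, 9.0, 8.0, 10.0)

# One row every 15 minutes; gaps longer than 60 minutes become NA
res <- resample_with_gaps(t, y, step = 15, max_gap = 60)
print(res)
```

With several data columns, as in the question, the same helper could be applied to each column in turn (for example with lapply) and the results bound together on the shared time column.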