Interpolation of time series data with specific output time

I have a database with time-stamped data. I want to interpolate the data to match a specific time step.

Id  Time                    humid   humtemp prtemp  press       t
1   2012-01-21 18:41:50     47.7    14.12   13.870  1005.70     -0.05277778
1   2012-01-21 18:46:43     44.5    15.37   15.100  1005.20     0.02861111
1   2012-01-21 18:51:35     43.2    15.88   15.576  1005.10     0.10972222
1   2012-01-21 18:56:28     42.5    16.17   15.833  1004.90     0.19111111
1   2012-01-21 19:01:21     42.2    16.31   15.986  1004.80     0.27250000
1   2012-01-21 19:06:14     41.8    16.47   16.118  1004.60     0.35388889
1   2012-01-21 19:11:07     41.6    16.51   16.177  1004.60     0.43527778

I want to obtain data at the time steps below by interpolation.

    Id                 Time       humid    humtemp prtemp  press        t   
    1   2012-01-21 18:45:00 ....    ...     .....   ....        ....
    1   2012-01-21 18:50:00 ....    
    1   2012-01-21 18:55:00 ....    
    1   2012-01-21 19:00:00 ....    
    1   2012-01-21 19:05:00 ....    
    1   2012-01-21 19:10:00 ....    

I tried different methods but could not find a solution. For example, I created a zoo object:

   z <- zoo(MTS01m,order.by=MTS01m$Time)
   tstart2<-asP("2012-01-21 18:45:00")
   Ts<-1*60
   y <- merge(z, zoo(order.by=seq(tstart2, end(z), by=Ts)))
   xa <- na.approx(y)
   xs <- na.spline(y)

but an error occurs:

   Error in approx(x[!na], y[!na], xout, ...) : 
   need at least two non-NA values to interpolate
   In addition: Warning message:
   In xy.coords(x, y) : NAs introduced by coercion

I created a secondary index t that starts where I want the data to begin, but I don't know how to use this index.

Do you have any suggestions?

Marco Giuliani asked Dec 17 '12 18:12


2 Answers

Try this (assuming your time index is POSIXct):

library(zoo)
st <- as.POSIXct("2012-01-21 18:45")
g <- seq(st, end(z), by = "15 min") # grid
na.approx(z, xout = g)

See ?na.approx.zoo for more info.
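
A likely reason the attempt in the question fails (an assumption, since MTS01m itself is not shown): if the whole data frame, including a date-time or character column, is passed to zoo(), the data are coerced to a character matrix, which leaves na.approx without numeric values and would be consistent with the coercion warning. The base R sketch below uses a hypothetical two-row data frame df to show that coercion; keeping only the numeric columns avoids it.

```r
# Hypothetical stand-in for the question's data frame (not the real MTS01m)
df <- data.frame(
  Time  = as.POSIXct(c("2012-01-21 18:41:50", "2012-01-21 18:46:43"), tz = "UTC"),
  humid = c(47.7, 44.5)
)

# Converting all columns at once forces a common type: character
m_all <- as.matrix(df)

# Converting only the numeric column keeps the values numeric
m_num <- as.matrix(df[, "humid", drop = FALSE])

typeof(m_all)
typeof(m_num)
```

So when building the zoo object, it is probably safer to pass only the measurement columns as data and the time column as order.by.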

Note: Since the question did not provide the data in reproducible form we do so here:

Lines <- "Id date Time humid humtemp prtemp press t1
1   2012-01-21 18:41:50     47.7    14.12   13.870  1005.70     -0.05277778
1   2012-01-21 18:46:43     44.5    15.37   15.100  1005.20     0.02861111
1   2012-01-21 18:51:35     43.2    15.88   15.576  1005.10     0.10972222
1   2012-01-21 18:56:28     42.5    16.17   15.833  1004.90     0.19111111
1   2012-01-21 19:01:21     42.2    16.31   15.986  1004.80     0.27250000
1   2012-01-21 19:06:14     41.8    16.47   16.118  1004.60     0.35388889
1   2012-01-21 19:11:07     41.6    16.51   16.177  1004.60     0.43527778"

library(zoo)
z <- read.zoo(text = Lines, header = TRUE, index = 2:3, tz = "")
st <- as.POSIXct("2012-01-21 18:45")
g <- seq(st, end(z), by = "15 min") # grid
na.approx(z, xout = g)

giving:

                    Id    humid  humtemp   prtemp    press            t1
2012-01-21 18:45:00  1 45.62491 14.93058 14.66761 1005.376 -1.501706e-09
2012-01-21 19:00:00  1 42.28294 16.27130 15.94370 1004.828  2.500000e-01
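
The question asks for 5-minute steps, while the grid above uses 15 minutes. A minimal sketch of the same call on a 5-minute grid follows; the names g5 and res5 are illustrative, and tz = "UTC" is an assumption made here so the result does not depend on the local time zone.

```r
library(zoo)

Lines <- "Id date Time humid humtemp prtemp press t1
1 2012-01-21 18:41:50 47.7 14.12 13.870 1005.70 -0.05277778
1 2012-01-21 18:46:43 44.5 15.37 15.100 1005.20 0.02861111
1 2012-01-21 18:51:35 43.2 15.88 15.576 1005.10 0.10972222
1 2012-01-21 18:56:28 42.5 16.17 15.833 1004.90 0.19111111
1 2012-01-21 19:01:21 42.2 16.31 15.986 1004.80 0.27250000
1 2012-01-21 19:06:14 41.8 16.47 16.118 1004.60 0.35388889
1 2012-01-21 19:11:07 41.6 16.51 16.177 1004.60 0.43527778"

# Columns 2 and 3 (date and Time) together form the time index
z <- read.zoo(text = Lines, header = TRUE, index = 2:3, tz = "UTC")

# 5-minute grid from 18:45:00 up to the last observation
g5 <- seq(as.POSIXct("2012-01-21 18:45:00", tz = "UTC"), end(z), by = "5 min")

# Linear interpolation of every column at the grid times
res5 <- na.approx(z, xout = g5)
res5
```

The first row of res5 should agree with the 18:45:00 row shown above, since both are interpolated from the same two neighbouring observations.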
G. Grothendieck answered Nov 20 '22 00:11


You can break the process into three steps:

  1. Build a sequence based on the data ranges.
  2. Merge the sequence and the data.
  3. Interpolate the values: constant or linear method.

Creating the dataset:

data1 <- read.table(text="1   2012-01-21 18:41:50     47.7    14.12   13.870  1005.70     -0.05277778
1   2012-01-21 18:46:43     44.5    15.37   15.100  1005.20     0.02861111
1   2012-01-21 18:51:35     43.2    15.88   15.576  1005.10     0.10972222
1   2012-01-21 18:56:28     42.5    16.17   15.833  1004.90     0.19111111
1   2012-01-21 19:01:21     42.2    16.31   15.986  1004.80     0.27250000
1   2012-01-21 19:06:14     41.8    16.47   16.118  1004.60     0.35388889
1   2012-01-21 19:11:07     41.6    16.51   16.177  1004.60     0.43527778",
 col.names=c("Id","date","Time","humid","humtemp","prtemp","press","t1"))
# Combine date and time into a single POSIXct column
data1$datetime <- as.POSIXct(paste(data1$date, data1$Time), format = "%Y-%m-%d %H:%M:%S")

Load the zoo library:

library(zoo)

Step 1:

# Empty zoo series on a regular grid with a 5-second interval
seq1 <- zoo(order.by = seq(min(data1$datetime), max(data1$datetime), by = 5))

Step 2:

mer1 <- merge(zoo(x=data1[4:7],order.by=data1$datetime), seq1)

Step 3:

#Constant interpolation
dataC <- na.approx(mer1, method="constant")

#Linear interpolation
dataL <- na.approx(mer1)

Inspecting the results:

head(dataC)
                    humid humtemp prtemp  press
2012-01-21 18:41:50  47.7   14.12  13.87 1005.7
2012-01-21 18:41:55  47.7   14.12  13.87 1005.7
2012-01-21 18:42:00  47.7   14.12  13.87 1005.7
2012-01-21 18:42:05  47.7   14.12  13.87 1005.7
2012-01-21 18:42:10  47.7   14.12  13.87 1005.7
2012-01-21 18:42:15  47.7   14.12  13.87 1005.7

head(dataL)
                       humid  humtemp   prtemp    press
2012-01-21 18:41:50 47.70000 14.12000 13.87000 1005.700
2012-01-21 18:41:55 47.64539 14.14133 13.89099 1005.691
2012-01-21 18:42:00 47.59078 14.16266 13.91198 1005.683
2012-01-21 18:42:05 47.53618 14.18399 13.93297 1005.674
2012-01-21 18:42:10 47.48157 14.20532 13.95396 1005.666
2012-01-21 18:42:15 47.42696 14.22666 13.97495 1005.657 
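
If only the target times are needed, the dense 5-second grid can be skipped. The sketch below is a base R variant of the same idea (not the code used above): it calls approx() directly at a 5-minute grid. The names grid, vars and result are illustrative, and tz = "UTC" is an assumption made here to keep the example independent of the local time zone.

```r
data1 <- read.table(text = "1 2012-01-21 18:41:50 47.7 14.12 13.870 1005.70 -0.05277778
1 2012-01-21 18:46:43 44.5 15.37 15.100 1005.20 0.02861111
1 2012-01-21 18:51:35 43.2 15.88 15.576 1005.10 0.10972222
1 2012-01-21 18:56:28 42.5 16.17 15.833 1004.90 0.19111111
1 2012-01-21 19:01:21 42.2 16.31 15.986 1004.80 0.27250000
1 2012-01-21 19:06:14 41.8 16.47 16.118 1004.60 0.35388889
1 2012-01-21 19:11:07 41.6 16.51 16.177 1004.60 0.43527778",
  col.names = c("Id", "date", "Time", "humid", "humtemp", "prtemp", "press", "t1"))
data1$datetime <- as.POSIXct(paste(data1$date, data1$Time), tz = "UTC")

# Target times: every 5 minutes from 18:45:00 up to the last observation
grid <- seq(as.POSIXct("2012-01-21 18:45:00", tz = "UTC"),
            max(data1$datetime), by = "5 min")

# Linearly interpolate each measurement column at the target times
vars <- c("humid", "humtemp", "prtemp", "press")
interp <- sapply(data1[vars], function(v) {
  approx(x = as.numeric(data1$datetime), y = v, xout = as.numeric(grid))$y
})

result <- data.frame(Time = grid, interp)
result
```

Passing method = "constant" to approx() would give the step-wise variant instead of the linear one.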
angelous answered Nov 20 '22 01:11