
Lag value with dates

Tags: r, data.table, lag

I am studying the price of a product over time. I have daily data in which some days are missing at random.

Here is a minimal example in which the data for 4 January is missing:

library(lubridate)
library(data.table)

mockData <- data.table(timeStamp=c(ymd("20180101"), ymd("20180102"), ymd("20180103"), ymd("20180105")),
                       price=c(10,15,12,11))

I want to add the lagged price to my data.table, but if the previous day is missing I want an NA instead of the price from the closest day that has data.

To illustrate, if I use the shift function:

mockData[, lag_price:=shift(price,type="lag")]

I get:

structure(list(timeStamp = structure(c(17532, 17533, 17534, 17536), class = "Date"),
               price = c(10, 15, 12, 11),
               lag_price = c(NA, 10, 15, 12)),
          row.names = c(NA, -4L), class = c("data.table", "data.frame"))

But what I really want is this:

structure(list(timeStamp = structure(c(17532, 17533, 17534, 17536), class = "Date"),
               price = c(10, 15, 12, 11),
               lag_price = c(NA, 10, 15, NA)),
          row.names = c(NA, -4L), class = c("data.table", "data.frame"))

I feel more comfortable using data.table, but I will work with data.frame, dplyr and the tidyverse if required.

LocoGris asked Jun 10 '26 13:06


2 Answers

You could wrap the shift in an ifelse that checks whether the previous row is exactly one day earlier:

mockData[, lag_price := ifelse(timeStamp - shift(timeStamp) == 1, shift(price), NA)]
#    timeStamp price lag_price
#1: 2018-01-01    10        NA
#2: 2018-01-02    15        10
#3: 2018-01-03    12        15
#4: 2018-01-05    11        NA
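
To see what the condition is doing, the check can be split into two steps. This is only a sketch: `gap` is an illustrative column name that is not in the original answer, and `fifelse` is swapped in for `ifelse` on the assumption that a recent data.table (1.12.4 or later) is installed.

```r
library(data.table)

mockData <- data.table(
  timeStamp = as.Date(c("2018-01-01", "2018-01-02", "2018-01-03", "2018-01-05")),
  price     = c(10, 15, 12, 11)
)

# gap (illustrative name): days elapsed since the previous row; NA for the first row
mockData[, gap := as.numeric(timeStamp - shift(timeStamp), units = "days")]

# Keep the lagged price only where the previous row is exactly one day earlier;
# rows where gap is NA or greater than 1 end up as NA
mockData[, lag_price := fifelse(gap == 1, shift(price), NA_real_)]
```

Keeping `gap` as its own column also makes it easy to spot where the holes in the series are.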
Maurits Evers answered Jun 12 '26 11:06


mockData[, v := 
  data.table(timeStamp = timeStamp + 1, price)[.SD, on=.(timeStamp), x.price]
]

    timeStamp price  v
1: 2018-01-01    10 NA
2: 2018-01-02    15 10
3: 2018-01-03    12 15
4: 2018-01-05    11 NA

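This builds a helper table with (timeStamp + 1, price), i.e. each price re-dated to the following day, and uses it in an update join. Each row picks up the price recorded on the previous calendar day, or NA when that day is absent.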
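
The same idea can be unpacked into separate steps, which may be easier to follow. This is a sketch rather than the answer's exact code: `shifted`, `prev_price` and `lag_price` are illustrative names.

```r
library(data.table)

mockData <- data.table(
  timeStamp = as.Date(c("2018-01-01", "2018-01-02", "2018-01-03", "2018-01-05")),
  price     = c(10, 15, 12, 11)
)

# shifted (illustrative name): each price re-dated to the day after it was observed
shifted <- mockData[, .(timeStamp = timeStamp + 1, prev_price = price)]

# Update join: rows of mockData whose date appears in shifted receive that price;
# rows with no match (no observation on the previous day) are left as NA
mockData[shifted, on = .(timeStamp), lag_price := i.prev_price]
```

Because the match is on the calendar date rather than on row position, this does not rely on the rows being sorted, and changing `+ 1` to `+ 7` would give a one-week lag under the same logic.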

Frank answered Jun 12 '26 11:06


