Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R data.table interval

DT is data.table with POSIXct column 'date_and_time'. How do I use data.table syntax to add a 3 levels factor column indicating to which shift each 'date_and_time' instance belong?

DT <- data.table(date_and_time = as.POSIXct(
  c("1997-07-01 23:00",
    "1999-01-01 02:05",
    "2006-01-01 12:00",
    "2009-01-01 22:30",
    "2012-07-01 03:59"), required.components = 5L)
  )
shifts_delimiter = c("00:00", "08:00", "16:00")

required outcome:

> DT
         date_and_time shift
1: 1997-07-01 23:00:00     3
2: 1999-01-01 02:05:00     1
3: 2006-01-01 12:00:00     2
4: 2009-01-01 22:30:00     3
5: 2012-07-01 03:59:00     1
like image 862
Amitai Avatar asked Dec 02 '25 07:12

Amitai


1 Answers

You could use the ITime class that the data.table package provides combined with findInterval

shifts_delimiter = c("00:00", "08:00", "16:00")
DT[, Shift := findInterval(as.ITime(date_and_time), as.ITime(shifts_delimiter))]
DT
#          date_and_time Shift
# 1: 1997-07-01 23:00:00     3
# 2: 1999-01-01 02:05:00     1
# 3: 2006-01-01 12:00:00     2
# 4: 2009-01-01 22:30:00     3
# 5: 2012-07-01 03:59:00     1

Another option is to use integers as an ordered interval and use the built in hour function with findInterval

shifts_delimiter = c(8L, 16L, 24L)
DT[, Shift2 := findInterval(hour(date_and_time), shifts_delimiter) + 1L]
DT
#          date_and_time Shift Shift2
# 1: 1997-07-01 23:00:00     3      3
# 2: 1999-01-01 02:05:00     1      1
# 3: 2006-01-01 12:00:00     2      2
# 4: 2009-01-01 22:30:00     3      3
# 5: 2012-07-01 03:59:00     1      1
like image 100
David Arenburg Avatar answered Dec 04 '25 21:12

David Arenburg