Round time by X hours in R?

While doing predictive modeling on timestamped data, I want to write a function in R (possibly using data.table) that rounds a timestamp to the nearest X hours. E.g. rounding by 2 hours should give this:

"2014-12-28 22:59:00 EDT" becomes "2014-12-28 22:00:00 EDT" 
"2014-12-28 23:01:00 EDT" becomes "2014-12-29 00:00:00 EDT" 

It's very easy to do when you round by 1 hour, using the round.POSIXt(.date, "hour") function.
Writing a generic function with multiple if statements, as I'm doing below, becomes quite ugly however:

d7.dateRoundByHour <- function (.date, byHours) { 

  if (byHours == 1)
    return (round.POSIXt(.date, "hour"))

  hh = hour(.date); dd = mday(.date); mm = month(.date); yy = year(.date)    
  hh = round(hh/byHours,digits=0) * byHours
  if (hh>=24) { 
    hh=0; dd=dd+1 
  }
  if ((mm==2 & dd==29) | 
      (mm %in% c(1,3,5,7,8,10,12) & dd==32) | 
      (mm %in% c(4,6,9,11) & dd==31)) {  # day ran past month end. NB: it won't work on 29 Feb leap year. 
    dd=1; mm=mm+1
  }
  if (mm==13) {
    mm=1; yy=yy+1
  }
  str = sprintf("%i-%02.0f-%02.0f %02.0f:%02.0f:%02.0f EDT", yy,mm,dd, hh,0,0)
  as.POSIXct(str, format="%Y-%m-%d %H:%M:%S") 
}

Can anyone show a better way to do this?
(Perhaps by converting to numeric and back to POSIXt, or by using some other POSIXt functions?)

IVIM asked Dec 23 '22 20:12


2 Answers

Use the round_date function from the lubridate package. Assuming you had a data.table with a column named date you could do the following:

dt[, date := round_date(date, '2 hours')]

A quick example will give you exactly the results you were looking for:

library(lubridate)  # provides round_date()
x <- as.POSIXct("2014-12-28 22:59:00 EDT")
round_date(x, '2 hours')
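
As a minimal sketch, assuming a reasonably recent lubridate is installed (multi-unit strings such as '2 hours' are not available in very old releases), the same call can be checked against both timestamps from the question. The time zone is pinned to UTC here purely so the output is reproducible; that choice is an assumption, not something taken from the question.

```r
library(lubridate)

# Both sample timestamps from the question; tz = "UTC" is an assumption
# made only so the result does not depend on the local time zone.
x <- as.POSIXct(c("2014-12-28 22:59:00", "2014-12-28 23:01:00"), tz = "UTC")

rounded <- round_date(x, "2 hours")
format(rounded, "%Y-%m-%d %H:%M:%S")
# expected: "2014-12-28 22:00:00" "2014-12-29 00:00:00"
```

The first value rounds down and the second rolls over to the next day, with no manual day, month, or year handling.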

GarAust89 answered Dec 27 '22 10:12


This is actually really easy with just base R. The basic idea for rounding by "odd lots" is that you

  • scale down by an appropriate scale factor
  • round to the nearest integer in the downscaled unit
  • scale back up and re-convert

Or in two R code statements:

R> pt <- as.POSIXct(c("2014-12-28 22:59:00", "2014-12-28 23:01:00 EDT"))
R> pt   # just to check
[1] "2014-12-28 22:59:00 CST" "2014-12-28 23:01:00 CST"
R> 
R> scalefactor <- 60*60*2   # 2 hours of 60 minutes times 60 seconds
R> 
R> as.POSIXct(round(as.numeric(pt)/scalefactor) * scalefactor, origin="1970-01-01")
[1] "2014-12-28 22:00:00 CST" "2014-12-29 00:00:00 CST"
R> 

The key last line does just what I outlined: it converts the POSIXct to a numeric representation, scales it down, rounds it, then scales it back up and converts it to a POSIXct again.
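
The same idea can be wrapped in a small reusable function. This is only a sketch: the name round_by_hours and its arguments are hypothetical, not part of base R or of the code above, and the UTC time zone in the usage lines is chosen just to make the result reproducible.

```r
# Hypothetical helper: round POSIXct values to the nearest multiple of
# `by_hours` hours, using the scale-down / round / scale-up idea.
round_by_hours <- function(x, by_hours) {
  stopifnot(inherits(x, "POSIXct"), is.numeric(by_hours), by_hours > 0)
  scalefactor <- 60 * 60 * by_hours        # seconds in one rounding step
  secs <- round(as.numeric(x) / scalefactor) * scalefactor
  tz <- attr(x, "tzone")                   # keep the input's time zone, if any
  if (is.null(tz)) tz <- ""
  as.POSIXct(secs, origin = "1970-01-01", tz = tz)
}

# Usage with the two timestamps from the question (tz = "UTC" is an assumption)
pt <- as.POSIXct(c("2014-12-28 22:59:00", "2014-12-28 23:01:00"), tz = "UTC")
format(round_by_hours(pt, 2), "%Y-%m-%d %H:%M:%S")
# expected: "2014-12-28 22:00:00" "2014-12-29 00:00:00"
```

Because the arithmetic runs on seconds since the epoch, the rounding boundaries are aligned to UTC. In a time zone whose UTC offset is not a multiple of the step (for example UTC-5 with a 2-hour step), the results land on odd local hours, so the offset would need to be handled separately.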

Dirk Eddelbuettel answered Dec 27 '22 10:12