
How to convert in both directions between year,month,day and dates in R?

I know this can be done via strings, but I would prefer to avoid string conversion, partly because there may be a performance hit, and partly because I worry about regional conventions, where some of the world writes "year-month-day" and some writes "year-day-month".

It looks like ISOdate handles the direction year,month,day -> date-time, although it first converts the numbers to a string, so if there is a way that doesn't go via a string I would prefer it.

I couldn't find anything that goes the other way, from date-times to numerical values. I would prefer not to need strsplit or things like that.

Edit: just to be clear, what I have is a data frame that looks like:

year month day hour somevalue
2004 1     1   1   1515353
2004 1     1   2   3513535
....

I want to be able to freely convert to this format:

time(hour units) somevalue
1             1515353
2             3513535
....

... and also be able to go back again.

Edit: to clear up some confusion about what 'time' (hour units) means, this is ultimately what I did, using information from How to find the difference between two dates in hours in R?:

forwards direction:

lh$time <- as.numeric( difftime(ISOdate(lh$year,lh$month,lh$day,lh$hour), ISOdate(2004,1,1,0), units="hours"))
lh$year <- NULL; lh$month <- NULL; lh$day <- NULL; lh$hour <- NULL

backwards direction:

... well, I didn't do the backwards direction yet, but I imagine something like:

  • create difftime object out of lh$time (somehow...)
  • add ISOdate(2004,1,1,0) to difftime object
  • use one of the solutions below to get the year, month, day, hour back
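
The three steps above could be sketched roughly as follows. This is only an illustration, not tested against the real data: the values in lh are made up, the origin is assumed to be the same ISOdate(2004,1,1,0) used in the forwards direction, and everything is assumed to be in GMT.

```r
# Hypothetical sample data: hours elapsed since 2004-01-01 00:00 GMT
lh <- data.frame(time = c(1, 2, 25), somevalue = c(1515353, 3513535, 42))

# Step 1: turn the numeric hours into a difftime object
offset <- as.difftime(lh$time, units = "hours")

# Step 2: add it to the same origin used in the forwards direction
stamp <- ISOdate(2004, 1, 1, 0) + offset

# Step 3: read the fields back as numbers via POSIXlt (no strings involved)
parts <- as.POSIXlt(stamp, tz = "GMT")
lh$year  <- parts$year + 1900   # POSIXlt stores years since 1900
lh$month <- parts$mon + 1       # POSIXlt months are 0-based
lh$day   <- parts$mday
lh$hour  <- parts$hour
lh$time  <- NULL
```

With the made-up rows, the third entry (25 hours) should land on hour 1 of the second day, which is a quick way to check that the day rollover works.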

I suppose in future I could ask about the exact problem I'm trying to solve. I was trying to factor my specific problem into generic, reusable questions, but maybe that was a mistake?

Hugh Perkins asked Oct 19 '12 14:10


1 Answer

Because dates can arrive from files, databases and so on in many forms, differing in field order or separators as you mention, representing the input date as a character string is a convenient and useful solution. R doesn't hold the actual dates as strings, and you don't need to process them as strings to work with them.
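
As a small illustration of that last point, the sketch below strips the class from a Date and a POSIXct value to show the numbers stored underneath. The date is arbitrary and the variable names are only illustrative.

```r
# A Date is stored as the number of days since 1970-01-01.
d <- as.Date("2004-12-21", format = "%Y-%m-%d")  # explicit format, so no guessing at field order
days_since_epoch <- unclass(d)

# A POSIXct date-time is stored as the number of seconds since 1970-01-01 00:00 UTC.
dt <- ISOdate(2004, 12, 21)                      # ISOdate defaults to 12:00 GMT
secs_since_epoch <- as.numeric(dt)

# Both are plain numbers, so date arithmetic needs no string handling.
next_day  <- d + 1       # Date arithmetic is in days
hour_later <- dt + 3600  # POSIXct arithmetic is in seconds
```

Strings only come into play at the edges, when a date is read in or printed.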

Internally R is using the operating system to do these things in a standard way. You don't need to manipulate strings at all - just perhaps convert some things from character to their numerical equivalent. For example, it is quite easy to wrap up both operations (forwards and backwards) in simple functions you can deploy.

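# Forwards: numeric year, month, day -> date-time (POSIXct, 12:00 GMT by default)
toDate <- function(year, month, day) {
    ISOdate(year, month, day)
}

# Backwards: Date or date-time -> list of numeric year, month, day
toNumerics <- function(Date) {
    stopifnot(inherits(Date, c("Date", "POSIXt")))
    # Format each field on its own, then convert it to a number
    day <- as.numeric(strftime(Date, format = "%d"))
    month <- as.numeric(strftime(Date, format = "%m"))
    year <- as.numeric(strftime(Date, format = "%Y"))
    list(year = year, month = month, day = day)
}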

I forgo a single call to strptime() with subsequent splitting on a separator character because you don't like that kind of manipulation.

> toDate(2004, 12, 21)
[1] "2004-12-21 12:00:00 GMT"
> toNumerics(toDate(2004, 12, 21))
$year
[1] 2004

$month
[1] 12

$day
[1] 21

Internally, R's date-time code is well tested and robust, if a bit complex in places because of time zone issues and the like. I find the idiom used in toNumerics() more intuitive than holding a date-time as a list and remembering which elements are 0-based. Building on the functionality provided seems easier than trying to avoid string conversions altogether.
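
For comparison, the list-based alternative might look something like the sketch below. toNumericsLt is a hypothetical name, not part of the functions above, and the GMT default is an assumption chosen to match what ISOdate() produces.

```r
# Hypothetical variant of toNumerics() that reads the POSIXlt fields directly
toNumericsLt <- function(x, tz = "GMT") {
    stopifnot(inherits(x, c("Date", "POSIXt")))
    lt <- as.POSIXlt(x, tz = tz)
    list(year  = lt$year + 1900,  # stored as years since 1900
         month = lt$mon + 1,      # stored 0-based (0 = January)
         day   = lt$mday)         # stored 1-based
}

res <- toNumericsLt(ISOdate(2004, 12, 21))
```

This avoids strings entirely, at the cost of having to remember the two offsets in the comments.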

Gavin Simpson answered Sep 22 '22 02:09