
How to handle mixed precision dates using R as.POSIXct

Tags:

r

I have a vector of dates with mixed precision. Most have the format "1930-02-06T10:00:00", but a few have the format "2130-02-06", which I want to treat as 2130-02-06T00:00:00.

When I use

df$date <- as.POSIXct(df$date,tz=Sys.timezone())

I lose the times from the data because some of the datetimes have no time component. I can write a little conversion routine

fixDateTime <- function (s) {
  if(nchar(s) == 10) {
    return (paste(s, "00:00:00"));
  } else {
    return (str_replace(s,"T", " "));
  }
}

and then do

df$DATET <- as.POSIXct(fixDateTime(df$date),tz=Sys.timezone())

But that doesn't work, because fixDateTime is actually given a whole vector rather than a single string, and I don't know how to adapt it for that. I'm not sure which way to try to solve this (and I'm sure this shows how new I am to R).

Grahame Grieve asked Mar 03 '23 04:03


1 Answer

You can keep your fixDateTime function if you use ifelse, which is vectorised, instead of if/else, which only tests a single (scalar) condition. Keeping everything in base R, we can do

fixDateTime <- function (s) {
  ifelse(nchar(s) == 10, paste(s, "00:00:00"), sub("T", " ", s))
}

and then use it in as.POSIXct

as.POSIXct(fixDateTime(x), tz = "UTC")
#[1] "1930-02-06 10:00:00 UTC" "2130-02-06 00:00:00 UTC"
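
To see why this works where the if/else version did not, it can help to look at the intermediate values. This is only a sketch; the names x_demo, lens, is_date_only and fixed are made up for illustration and are not part of the question's data.

```r
# Illustrative input (assumed sample, mirrors the two formats in the question)
x_demo <- c("1930-02-06T10:00:00", "2130-02-06")

# nchar() is vectorised: it returns one length per element
lens <- nchar(x_demo)

# The comparison therefore yields one TRUE/FALSE per element
is_date_only <- lens == 10

# ifelse() picks, element by element, from the "yes" or the "no" vector
fixed <- ifelse(is_date_only,
                paste(x_demo, "00:00:00"),
                sub("T", " ", x_demo))
print(fixed)
```

Note that ifelse drops attributes such as the POSIXct class, so the conversion with as.POSIXct is applied to its result rather than inside it.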

data

x <- c("1930-02-06T10:00:00", "2130-02-06")
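
An alternative, if you would rather not edit the strings at all, is to parse in two passes: first with the full datetime format, then fill in whatever came back as NA using the date-only format. This is a minimal base R sketch, not a tested general solution; parse_mixed and dates_demo are hypothetical names, and it assumes only these two formats occur.

```r
# Hypothetical helper: parse strings that are either
# "YYYY-MM-DDTHH:MM:SS" or "YYYY-MM-DD" (assumed to be the only two formats)
parse_mixed <- function(s, tz = "UTC") {
  # First pass: full datetime; date-only strings do not match and become NA
  out <- as.POSIXct(s, format = "%Y-%m-%dT%H:%M:%S", tz = tz)

  # Second pass: re-parse only the failures as plain dates (midnight in tz)
  date_only <- is.na(out)
  out[date_only] <- as.POSIXct(s[date_only], format = "%Y-%m-%d", tz = tz)
  out
}

dates_demo <- c("1930-02-06T10:00:00", "2130-02-06")
parsed <- parse_mixed(dates_demo)
print(parsed)
```

Anything that matches neither format stays NA, which makes malformed entries easy to spot with is.na(parsed).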

Ronak Shah answered Mar 22 '23 23:03