Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Changing date time format in R

Tags:

r

I have a table with two columns "start" and "end" containing both dates and times of the respective start and end period as follows:

Sr. No.    Start                 End
1          22May2001:00:00:00    27May2001:23:59:59
2          28May2001:00:00:00    26Jun2001:23:59:59

I would like to convert above date time in the following format (ISO8601 with time stamp):

Sr. No.    Start                 End
1          2001-05-22 00:00:00   2001-05-27 23:59:59
2          2001-05-28 00:00:00   2001-06-26 23:59:59

I have used the code available at this link: http://www.stat.berkeley.edu/~s133/dates.html

View(my_table)
str(my_table)
my_table$startD <- as.Date(my_table$start, "%d%b%Y:%H:%M:%S")
my_table$startT <- strptime(my_table$start, format = "%d%b%Y:%H:%M:%S")

So far, my attempt gave me two columns like this:

StartD       StartT
2001-05-22   2001-05-22

Which is not desirable. Could someone please suggest me to convert the date time in desired format through above or any alternate approach?

like image 721
khajlk Avatar asked Jul 26 '16 15:07

khajlk


2 Answers

In answer form for clarity, you need a datetime class, which in R means either POSIXct or POSIXlt. Usually we use as.POSIXct and strptime for parsing strings into each class, respectively (as.POSIXlt exists, but rarely gets used), though there are lubridate alternatives if you like.

At its most basic,

my_table$Start <- as.POSIXct(my_table$Start, format = '%d%b%Y:%H:%M:%S')
my_table$End <- as.POSIXct(my_table$End, format = '%d%b%Y:%H:%M:%S')

my_table
##   Sr.No.      Start                 End
## 1      1 2001-05-22 2001-05-27 23:59:59
## 2      2 2001-05-28 2001-06-26 23:59:59

Note you need to specify the name of the format string, as the second parameter of as.POSIXct is actually tz (for setting the time zone). Also note that while Start looks like it's missing a time, that's because the print methods for POSIX*t don't print times at midnight, though they are still stored.

If you'd like to change both in a single line, you could use

my_table[-1] <- lapply(my_table[,-1], as.POSIXct, format = '%d%b%Y:%H:%M:%S')

or in dplyr (which prefers POSIXct over POSIXlt):

library(dplyr)

my_table %>% mutate_at(-1, as.POSIXct, format = '%d%b%Y:%H:%M:%S')

both of which return exactly the same thing. You could also use lubridate::dmy_hms, which parses to POSIXct:

library(lubridate)

my_table$Start <- dmy_hms(my_table$Start)    # or lapply like above
my_table$End <- dmy_hms(my_table$End)

# or dplyr
my_table %>% mutate_at(-1, dmy_hms)

which also return the same thing.


Data

my_table <- structure(list(Sr.No. = 1:2, Start = structure(1:2, .Label = c("22May2001:00:00:00", 
    "28May2001:00:00:00"), class = "factor"), End = structure(c(2L, 
    1L), .Label = c("26Jun2001:23:59:59", "27May2001:23:59:59"), class = "factor")), .Names = c("Sr.No.", 
    "Start", "End"), class = "data.frame", row.names = c(NA, -2L))
like image 52
alistaire Avatar answered Sep 19 '22 23:09

alistaire


Hope this helps.

my_table <- "22May2001:22:02:50"

my_table <- strptime(as.character(my_table), "%d%b%Y:%H:%M:%S")

my_table <- format(my_table, "%Y-%m-%d %H:%M:%S")

str(my_table)
like image 45
Trigs Avatar answered Sep 20 '22 23:09

Trigs