Extract time from a character vector with multiple timezones in R

Tags: r

I have some data with the date, time and timezone as a character vector:

times <- c("2020:03:04 13:31:45+11:00", "2020:03:06 13:28:45+11:00")

I want to convert these to a vector of POSIXct objects without having to manually input the timezone. If I use lubridate to create the POSIXct objects, the times are converted to UTC, so I no longer get the local time:

library(lubridate)
ymd_hms(times)
[1] "2020-03-04 02:31:45 UTC" "2020-03-06 02:28:45 UTC"

This can be corrected by passing the tz argument explicitly:

ymd_hms(times, tz = 'Australia/Sydney')
[1] "2020-03-04 13:31:45 AEDT" "2020-03-06 13:28:45 AEDT"
gsub(".*[+-]", "", times)
[1] "11:00" "11:00"

However, I want to automate the process for datasets with multiple timezones such as:

times <- c("2020:03:04 13:31:45+11:00", "2020:03:06 13:28:45-06:00")

My current attempt at a workaround is to extract the timezone offset and add that value back to the POSIXct vector, but I haven't been able to retrieve the +/- symbol:

sub(".*[+-]", "\\1", times)
[1] "11:00" "06:00"
Chris asked Oct 15 '22 05:10


1 Answer

One trick is to extract the offset and the time value separately, then subtract the offset from the time so that every element ends up as the correct instant in UTC.

library(lubridate)

times <- c("2020:03:04 13:31:45+11:00", "2020:03:06 13:28:45-06:00")

offset <- sub(".*([+-].*)", "\\1", times)
offset
#[1] "+11:00" "-06:00"

only_times <- sub('[+-].*', '', times)
only_times
#[1] "2020:03:04 13:31:45" "2020:03:06 13:28:45"

ymd_hms(only_times) - hm(offset)
#[1] "2020-03-04 02:31:45 UTC" "2020-03-06 19:28:45 UTC"
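
The same idea can be sketched in base R, in case a lubridate-free version is useful. This is only an illustration, not the one right way to do it: it assumes every string ends in an offset written as +HH:MM or -HH:MM, and the names wall, sign, hh, mm, offset_secs and utc are made up for this example.

```r
times <- c("2020:03:04 13:31:45+11:00", "2020:03:06 13:28:45-06:00")

# Assumption: the offset is always the last six characters, e.g. "+11:00".
# Anchoring the pattern at the end of the string keeps the sign.
offset <- sub(".*([+-][0-9]{2}:[0-9]{2})$", "\\1", times)
wall   <- sub("[+-][0-9]{2}:[0-9]{2}$", "", times)

# Turn "+11:00" / "-06:00" into a signed number of seconds.
sign <- ifelse(startsWith(offset, "-"), -1, 1)
hh   <- as.integer(substr(offset, 2, 3))
mm   <- as.integer(substr(offset, 5, 6))
offset_secs <- sign * (hh * 3600 + mm * 60)

# Parse the wall-clock part as if it were UTC, then remove the offset
# to get the true instant in UTC.
utc <- as.POSIXct(wall, format = "%Y:%m:%d %H:%M:%S", tz = "UTC") - offset_secs

format(utc, "%Y-%m-%d %H:%M:%S", tz = "UTC")
#[1] "2020-03-04 02:31:45" "2020-03-06 19:28:45"
```

Note that a POSIXct vector carries a single tz attribute, so the elements cannot each display in their own local zone within one vector. If the original wall-clock values are needed, wall (or only_times above) already holds them, and offset_secs records how far each one is from UTC.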
Ronak Shah answered Oct 18 '22 14:10