
Calculate character string "days, hours, minutes, seconds" to numeric total days [duplicate]

Tags: time, r, lubridate

I have seen a lot of questions relating to formatting times, but none in the particular imported format that I have:

Time <- c(
"22 hours 3 minutes 22 seconds", 
"170 hours 15 minutes 20 seconds", 
"39 seconds", 
"2 days 6 hours 44 minutes 17 seconds", 
"9 hours 54 minutes 36 seconds", 
"357 hours 23 minutes 28 seconds", 
"464 hours 30 minutes 7 seconds", 
"51 seconds", 
"31 hours 39 minutes 2 seconds", 
"355 hours 29 minutes 10 seconds")

Some times contain only "seconds", others "minutes and seconds", "days, hours, minutes and seconds", "days and seconds", etc. There are also NA values that I need to keep. How can I convert this character vector to numeric total days, i.e., sum the days, hours, minutes and seconds of each element?

For example, output in this form (illustrative values):

Time
8.10
19.3
0.68
2.28
48.1
0.00
0.70
0.1
3.2
13.9

Thank you!

EDIT

Old question, but a simple lubridate call does the trick now:

library(lubridate)  # %>% additionally needs magrittr (or dplyr)
(period_to_seconds(period(Time)) / 86400) %>% round(2)
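
As a quick sanity check on the division by 86400 (the number of seconds in a day), the fourth element can be converted by hand. This is only an illustrative base R sketch; the variable name `secs` is made up for the example.

```r
# "2 days 6 hours 44 minutes 17 seconds" expressed in seconds
secs <- 2 * 86400 + 6 * 3600 + 44 * 60 + 17   # 197057
# seconds -> days, rounded as in the call above
round(secs / 86400, 2)                        # 2.28
```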

This also works with no packages beyond magrittr, which is only needed for the %>% pipe:

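library(magrittr)  # only for %>%

# For each unit, pull out the number in front of it (0 if the unit is absent)
# and divide by the number of such units per day.
# Note: grepl() is FALSE for NA, so NA inputs come out as 0 here.
Time_vec <- mapply(function(tt, to_days) {
    ifelse(grepl(tt, Time), gsub(paste0("^.*?(\\d+) ", tt, ".*$"), "\\1", Time), 0) %>%
      as.numeric() / to_days
  },
  c("day", "hour", "minute", "second"),
  c(1, 24, 1440, 86400)
) %>%
  apply(1, sum) %>%   # add the four per-unit columns row by row
  round(2)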

In my actual data, only one value differed from the lubridate solution: 0.96 vs 0.97.

Tunn asked Dec 25 '22 09:12


1 Answer

Again without packages, using a little regex:

Time <- c(
  "22 hours 3 minutes 22 seconds", 
  "170 hours 15 minutes 20 seconds", 
  "39 seconds", 
  "6 hours 44 minutes 17 seconds", 
  "9 hours 54 minutes 36 seconds", 
  "357 hours 23 minutes 28 seconds", 
  "464 hours 30 minutes 7 seconds", 
  "51 seconds", 
  "31 hours 39 minutes 2 seconds", 
  "355 hours 29 minutes 10 seconds")

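# Optional groups for hours, minutes and seconds (days are not covered)
pat <- '(?:(\\d+) hours )?(?:(\\d+) minutes )?(?:(\\d+) seconds)?'
m <- regexpr(pat, Time, perl = TRUE)

# Start position and length of each capture group, one column per group
m_st <- attr(m, 'capture.start')
m_ln <- attr(m, 'capture.length')

# Extract each group column by column; an unmatched group yields an
# empty substring, which as.numeric() turns into NA
(mm <- mapply(function(x, y) as.numeric(substr(Time, x, y)),
              data.frame(m_st), data.frame(m_st + m_ln - 1)))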

(dd <- setNames(data.frame(mm), c('h','m','s')))
#      h  m  s
# 1   22  3 22
# 2  170 15 20
# 3   NA NA 39
# 4    6 44 17
# 5    9 54 36
# 6  357 23 28
# 7  464 30  7
# 8   NA NA 51
# 9   31 39  2
# 10 355 29 10

round(rowSums(dd / data.frame(h = rep(24, nrow(dd)), m = 24 * 60, s = 24 * 60 * 60),
        na.rm = TRUE), 3)
# [1]  0.919  7.094  0.000  0.281  0.413 14.891 19.354  0.001  1.319 14.812
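
The pattern above covers hours, minutes and seconds only, and the question also asks for "days" and for NA values to be kept. One way this could be extended, as a rough sketch rather than part of the original answer: look up each unit separately and add the pieces. The helper name `to_days` and the sample vector `Time2` are hypothetical, and the code assumes every number is followed by a single space and then its unit name.

```r
# Hypothetical helper: convert strings such as
# "2 days 6 hours 44 minutes 17 seconds" to total days, keeping NA as NA.
to_days <- function(x) {
  # how many of each unit make up one day
  per_day <- c(day = 1, hour = 24, minute = 24 * 60, second = 24 * 60 * 60)
  out <- rep(NA_real_, length(x))   # NA inputs stay NA
  ok <- !is.na(x)
  total <- numeric(sum(ok))
  for (unit in names(per_day)) {
    # digits directly followed by " <unit>" (the lookahead needs perl = TRUE)
    hit <- regexpr(paste0("[0-9]+(?= ", unit, ")"), x[ok], perl = TRUE)
    value <- numeric(sum(ok))       # stays 0 where the unit is absent
    value[hit > 0] <- as.numeric(regmatches(x[ok], hit))
    total <- total + value / per_day[[unit]]
  }
  out[ok] <- total
  out
}

Time2 <- c("2 days 6 hours 44 minutes 17 seconds", "39 seconds", NA)
round(to_days(Time2), 3)
# values: 2.281, 0.000, NA
```

Because each unit is searched for independently, the order of the units and any missing units do not matter, which the single combined pattern cannot offer.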

rawr answered Dec 26 '22 23:12