Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Time data complement with R

Tags:

r

There is the following data frame

> start_time <- c("11:23","16:21","18:57","22:00")
> end_time <- c("11:33","16:23","19:03","22:05")
> vol <- c("red","green","blue","black")
> x <- data.frame(start_time,end_time,vol)
> x
  start_time end_time   vol
1      11:23    11:33   red
2      16:21    16:23 green
3      18:57    19:03  blue
4      22:00    22:05 black

I want to convert this data frame as follows.

> time <- c("11:23","11:24","11:25","16:21","16:22","16:23","16:24","18:57","18:58","18:59","19:00","19:01","22:00","22:01","22:02")
> vol <- c("red","red","red","green","green","green","green","blue","blue","blue","blue","blue","black","black","black")
> y <- data.frame(time,vol)
> y
    time   vol
1  11:23   red
2  11:24   red
3  11:25   red
4  16:21 green
5  16:22 green
6  16:23 green
7  16:24 green
8  18:57  blue
9  18:58  blue
10 18:59  blue
11 19:00  blue
12 19:01  blue
13 22:00 black
14 22:01 black
15 22:02 black

It adds records every 1 minute from start to finish. Is this possible with R? Thank you

like image 846
Tim Ikata Avatar asked May 22 '26 05:05

Tim Ikata


2 Answers

You could convert 'time' columns to POSIXct format, create a 1 mins sequence between start_time and end_time and expand the data.

library(dplyr)

x %>%
  mutate_at(vars(ends_with('_time')), as.POSIXct, format = '%H:%M') %>%
  mutate(time = purrr::map2(start_time, end_time, seq, by = 'min')) %>%
  tidyr::unnest(time) %>%
  mutate(time = format(time, '%H:%M')) %>%
  select(-ends_with('_time'))

# A tibble: 27 x 2
#   vol   time 
#   <chr> <chr>
# 1 red   11:23
# 2 red   11:24
# 3 red   11:25
# 4 red   11:26
# 5 red   11:27
# 6 red   11:28
# 7 red   11:29
# 8 red   11:30
# 9 red   11:31
#10 red   11:32
# … with 17 more rows
like image 92
Ronak Shah Avatar answered May 23 '26 21:05

Ronak Shah


Using expand.grid row-wise. For this columns with numeric start/end_time (i.e. seconds) are helpful. To avoid daylight saving time issues we choose a neutral date like "1970-01-01".

dat[4:5] <- lapply(dat[1:2], function(x) as.numeric(as.POSIXct(paste("1970-01-01", x))))

Then use expand.grid on sequences of 60 seconds of start/end_time and vol on each row, and rbind. Using transform convert back numeric time into readable time.

res <- transform(
  setNames(do.call(rbind, apply(dat, 1, function(x) 
    expand.grid(seq(x[4], x[5], 60), x[3]))), 
    c("time", "vol")),
  time=format(as.POSIXct(time, origin="1970-01-01"), "%H:%M"))
res
#     time   vol
# 1  11:23   red
# 2  11:24   red
# 3  11:25   red
# 4  11:26   red
# 5  11:27   red
# 6  11:28   red
# 7  11:29   red
# 8  11:30   red
# 9  11:31   red
# 10 11:32   red
# 11 11:33   red
# 12 16:21 green
# 13 16:22 green
# 14 16:23 green
# 15 18:57  blue
# 16 18:58  blue
# 17 18:59  blue
# 18 19:00  blue
# 19 19:01  blue
# 20 19:02  blue
# 21 19:03  blue
# 22 22:00 black
# 23 22:01 black
# 24 22:02 black
# 25 22:03 black
# 26 22:04 black
# 27 22:05 black
like image 40
jay.sf Avatar answered May 23 '26 22:05

jay.sf