Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a function for checking if a time interval overlaps in a single column and sort by group in R

Tags:

datetime

time

r

I have a large dataset and I'm trying to find where time intervals overlap by group. To complicate things further I'm hoping that the code could be integrated with the 'dplyr' group_by function so the overlapping times don't get confused with other ids.

I've tried using the overlapping function "int_overlaps(int1, int2)" from "lubridate", but this doesn't work for one column. Any other overlapping functions appear to not work with time intervals.

library(lubridate)

id <- c(1,1,1,2,2)
start <-as.POSIXct(c("2017-06-27 09:30:00","2017-06-27 15:30:00",
                     "2017-06-27 14:30:00","2017-06-28 09:30:00","2017-06-28 15:00:00"),tz= "UTC")
end <-as.POSIXct(c("2017-06-27 10:30:00","2017-06-27 17:30:00",
                   "2017-06-27 18:30:00","2017-06-28 10:30:00","2017-06-28 16:00:00"),tz= "UTC")
inter1<- interval(start,end,tz="UTC")

df <- data.frame(id,inter1)

overlap <- c(FALSE,TRUE,TRUE,FALSE,FALSE)

new_df<-data.frame(id,inter1,overlap)
like image 855
Sparky Avatar asked Nov 25 '25 06:11

Sparky


1 Answers

The sample data doesn't have any overlapping periods. The following change was made:

start <-as.POSIXct(c("2017-06-27 09:30:00","2017-06-27 15:30:00",
                     "2017-06-27 14:30:00","2017-06-28 09:30:00","2017-06-28 15:00:00"), tz= "UTC")

Using lead will return NA if it is the last record in a group

library(dplyr)

new_df %>%
  group_by(id) %>%
  arrange(int_start(inter1), .by_group = TRUE) %>%
  mutate(overlap2 = lead(int_start(inter1)) < int_end(inter1))

# A tibble: 5 x 4
# Groups:   id [2]
     id inter1                                           overlap overlap2
  <dbl> <Interval>                                       <lgl>   <lgl>   
1     1 2017-06-27 09:30:00 UTC--2017-06-27 10:30:00 UTC FALSE   FALSE   
2     1 2017-06-27 14:30:00 UTC--2017-06-28 18:30:00 UTC TRUE    TRUE    
3     1 2017-06-27 15:30:00 UTC--2017-06-27 17:30:00 UTC TRUE    NA      
4     2 2017-06-28 09:30:00 UTC--2017-06-28 10:30:00 UTC FALSE   FALSE   
5     2 2017-06-28 15:00:00 UTC--2017-06-28 16:00:00 UTC FALSE   NA 

If needing to compare each row to all rows within the group

library(tidyverse)

new_df %>%
  group_by(id) %>%
  arrange(int_start(inter1), .by_group = TRUE) %>%
  mutate(overlap2 = map_int(inter1, ~ sum(int_overlaps(.x, inter1))) > 1)

# A tibble: 5 x 4
# Groups:   id [2]
     id inter1                                           overlap overlap2
  <dbl> <Interval>                                       <lgl>   <lgl>   
1     1 2017-06-27 09:30:00 UTC--2017-06-27 10:30:00 UTC FALSE   FALSE   
2     1 2017-06-27 14:30:00 UTC--2017-06-28 18:30:00 UTC TRUE    TRUE    
3     1 2017-06-27 15:30:00 UTC--2017-06-27 17:30:00 UTC TRUE    TRUE    
4     2 2017-06-28 09:30:00 UTC--2017-06-28 10:30:00 UTC FALSE   FALSE   
5     2 2017-06-28 15:00:00 UTC--2017-06-28 16:00:00 UTC FALSE   FALSE 
like image 125
manotheshark Avatar answered Nov 27 '25 21:11

manotheshark



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!