I have a large dataset and I'm trying to find where time intervals overlap by group. To complicate things further I'm hoping that the code could be integrated with the 'dplyr' group_by function so the overlapping times don't get confused with other ids.
I've tried using the overlapping function "int_overlaps(int1, int2)" from "lubridate", but this doesn't work for one column. Any other overlapping functions appear to not work with time intervals.
library(lubridate)
id <- c(1,1,1,2,2)
start <-as.POSIXct(c("2017-06-27 09:30:00","2017-06-27 15:30:00",
"2017-06-27 14:30:00","2017-06-28 09:30:00","2017-06-28 15:00:00"),tz= "UTC")
end <-as.POSIXct(c("2017-06-27 10:30:00","2017-06-27 17:30:00",
"2017-06-27 18:30:00","2017-06-28 10:30:00","2017-06-28 16:00:00"),tz= "UTC")
inter1<- interval(start,end,tz="UTC")
df <- data.frame(id,inter1)
overlap <- c(FALSE,TRUE,TRUE,FALSE,FALSE)
new_df<-data.frame(id,inter1,overlap)
The sample data doesn't have any overlapping periods. The following change was made:
start <-as.POSIXct(c("2017-06-27 09:30:00","2017-06-27 15:30:00",
"2017-06-27 14:30:00","2017-06-28 09:30:00","2017-06-28 15:00:00"), tz= "UTC")
Using lead will return NA if it is the last record in a group
library(dplyr)
new_df %>%
group_by(id) %>%
arrange(int_start(inter1), .by_group = TRUE) %>%
mutate(overlap2 = lead(int_start(inter1)) < int_end(inter1))
# A tibble: 5 x 4
# Groups: id [2]
id inter1 overlap overlap2
<dbl> <Interval> <lgl> <lgl>
1 1 2017-06-27 09:30:00 UTC--2017-06-27 10:30:00 UTC FALSE FALSE
2 1 2017-06-27 14:30:00 UTC--2017-06-28 18:30:00 UTC TRUE TRUE
3 1 2017-06-27 15:30:00 UTC--2017-06-27 17:30:00 UTC TRUE NA
4 2 2017-06-28 09:30:00 UTC--2017-06-28 10:30:00 UTC FALSE FALSE
5 2 2017-06-28 15:00:00 UTC--2017-06-28 16:00:00 UTC FALSE NA
If needing to compare each row to all rows within the group
library(tidyverse)
new_df %>%
group_by(id) %>%
arrange(int_start(inter1), .by_group = TRUE) %>%
mutate(overlap2 = map_int(inter1, ~ sum(int_overlaps(.x, inter1))) > 1)
# A tibble: 5 x 4
# Groups: id [2]
id inter1 overlap overlap2
<dbl> <Interval> <lgl> <lgl>
1 1 2017-06-27 09:30:00 UTC--2017-06-27 10:30:00 UTC FALSE FALSE
2 1 2017-06-27 14:30:00 UTC--2017-06-28 18:30:00 UTC TRUE TRUE
3 1 2017-06-27 15:30:00 UTC--2017-06-27 17:30:00 UTC TRUE TRUE
4 2 2017-06-28 09:30:00 UTC--2017-06-28 10:30:00 UTC FALSE FALSE
5 2 2017-06-28 15:00:00 UTC--2017-06-28 16:00:00 UTC FALSE FALSE
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With