Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Collapse rows with overlapping ranges

I have a data.frame with start and end time:

ranges<- data.frame(start = c(65.72000,65.72187, 65.94312,73.75625,89.61625),stop = c(79.72187,79.72375,79.94312,87.75625,104.94062))

> ranges
     start      stop
1 65.72000  79.72187
2 65.72187  79.72375
3 65.94312  79.94312
4 73.75625  87.75625
5 89.61625 104.94062

In this example, the ranges in row 2 and 3 are entirely within the range between 'start' on row 1 and stop on row 4. Thus, the overlapping ranges 1-4 should be collapsed to one range:

> ranges
     start      stop
1 65.72000  87.75625
5 89.61625 104.94062

I tried this:

mdat <- outer(ranges$start, ranges$stop, function(x,y) y > x)
mdat[upper.tri(mdat)|col(mdat)==row(mdat)] <- NA
mdat

And now I just need to figure out how to combine all the true ones, but not sure if it's the best way to go

like image 798
Liza Avatar asked Jan 19 '17 17:01

Liza


Video Answer


1 Answers

You can try this:

library(dplyr)
ranges %>% 
       arrange(start) %>% 
       group_by(g = cumsum(cummax(lag(stop, default = first(stop))) < start)) %>% 
       summarise(start = first(start), stop = max(stop))

# A tibble: 2 × 3
#      g    start      stop
#  <int>    <dbl>     <dbl>
#1     0 65.72000  87.75625
#2     1 89.61625 104.94062
like image 118
Psidom Avatar answered Sep 18 '22 21:09

Psidom