
How to divide a set of overlapping ranges into non-overlapping ranges?

Let's say you have a set of ranges:

  • 0 - 100: 'a'
  • 0 - 75: 'b'
  • 95 - 150: 'c'
  • 120 - 130: 'd'

Obviously, these ranges overlap at certain points. How would you dissect these ranges to produce a list of non-overlapping ranges, while retaining information associated with their original range (in this case, the letter after the range)?

For example, the results of the above after running the algorithm would be:

  • 0 - 75: 'a', 'b'
  • 76 - 94: 'a'
  • 95 - 100: 'a', 'c'
  • 101 - 119: 'c'
  • 120 - 130: 'c', 'd'
  • 131 - 150: 'c'
asked Mar 10 '09 03:03 by Ron Eggbertson



2 Answers

I had the same question when writing a program to mix (partly overlapping) audio samples.

What I did was add a "start event" and a "stop event" for each item to a list, sort the list by time point, and then process it in order. You could do the same, using an integer point instead of a time; instead of mixing sounds, you'd add symbols to (or remove them from) the set associated with the current range. Whether you generate empty ranges or omit them is up to you.

Edit: Perhaps some code...

# intervals = list of (start, stop, symbol) tuples with inclusive endpoints
intervals = [(0, 100, 'a'), (0, 75, 'b'), (95, 150, 'c'), (120, 130, 'd')]

points = []  # list of (offset, plus/minus, symbol) tuples
for start, stop, symbol in intervals:
    points.append((start, '+', symbol))
    points.append((stop, '-', symbol))
points.sort()  # '+' sorts before '-', so starts precede stops at the same offset

ranges = []  # output list of (start, stop, symbol_set) tuples
current_set = set()
last_start = None
for offset, pm, symbol in points:
    if pm == '+':
        # Close the range that ends just before this start point,
        # skipping empty ranges and stretches that no symbol covers.
        if current_set and last_start < offset:
            ranges.append((last_start, offset - 1, current_set.copy()))
        current_set.add(symbol)
        last_start = offset
    elif pm == '-':
        # Stop points are inclusive, so the closed range ends at offset itself.
        # A minus always follows its own plus, so last_start is set here.
        if last_start <= offset:
            ranges.append((last_start, offset, current_set.copy()))
        current_set.remove(symbol)
        last_start = offset + 1

# The last event is always a '-', so nothing is left to flush here.

Totally untested, obviously.
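
Since the snippet above is untested, one way to check any splitter of this kind is to compare it against a brute-force reference on small inputs. The following is only a sketch under a few assumptions: the endpoints are integers, both ends are inclusive (as in the question's expected output), and the function name `split_ranges_brute_force` is made up for illustration. It labels every integer point with the symbols that cover it and then groups runs of identical labels.

```python
from itertools import groupby


def split_ranges_brute_force(intervals):
    """Reference splitter for small integer ranges with inclusive endpoints.

    Hypothetical helper, not part of the answer: it is far slower than the
    event-based approach and is meant only for cross-checking results.
    """
    lo = min(start for start, _, _ in intervals)
    hi = max(stop for _, stop, _ in intervals)

    # Label each integer point with the set of symbols covering it.
    labelled = []
    for point in range(lo, hi + 1):
        symbols = frozenset(
            s for start, stop, s in intervals if start <= point <= stop
        )
        labelled.append((point, symbols))

    # Consecutive points with the same label form one output range.
    ranges = []
    for symbols, run in groupby(labelled, key=lambda item: item[1]):
        run = list(run)
        if symbols:  # skip gaps that no range covers
            ranges.append((run[0][0], run[-1][0], set(symbols)))
    return ranges


if __name__ == "__main__":
    question_ranges = [(0, 100, 'a'), (0, 75, 'b'), (95, 150, 'c'), (120, 130, 'd')]
    for start, stop, symbols in split_ranges_brute_force(question_ranges):
        print(start, stop, sorted(symbols))
```

Run on the question's four ranges, this reproduces the six ranges listed in the question, so it can serve as an oracle when testing a faster implementation.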

answered Oct 06 '22 00:10 by Edmund


A similar answer to Edmund's, tested, including support for intervals like (1,1):

import pprint


class MultiSet(object):
    def __init__(self, intervals):
        self.intervals = intervals
        self.events = None

    def split_ranges(self):
        self.events = []
        for start, stop, symbol in self.intervals:
            self.events.append((start, True, stop, symbol))
            self.events.append((stop, False, start, symbol))

        def event_key(event):
            key_endpoint, key_is_start, key_other, _ = event
            key_order = 0 if key_is_start else 1
            return key_endpoint, key_order, key_other

        self.events.sort(key=event_key)

        current_set = set()
        ranges = []
        current_start = -1

        for endpoint, is_start, other, symbol in self.events:
            if is_start:
                if current_start != -1 and endpoint != current_start and \
                       endpoint - 1 >= current_start and current_set:
                    ranges.append((current_start, endpoint - 1, current_set.copy()))
                current_start = endpoint
                current_set.add(symbol)
            else:
                if current_start != -1 and endpoint >= current_start and current_set:
                    ranges.append((current_start, endpoint, current_set.copy()))
                current_set.remove(symbol)
                current_start = endpoint + 1

        return ranges


if __name__ == '__main__':
    intervals = [
        (0, 100, 'a'), (0, 75, 'b'), (75, 80, 'd'), (95, 150, 'c'), 
        (120, 130, 'd'), (160, 175, 'e'), (165, 180, 'a')
    ]
    multiset = MultiSet(intervals)
    pprint.pprint(multiset.split_ranges())


[(0, 74, {'b', 'a'}),
 (75, 75, {'d', 'b', 'a'}),
 (76, 80, {'d', 'a'}),
 (81, 94, {'a'}),
 (95, 100, {'c', 'a'}),
 (101, 119, {'c'}),
 (120, 130, {'d', 'c'}),
 (131, 150, {'c'}),
 (160, 164, {'e'}),
 (165, 175, {'e', 'a'}),
 (176, 180, {'a'})]
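
Much of the care in the code above goes into the `+ 1` / `- 1` adjustments that inclusive endpoints require. A possible alternative, sketched here as an assumption rather than as either answerer's method, is to treat each inclusive range [start, stop] as the half-open range [start, stop + 1), cut at every boundary value, and convert back to inclusive endpoints at the end. The function name `split_ranges_by_boundaries` is hypothetical, and the sketch trades speed for simplicity by rescanning all intervals for every segment.

```python
def split_ranges_by_boundaries(intervals):
    """Split inclusive integer ranges by cutting at half-open boundaries.

    Hypothetical alternative to the event sweep: [start, stop] is treated
    as [start, stop + 1), so every cut point is just a boundary value.
    """
    boundaries = sorted(
        {b for start, stop, _ in intervals for b in (start, stop + 1)}
    )

    ranges = []
    for left, right in zip(boundaries, boundaries[1:]):
        # Symbols whose half-open range fully covers the segment [left, right).
        symbols = {
            s for start, stop, s in intervals
            if start <= left and right <= stop + 1
        }
        if symbols:  # skip gaps that no range covers
            ranges.append((left, right - 1, symbols))  # back to inclusive ends
    return ranges


if __name__ == "__main__":
    intervals = [
        (0, 100, 'a'), (0, 75, 'b'), (75, 80, 'd'), (95, 150, 'c'),
        (120, 130, 'd'), (160, 175, 'e'), (165, 180, 'a'),
    ]
    for start, stop, symbols in split_ranges_by_boundaries(intervals):
        print(start, stop, sorted(symbols))
```

On the sample intervals this yields the same eleven ranges as the output shown above. It does not merge neighbouring segments that happen to carry the same symbols, which can occur when one range with a given symbol ends exactly where another range with that symbol begins.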

answered Oct 06 '22 01:10 by sourcedelica