Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merging Overlapping Intervals

Tags:

python

Currently, I have intervals of:

temp_tuple = [[-25, -14], [-21, -16], [-20, -15], [-10, -7], [-8, -5], [-6, -3], [2, 4], [2, 3], [3, 6], [12, 15], [13, 18], [14, 17], [22, 27], [25, 30], [26, 29]]

in an ascending order by the lower bound. My task is to merge overlapping intervals so that the outcome comes out to be:

[-25, -14]
[-10, -3]
[2, 6]
[12, 18]
[22, 30]

My first attempt involved deleting intervals that are completely within previous intervals, like [-21, -16] which falls within [-25, -14]. But deleting objects within a list kept interfering with the loop condition. My second attempt at deleting unnecessary intervals was:

i = 0
j = 1
while i < len(temp_tuples):
    while j < len(temp_tuples):
        if temp_tuples[i][1] > temp_tuples[j][1]:
            del temp_tuples[j]
        j += 1
    i += 1

but this doesn't delete all the unnecessary intervals for some reason. What should I do?

like image 494
Joon Choi Avatar asked Apr 25 '17 03:04

Joon Choi


People also ask

How do you combine overlapping intervals?

A simple approach is to start from the first interval and compare it with all other intervals for overlapping, if it overlaps with any other interval, then remove the other interval from the list and merge the other into the first interval. Repeat the same steps for the remaining intervals after the first.

What is an overlapping interval?

Let's take the following overlapping intervals example to explain the idea: If both ranges have at least one common point, then we say that they're overlapping. In other words, we say that two ranges and are overlapping if: On the other hand, non-overlapping ranges don't have any points in common.

How does Python determine overlapping intervals?

pandas. Interval. overlaps() method is used to check whether the two intervals overlap and the result is returned.

What is non-overlapping interval?

If the intervals(say interval a & interval b) doesn't overlap then the set of pairs form by [a.end, b.start] is the non-overlapping interval. If the intervals overlaps, then check for next consecutive intervals.


2 Answers

It makes it a bit easier to process (as in think about) if you instead setup a new list. You additionally also get to keep your original data.

temp_tuple.sort(key=lambda interval: interval[0])
merged = [temp_tuple[0]]
for current in temp_tuple:
    previous = merged[-1]
    if current[0] <= previous[1]:
        previous[1] = max(previous[1], current[1])
    else:
        merged.append(current)

If you now print(merged) it would output:

[[-25, -14], [-10, -3], [2, 6], [12, 18], [22, 30]]
like image 138
vallentin Avatar answered Oct 16 '22 14:10

vallentin


This is a numpy solution I came up with:

import numpy as np

def merge_intervals(intervals):
    starts = intervals[:,0]
    ends = np.maximum.accumulate(intervals[:,1])
    valid = np.zeros(len(intervals) + 1, dtype=np.bool)
    valid[0] = True
    valid[-1] = True
    valid[1:-1] = starts[1:] >= ends[:-1]
    return np.vstack((starts[:][valid[:-1]], ends[:][valid[1:]])).T

#example
a=[]
a.append([1,3])
a.append([4,10])
a.append([5,12])
a.append([6,8])
a.append([20,33])
a.append([30,35])

b = np.array(a)

print("intervals")
print(b)
print()
print("merged intervals")
print(merge_intervals(b))

Output:

intervals
[[ 1  3]
 [ 4 10]
 [ 5 12]
 [ 6  8]
 [20 33]
 [30 35]]

merged intervals
[[ 1  3]
 [ 4 12]
 [20 35]]

Please note that the input array must be sorted by start positions.

like image 26
Christopher Schröder Avatar answered Oct 16 '22 16:10

Christopher Schröder