Currently, I have intervals of:
temp_tuple = [[-25, -14], [-21, -16], [-20, -15], [-10, -7], [-8, -5], [-6, -3], [2, 4], [2, 3], [3, 6], [12, 15], [13, 18], [14, 17], [22, 27], [25, 30], [26, 29]]
in an ascending order by the lower bound. My task is to merge overlapping intervals so that the outcome comes out to be:
[-25, -14]
[-10, -3]
[2, 6]
[12, 18]
[22, 30]
My first attempt involved deleting intervals that are completely within previous intervals, like [-21, -16] which falls within [-25, -14]. But deleting objects within a list kept interfering with the loop condition. My second attempt at deleting unnecessary intervals was:
i = 0
j = 1
while i < len(temp_tuples):
while j < len(temp_tuples):
if temp_tuples[i][1] > temp_tuples[j][1]:
del temp_tuples[j]
j += 1
i += 1
but this doesn't delete all the unnecessary intervals for some reason. What should I do?
A simple approach is to start from the first interval and compare it with all other intervals for overlapping, if it overlaps with any other interval, then remove the other interval from the list and merge the other into the first interval. Repeat the same steps for the remaining intervals after the first.
Let's take the following overlapping intervals example to explain the idea: If both ranges have at least one common point, then we say that they're overlapping. In other words, we say that two ranges and are overlapping if: On the other hand, non-overlapping ranges don't have any points in common.
pandas. Interval. overlaps() method is used to check whether the two intervals overlap and the result is returned.
If the intervals(say interval a & interval b) doesn't overlap then the set of pairs form by [a.end, b.start] is the non-overlapping interval. If the intervals overlaps, then check for next consecutive intervals.
It makes it a bit easier to process (as in think about) if you instead setup a new list. You additionally also get to keep your original data.
temp_tuple.sort(key=lambda interval: interval[0])
merged = [temp_tuple[0]]
for current in temp_tuple:
previous = merged[-1]
if current[0] <= previous[1]:
previous[1] = max(previous[1], current[1])
else:
merged.append(current)
If you now print(merged)
it would output:
[[-25, -14], [-10, -3], [2, 6], [12, 18], [22, 30]]
This is a numpy solution I came up with:
import numpy as np
def merge_intervals(intervals):
starts = intervals[:,0]
ends = np.maximum.accumulate(intervals[:,1])
valid = np.zeros(len(intervals) + 1, dtype=np.bool)
valid[0] = True
valid[-1] = True
valid[1:-1] = starts[1:] >= ends[:-1]
return np.vstack((starts[:][valid[:-1]], ends[:][valid[1:]])).T
#example
a=[]
a.append([1,3])
a.append([4,10])
a.append([5,12])
a.append([6,8])
a.append([20,33])
a.append([30,35])
b = np.array(a)
print("intervals")
print(b)
print()
print("merged intervals")
print(merge_intervals(b))
Output:
intervals
[[ 1 3]
[ 4 10]
[ 5 12]
[ 6 8]
[20 33]
[30 35]]
merged intervals
[[ 1 3]
[ 4 12]
[20 35]]
Please note that the input array must be sorted by start positions.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With