As the title says, suppose I am given an (n, 2) NumPy array recording the start and end indices of a series of segments, for example with n=6:
import numpy as np
# x records the (start, end) index pairs corresponding to six segments
x = np.array(([0,4], # the 1st seg ranges from index 0 ~ 4
[5,9], # the 2nd seg ranges from index 5 ~ 9, etc.
[10,13],
[15,20],
[23,30],
[31,40]))
Now I want to merge segments that are separated by only a small gap. For example, merge consecutive segments if the gap between one segment's end and the next segment's start is no larger than 1, so the desired output would be:
y = np.array([[0,13], # Because the 1st seg's end is close to the 2nd's start,
# and the 2nd seg's end is close to the 3rd's start, they are combined.
[15,20], # The 4th seg is far from the preceding and following segs,
# so it remains untouched.
[23,40]]) # The 5th and 6th segs are close, so they are combined
so that the output contains just three segments instead of six. Any suggestions would be appreciated!
If we can assume the segments are ordered and none is wholly contained within a neighbor, then you can do this by identifying where the gap between the end of one range and the start of the next exceeds your threshold:
start = x[1:, 0] # select columns, ignoring the beginning of the first range
end = x[:-1, 1] # and the end of the final range
mask = start>end+1 # identify where consecutive rows have too great a gap
Then stitching these pieces back together:
np.array([np.insert(start[mask], 0, x[0, 0]), np.append(end[mask], x[-1, -1])]).T
Out[96]:
array([[ 0, 13],
[15, 20],
[23, 40]])
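In case it helps, the same steps can be wrapped in a function with the threshold exposed as a parameter. This is only a sketch under the same assumptions (sorted rows, no segment contained in a neighbor, at least one row); the names `merge_close_segments` and `max_gap` are made up for illustration and are not part of NumPy.

```python
import numpy as np

def merge_close_segments(x, max_gap=1):
    """Merge consecutive (start, end) rows whose gap is <= max_gap.

    Assumes x is sorted by start, has at least one row, and that no
    segment is wholly contained within a neighbor.
    """
    x = np.asarray(x)
    start = x[1:, 0]   # starts of rows 1..n-1
    end = x[:-1, 1]    # ends of rows 0..n-2
    mask = start - end > max_gap  # True where a new merged segment begins
    merged_start = np.insert(start[mask], 0, x[0, 0])
    merged_end = np.append(end[mask], x[-1, 1])
    return np.column_stack((merged_start, merged_end))

x = np.array([[0, 4], [5, 9], [10, 13], [15, 20], [23, 30], [31, 40]])
print(merge_close_segments(x))             # three merged segments
print(merge_close_segments(x, max_gap=2))  # the gap of 2 is now merged too
```

With `max_gap=2` the gap between 13 and 15 is also bridged, so the result has two rows, `[0, 20]` and `[23, 40]`.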
Here's a NumPy vectorized solution -
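def merge_boundaries(x):
    # True where the gap to the next segment is larger than 1 (no merge there)
    mask = (x[1:,0] - x[:-1,1]) > 1
    idx = np.flatnonzero(mask)
    # first and last row index of each merged group
    start = np.r_[0,idx+1]
    stop = np.r_[idx, x.shape[0]-1]
    return np.c_[x[start,0], x[stop,1]]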
Sample run -
In [230]: x
Out[230]:
array([[ 0, 4],
[ 5, 9],
[10, 13],
[15, 20],
[23, 30],
[31, 40]])
In [231]: merge_boundaries(x)
Out[231]:
array([[ 0, 13],
[15, 20],
[23, 40]])
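Both answers rely on the rows being sorted and on no segment being contained in another. If that can't be guaranteed, one possible variation is to sort by start first and compare each start against the running maximum of the previous ends. This is a rough sketch rather than a drop-in replacement; `merge_segments_general` and `max_gap` are hypothetical names, and the input is assumed to have at least one row.

```python
import numpy as np

def merge_segments_general(x, max_gap=1):
    """Merge segments whose gap is <= max_gap, allowing unsorted or nested rows."""
    x = np.asarray(x)
    x = x[np.argsort(x[:, 0], kind="stable")]     # order rows by start index
    running_end = np.maximum.accumulate(x[:, 1])  # furthest end seen so far
    # A new group begins where a start lies more than max_gap past that end
    breaks = x[1:, 0] - running_end[:-1] > max_gap
    first = np.r_[0, np.flatnonzero(breaks) + 1]  # first row of each group
    last = np.r_[first[1:] - 1, len(x) - 1]       # last row of each group
    return np.c_[x[first, 0], running_end[last]]

# [5, 9] is nested inside [0, 20]; rows are also given out of order
segs = np.array([[23, 30], [5, 9], [0, 20], [31, 40]])
print(merge_segments_general(segs))
```

Using the running maximum means a short nested segment cannot pull the end of its group backwards.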