How to divide a set of overlapping ranges into non-overlapping ranges?

Let's say you have a set of ranges:

0 - 100: 'a'
0 - 75: 'b'
95 - 150: 'c'
120 - 130: 'd'

Obviously, these ranges overlap at certain points. How would you dissect these ranges to produce a list of non-overlapping ranges, while retaining information associated with their original range (in this case, the letter after the range)?

For example, the results of the above after running the algorithm would be:

0 - 75: 'a', 'b'
76 - 94: 'a'
95 - 100: 'a', 'c'
101 - 119: 'c'
120 - 130: 'c', 'd'
131 - 150: 'c'

539

asked Mar 10 '09 03:03

Ron Eggbertson

2 Answers

I had the same question when writing a program to mix (partly overlapping) audio samples.

What I did was add a "start event" and a "stop event" for each item to a list, sort the list by time point, and then process it in order. You could do the same, using an integer offset instead of a time, and instead of mixing sounds you'd be adding symbols to (and removing them from) the set corresponding to the current range. Whether you generate empty ranges or just omit them is up to you.

Edit: Perhaps some code...

# intervals = list of (start, stop, symbol) tuples, with stop inclusive
points = []  # list of (offset, plus/minus, symbol) tuples
for start, stop, symbol in intervals:
    points.append((start, '+', symbol))
    points.append((stop + 1, '-', symbol))  # first offset past the range
points.sort()

ranges = []  # output list of (start, stop, symbol_set) tuples
current_set = set()
last_start = None
for offset, pm, symbol in points:
    # Close the range that ends just before this offset, skipping
    # gaps (empty set) and zero-length ranges (repeated offsets).
    if current_set and last_start < offset:
        # Copy the set, otherwise every output tuple shares one object
        ranges.append((last_start, offset - 1, set(current_set)))
    if pm == '+':
        current_set.add(symbol)
    else:
        # A '-' always comes after its matching '+', so the symbol is present
        # (assuming each symbol is not used by two overlapping ranges)
        current_set.remove(symbol)
    last_start = offset

Totally untested, obviously.
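
Since the code above is untested, one way to sanity-check any sweep of this kind on small inputs is a brute-force reference. The sketch below is not part of the approach itself: it labels every integer offset with the symbols covering it, then merges runs of equal labels. The name split_brute_force is made up for illustration, and it assumes inclusive integer bounds as in the question.

```python
from itertools import groupby


def split_brute_force(intervals):
    """Label every integer offset, then merge runs with equal labels.

    intervals: list of (start, stop, symbol) with inclusive integer bounds.
    Only suitable for small inputs; meant as a reference for testing.
    """
    if not intervals:
        return []
    lo = min(start for start, _, _ in intervals)
    hi = max(stop for _, stop, _ in intervals)

    labelled = []
    for offset in range(lo, hi + 1):
        symbols = frozenset(
            s for start, stop, s in intervals if start <= offset <= stop
        )
        labelled.append((offset, symbols))

    ranges = []
    # groupby merges consecutive offsets that carry the same symbol set
    for symbols, run in groupby(labelled, key=lambda pair: pair[1]):
        offsets = [offset for offset, _ in run]
        if symbols:  # skip gaps not covered by any range
            ranges.append((offsets[0], offsets[-1], set(symbols)))
    return ranges


question_input = [(0, 100, 'a'), (0, 75, 'b'), (95, 150, 'c'), (120, 130, 'd')]
result = split_brute_force(question_input)
```

On the question's input this reproduces the six ranges listed there. Note that it merges adjacent runs with identical symbol sets, so it can differ from a sweep when the same symbol is reused by touching ranges.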

173

answered Oct 06 '22 00:10

Edmund

A similar answer to Edmund's, tested, including support for single-point intervals like (1,1):

import pprint


class MultiSet(object):
    def __init__(self, intervals):
        self.intervals = intervals
        self.events = None

    def split_ranges(self):
        self.events = []
        for start, stop, symbol in self.intervals:
            self.events.append((start, True, stop, symbol))
            self.events.append((stop, False, start, symbol))

        def event_key(event):
            key_endpoint, key_is_start, key_other, _ = event
            key_order = 0 if key_is_start else 1
            return key_endpoint, key_order, key_other

        self.events.sort(key=event_key)

        current_set = set()
        ranges = []
        current_start = -1

        for endpoint, is_start, other, symbol in self.events:
            if is_start:
                if current_start != -1 and endpoint != current_start and \
                       endpoint - 1 >= current_start and current_set:
                    ranges.append((current_start, endpoint - 1, current_set.copy()))
                current_start = endpoint
                current_set.add(symbol)
            else:
                if current_start != -1 and endpoint >= current_start and current_set:
                    ranges.append((current_start, endpoint, current_set.copy()))
                current_set.remove(symbol)
                current_start = endpoint + 1

        return ranges


if __name__ == '__main__':
    intervals = [
        (0, 100, 'a'), (0, 75, 'b'), (75, 80, 'd'), (95, 150, 'c'), 
        (120, 130, 'd'), (160, 175, 'e'), (165, 180, 'a')
    ]
    multiset = MultiSet(intervals)
    pprint.pprint(multiset.split_ranges())


[(0, 74, {'b', 'a'}),
 (75, 75, {'d', 'b', 'a'}),
 (76, 80, {'d', 'a'}),
 (81, 94, {'a'}),
 (95, 100, {'c', 'a'}),
 (101, 119, {'c'}),
 (120, 130, {'d', 'c'}),
 (131, 150, {'c'}),
 (160, 164, {'e'}),
 (165, 175, {'e', 'a'}),
 (176, 180, {'a'})]
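
For comparison, here is a shorter sketch of a different technique that gives the same output on this input. Instead of sweeping events, it cuts the number line at every start and at one past every stop, then checks which intervals cover each piece. The function name split_by_boundaries is hypothetical. It assumes inclusive integer bounds, and it is quadratic in the number of intervals, so treat it as a readable cross-check rather than a replacement for the class above.

```python
def split_by_boundaries(intervals):
    """Split inclusive integer ranges wherever the coverage can change."""
    # Coverage can only change at a start, or at one past a stop.
    cuts = sorted(
        {start for start, _, _ in intervals}
        | {stop + 1 for _, stop, _ in intervals}
    )
    ranges = []
    for left, right in zip(cuts, cuts[1:]):
        # Every offset in [left, right - 1] is covered by the same
        # intervals, so testing the left edge is enough.
        symbols = {s for start, stop, s in intervals if start <= left <= stop}
        if symbols:  # skip gaps such as 151..159
            ranges.append((left, right - 1, symbols))
    return ranges


intervals = [
    (0, 100, 'a'), (0, 75, 'b'), (75, 80, 'd'), (95, 150, 'c'),
    (120, 130, 'd'), (160, 175, 'e'), (165, 180, 'a'),
]
result = split_by_boundaries(intervals)
```

Because each piece is tested against the intervals directly, a symbol that appears in several intervals (like 'a' and 'd' here) needs no special handling.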

answered Oct 06 '22 01:10

sourcedelica

Tags: python, algorithm, math, range, rectangles
