I'd like to represent a set of integer ranges using Python where the set could be modified dynamically and tested for inclusion. Specifically I want to apply this to address ranges or line numbers in a file.
I could define the range of addresses I cared about to include:
200 - 400  
450 - 470  
700 - 900  
Then I want to be able to add a potentially overlapping range to the set such that when I add 460 - 490 the set becomes:
200 - 400  
450 - 490  
700 - 900  
But then be able to delete from the set where I could exclude the range 300 - 350 and the set becomes:
200 - 300
350 - 400  
450 - 490  
700 - 900  
Finally, I want to be able to iterate over all integers included in the set, or test whether the set contains a particular value.
I'm wondering what the best way to do this is (particularly if there's something built into Python).
Traverse all the set of intervals and check whether the consecutive intervals overlaps or not. If the intervals(say interval a & interval b) doesn't overlap then the set of pairs form by [a. end, b. start] is the non-overlapping interval.
Sort the intervals by their end point, and then take each interval in end-point order, and put it into the bucket with the largest right-most end point, such that bucket. max_endpoint < interval. startpoint. If there is no such bucket, then you have to start a new one.
Let's take the following overlapping intervals example to explain the idea: If both ranges have at least one common point, then we say that they're overlapping. In other words, we say that two ranges and are overlapping if: On the other hand, non-overlapping ranges don't have any points in common.
You're describing an interval tree.
pip install intervaltree
Usage:
from intervaltree import IntervalTree, Interval
tree = IntervalTree()
tree[200:400] = True  # or you can use ranges as the "values"
tree[450:470] = True
tree[700:900] = True
Querying:
>>> tree
IntervalTree([Interval(200, 400, True), Interval(450, 470, True), Interval(700, 900, True)])
>>> tree[250]
{Interval(200, 400, True)}
>>> tree[150]
set()
Adding overlapping range:
>>> tree[450:490] = True
>>> tree
IntervalTree([Interval(200, 400, True), Interval(450, 470, True), Interval(450, 490, True), Interval(700, 900, True)])
>>> tree.merge_overlaps()
>>> tree
IntervalTree([Interval(200, 400, True), Interval(450, 490), Interval(700, 900, True)])
Discarding:
>>> tree.chop(300, 350)
>>> tree
IntervalTree([Interval(200, 300, True), Interval(350, 400, True), Interval(450, 490), Interval(700, 900, True)])
                        I implemented a similar thing in a different language.
Basic ideas:
bisect to search it in log time. This gives you the lookup / inclusion test.This should get you started.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With