I'd like to represent a set of integer ranges using Python where the set could be modified dynamically and tested for inclusion. Specifically I want to apply this to address ranges or line numbers in a file.
I could define the range of addresses I cared about to include:
200 - 400
450 - 470
700 - 900
Then I want to be able to add a potentially overlapping range to the set such that when I add 460 - 490
the set becomes:
200 - 400
450 - 490
700 - 900
But then be able to delete from the set where I could exclude the range 300 - 350
and the set becomes:
200 - 300
350 - 400
450 - 490
700 - 900
Finally, I want to be able to iterate over all integers included in the set, or test whether the set contains a particular value.
I'm wondering what the best way to do this is (particularly if there's something built into Python).
Traverse all the set of intervals and check whether the consecutive intervals overlaps or not. If the intervals(say interval a & interval b) doesn't overlap then the set of pairs form by [a. end, b. start] is the non-overlapping interval.
Sort the intervals by their end point, and then take each interval in end-point order, and put it into the bucket with the largest right-most end point, such that bucket. max_endpoint < interval. startpoint. If there is no such bucket, then you have to start a new one.
Let's take the following overlapping intervals example to explain the idea: If both ranges have at least one common point, then we say that they're overlapping. In other words, we say that two ranges and are overlapping if: On the other hand, non-overlapping ranges don't have any points in common.
You're describing an interval tree.
pip install intervaltree
Usage:
from intervaltree import IntervalTree, Interval
tree = IntervalTree()
tree[200:400] = True # or you can use ranges as the "values"
tree[450:470] = True
tree[700:900] = True
Querying:
>>> tree
IntervalTree([Interval(200, 400, True), Interval(450, 470, True), Interval(700, 900, True)])
>>> tree[250]
{Interval(200, 400, True)}
>>> tree[150]
set()
Adding overlapping range:
>>> tree[450:490] = True
>>> tree
IntervalTree([Interval(200, 400, True), Interval(450, 470, True), Interval(450, 490, True), Interval(700, 900, True)])
>>> tree.merge_overlaps()
>>> tree
IntervalTree([Interval(200, 400, True), Interval(450, 490), Interval(700, 900, True)])
Discarding:
>>> tree.chop(300, 350)
>>> tree
IntervalTree([Interval(200, 300, True), Interval(350, 400, True), Interval(450, 490), Interval(700, 900, True)])
I implemented a similar thing in a different language.
Basic ideas:
bisect
to search it in log time. This gives you the lookup / inclusion test.This should get you started.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With