Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python representation for a set of non-overlapping integer ranges

Tags:

python

I'd like to represent a set of integer ranges using Python where the set could be modified dynamically and tested for inclusion. Specifically I want to apply this to address ranges or line numbers in a file.

I could define the range of addresses I cared about to include:

200 - 400  
450 - 470  
700 - 900  

Then I want to be able to add a potentially overlapping range to the set such that when I add 460 - 490 the set becomes:

200 - 400  
450 - 490  
700 - 900  

But then be able to delete from the set where I could exclude the range 300 - 350 and the set becomes:

200 - 300
350 - 400  
450 - 490  
700 - 900  

Finally, I want to be able to iterate over all integers included in the set, or test whether the set contains a particular value.

I'm wondering what the best way to do this is (particularly if there's something built into Python).

like image 968
user8472 Avatar asked May 29 '18 21:05

user8472


People also ask

How do you find non-overlapping intervals?

Traverse all the set of intervals and check whether the consecutive intervals overlaps or not. If the intervals(say interval a & interval b) doesn't overlap then the set of pairs form by [a. end, b. start] is the non-overlapping interval.

How do you sort intervals into minimum number of bins without overlapping intervals?

Sort the intervals by their end point, and then take each interval in end-point order, and put it into the bucket with the largest right-most end point, such that bucket. max_endpoint < interval. startpoint. If there is no such bucket, then you have to start a new one.

What are overlapping intervals?

Let's take the following overlapping intervals example to explain the idea: If both ranges have at least one common point, then we say that they're overlapping. In other words, we say that two ranges and are overlapping if: On the other hand, non-overlapping ranges don't have any points in common.


2 Answers

You're describing an interval tree.

pip install intervaltree

Usage:

from intervaltree import IntervalTree, Interval
tree = IntervalTree()
tree[200:400] = True  # or you can use ranges as the "values"
tree[450:470] = True
tree[700:900] = True

Querying:

>>> tree
IntervalTree([Interval(200, 400, True), Interval(450, 470, True), Interval(700, 900, True)])
>>> tree[250]
{Interval(200, 400, True)}
>>> tree[150]
set()

Adding overlapping range:

>>> tree[450:490] = True
>>> tree
IntervalTree([Interval(200, 400, True), Interval(450, 470, True), Interval(450, 490, True), Interval(700, 900, True)])
>>> tree.merge_overlaps()
>>> tree
IntervalTree([Interval(200, 400, True), Interval(450, 490), Interval(700, 900, True)])

Discarding:

>>> tree.chop(300, 350)
>>> tree
IntervalTree([Interval(200, 300, True), Interval(350, 400, True), Interval(450, 490), Interval(700, 900, True)])
like image 50
wim Avatar answered Nov 13 '22 14:11

wim


I implemented a similar thing in a different language.

Basic ideas:

  • Keep a tree of ranges ordered by the left bound. Instead of a real tree, you can keep a Python list and use bisect to search it in log time. This gives you the lookup / inclusion test.
  • Represent all operations as sub-range operations. A single element operation just works on a sub-range of length 1 internally.
  • Implement the basic sub-range operations: addition, which is simple, and subtraction, which may end up with two sub-ranges if a middle is excluded, or one subrange, possibly empty.
  • After an addition, check the immediate neighbor subranges, and maybe merge with one or both of them, if the subranges intersect. Continue in both directions until no merge operations occur.

This should get you started.

like image 21
9000 Avatar answered Nov 13 '22 16:11

9000