Identify all overlapping tuples in list

Question

I've currently got a list of tuples (though I control creation of the list and tuples, so they could be altered in type if needed). Each tuple has a start and end integer and a string with an ID for that range's source. What I'm trying to do is identify all of the overlapping ranges within the tuples.

Currently I have

a = [(0, 98, '122:R'), 
     (100, 210, '124:R'),
     (180, 398, '125:R'),
     (200, 298, '123:R')]
highNum = 0
highNumItem = ''
for item in a:
    if item[0] < highNum:
        print(highNumItem + ' overlaps ' + item[2])
        if item[1] > highNum:
            highNum = item[1]
            highNumItem = item[2]

    
# 124:R overlaps 125:R
# 125:R overlaps 123:R

Which outputs enough information that overlaps should be able to be manually review and fixed. But, it misses identifying some sets of overlaps. I can't help thinking there's a relatively obvious solution I'm just missing or not using the right search terms to find examples of. But ideally I'd like the output to actually be

124:R overlaps 125:R & 123:R
125:R overlaps 123:R

But using my comparison method, I can't see a way to catch the rare instance where an overlap spans more than just 2 adjacent ranges. If anyone could point me to a function or comparison method appropriate to this, I'd greatly appreciate.

Also, if it matters, I'm currently stuck with python 2.7, but need to be able to port solution to 3.x when 3rd party applications allow it.

Elan-R · Accepted Answer

Checks the higher number of a pair with the lower number of another pair:

a = [(0, 98, '122:R'), (100, 210, '124:R'), (180, 398, '125:R'), (200, 298, '123:R')]

for i, base_data in enumerate(a):
    for check_data in a[i + 1:]:
        if base_data[1] > check_data[0]:
            print(f"{base_data[2]} overlaps {check_data[2]}")
#prints:
#124:R overlaps 125:R
#124:R overlaps 123:R
#125:R overlaps 123:R

If you want it stored in groups:

from collections import defaultdict

a = [(0, 98, '122:R'), (100, 210, '124:R'), (180, 398, '125:R'), (200, 298, '123:R')]
d = defaultdict(list)

for i, base_data in enumerate(a):
    for check_data in a[i + 1:]:
        if base_data[1] > check_data[0]:
            d[base_data[2]].append(check_data[2])


print(d)
#prints:
#defaultdict(<class 'list'>, {'124:R': ['125:R', '123:R'], '125:R': ['123:R']})
#but can *easily* be iterated over to pretty print:
print("
".join([f'{key} overlaps {" & ".join(d[key])}' for key in d]))
#prints:
#124:R overlaps 125:R & 123:R
#125:R overlaps 123:R

The dictionary is far better than printing as it actually stores the data instead of just printing it. Printing the data can also be more controlled. Also, using a defaultdict over a dict makes the code more compact.

Chris Charley · Answer

Here is an example using intspan to calculate the overlaps. (Using Python 3.8)

from itertools import combinations
from intspan import intspan

a = [(0, 98, '122:R'), (100, 210, '124:R'), (180, 398, '125:R'), (200, 298, '123:R')]

d = {}
for one, two in combinations(a,2):
    # if the 2 ranges intersect
    if intspan.from_range(*one[0:2]) & intspan.from_range(*two[0:2]):
        d.setdefault(one[2], []).append(two[2])

for key, v in d.items():
    print(key + ',' + ','.join(v))

Prints:

124:R,125:R,123:R
125:R,123:R

Identify all overlapping tuples in list

Tags:

python

John

2 Answers

Elan-R

Chris Charley

Recent Activity

Donate For Us

Identify all overlapping tuples in list

Tags:

python

John

2 Answers

Elan-R

Chris Charley

Related questions

Recent Activity

Donate For Us