Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Given a set of intervals, find the interval which has the maximum number of intersections

Given a set of intervals, find the interval which has the maximum number of intersections (not the length of a particular intersection). So if input (1,6) (2,3) (4,11), (1,6) should be returned. Some suggest to use Interval Tree to get this done in O(nlogn), but I did not understand how to construct and use the Interval Tree after reading its wiki page. I believe it can be done by doing some sort of sorting and scanning algorithm. If Interval tree is the only option, please educate me how to construct/use one. Thanks.

like image 317
user3216886 Avatar asked Feb 23 '14 10:02

user3216886


People also ask

How do you find the maximum intersection?

Naive Approach: The simplest approach to solve the given problem is to iterate over all segments and for each segment count the number of intersections by checking it with all other segments and then print the maximum among all the count of intersections obtained.

Is intersection of intervals an interval?

It is not hard to prove that (a,b) µ I µ [a,b] (do it!). From this, conclude that I is indeed given by one of the four examples above (note that a and b may be ±1). Intersections: Any intersection of intervals is an interval. Proof.


3 Answers

Don't use an interval tree. Create an event for each interval endpoint, so that each interval has a start event and a stop event. Process these events in order. (Order starts before stops if a measure-zero intersection counts as an intersection, or stops before starts otherwise.)

Initialize a map C from intervals to numbers. Initialize the start count s = 0 and the stop count t = 0. To process a start event for interval i, set s = s + 1 and then C(i) = -(t + 1). To process a stop event for interval i, set t = t + 1 and then C(i) = C(i) + s.

At the end, C maps intervals to their intersection counts. This algorithm is O(n log n) because of the sorting; that running time is optimal if the endpoints can be added and compared only, by fairly standard computational geometry techniques.

like image 190
David Eisenstat Avatar answered Oct 19 '22 18:10

David Eisenstat


Note: David Eisenstat's algorithm has better performance than this one.

A simple plane-sweep algorithm will do this in O(nlogn + m*n), where m is the maximum number of intersections with any single interval.

Sort the interval endpoints. Keep track of the active segments. When reaching a start point, increment the intersection counts of all active intervals, and set the new interval's intersection count to the number of active intervals (excluding itself). When reaching an end point, remove the interval from the active intervals.

like image 45
Sneftel Avatar answered Oct 19 '22 18:10

Sneftel


A python implementation of the David's approach,

def max_interval_count(intervals):
    interval_sorted = []
    for idx, interval in enumerate(intervals):
        s, e = interval
        interval_sorted.append([s, idx, 0]) # 0 for start
        interval_sorted.append([e, idx, 1]) # 1 for end
    interval_sorted.sort(key = lambda x: x[0])

    print(interval_sorted)

    number_of_starts = 0
    number_of_ends = 0

    overlap_count = {} 
    for event in interval_sorted:
        _, idx, start_end = event
        if start_end == 0: # start event
            # subtract all the ending before it
            overlap_count[idx] = - (number_of_ends)
            number_of_starts += 1
        else: # end event
            overlap_count[idx] += (number_of_starts - 1) # -1 as we should not include the start from the same interval
            number_of_ends += 1
    print(overlap_count)
    ans_idx = -1
    max_over_count = 0
    min_len_interval = 99999999999
    for idx, overl_cnt in overlap_count.items():
        if overl_cnt > max_over_count:
            ans_idx = idx
            max_over_count = overl_cnt
        elif overl_cnt == max_over_count and overl_cnt > 0 and (intervals[idx][1] - intervals[idx][0] + 1) < min_len_interval:
            min_len_interval = (intervals[idx][1] - intervals[idx][0] + 1)
            ans_idx = idx
    if ans_idx == -1:
        return ans_idx
    return intervals[ans_idx]


if __name__ == "__main__":
    test_1 = [[1,5],[5,10],[5,5]]
    test_2 = [[1,2],[3,5]]
    test_3 = [(1,6), (2,3), (4,11)]
    ans = max_interval_count(test_1)
    print(ans)
    print("---------")
    ans = max_interval_count(test_2)
    print(ans)
    print("---------")
    ans = max_interval_count(test_3)
    print(ans)
    print("---------")
[[1, 0, 0], [5, 0, 1], [5, 1, 0], [5, 2, 0], [5, 2, 1], [10, 1, 1]]
{0: 0, 1: 1, 2: 1}
[5, 5]
---------
[[1, 0, 0], [2, 0, 1], [3, 1, 0], [5, 1, 1]]
{0: 0, 1: 0}
-1
---------
[[1, 0, 0], [2, 1, 0], [3, 1, 1], [4, 2, 0], [6, 0, 1], [11, 2, 1]]
{0: 2, 1: 1, 2: 1}
(1, 6)
---------
like image 41
Zabir Al Nazi Avatar answered Oct 19 '22 18:10

Zabir Al Nazi