Given a set of N intervals: For each interval, which other interval has a maximal overlap?
e.g. { [0,5], [2,9], [2,3], [4,9] } :
[0,5]: [2,9] (Overlap of 4)
[2,9]: [4,9] (Overlap of 6)
[2,3]: [0,5] or [2,9] (Overlap of 2)
[4,9]: [2,9] (Overlap of 6)
N can be large, so I believe an interval tree is necessary. However, no posts or publications I've found describe an approach to this type of query. The result of a query can lie on any of the 3 paths from an interval tree node (left of center, overlapping center, right of center), as they may or may not include the center point of the query interval. As such, I cannot think of a log(N) traversal method to obtain a result.
Also, for the [2,3] case, I don't care which is picked. Any maximally intersecting interval can be picked arbitrarily. Only 1 result is returned per query.
Can each of these queries be answered in log(N), providing an Nlog(N) overall solution?
EDIT: Pseudocode I worked out:
query(ITnode node, Interval intrv){
// s_list: start-sorted intervals overlapping center
// e_list: end-sorted intervals overlapping center
if intrv.end < node.center:
node_max = search node.s_list for interval w/ closest start to intrv.start
return max(node_max, query(node.left_tree, intrv))
else if intrv.start > node.center:
node_max = search node.e_list for interval w/ closest end to intrv.end
return max(node_max, query(node.right_tree, intrv))
else: // Query overlaps center
// Still working this out but I get the picture now
}
for each Interval i:
result[i.id] = query(root, i)
We can solve this problem using interval tree.
Algo:
Construct a set of points S - merge all left and right points of the intervals. For the input from example S = {0, 2, 3, 4, 5, 9}. Complexity - O(NlogN)
Build interval tree with the following structure
struct tree {
int max_overlap_value;
int max_overlap_id;
int second_max_overlap_value;
int second_max_overlap_id;
tree *left;
tree *right;
};
Also set max_overlap_value = second_max_overlap_value = 0, max_overlap_id = second_max_overlap_id -1. Complexity - O(N)
Update values in tree nodes using intervals from input. Complexity - O(NlogN)
For each interval find max_overlap_value, max_overlap_id, second_max_overlap_value, second_max_overlap_id. If max_overlap_id is equal to interval_id, use second_max_overlap_value, otherwise max_overlap_value. Complexity - O(NlogN)
Total complexity would be O(NlogN)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With