Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can I do this with Boost interval_map?

What I want to do is handling interval efficiently. For example, in my example, intervals are like the following:

[10, 20], [15, 25], [40, 100], [5, 14]

Intervals are closed and integers, and some of intervals may be ovelapped. I want to find overlapped intervals for a given query efficiently. For example, if [16, 22] is given:

[10, 20], [15, 25]

The above intervals should be computed as overalpped intervals.

I'm currently writing an interval tree based on Red-Black Tree (reference: CLRS, Introduction to Algorithms). Although finding all overlapped intervals can be O(n), the running time should be faster. Note that intervals can be deleted and inserted.


However, I just found that Boost has interval_map and interval_set: http://www.boost.org/doc/libs/1_46_1/libs/icl/doc/html/index.html

I tried it, but the behavior is very strange for me. For example, if [2, 7] is inserted first and then [3, 8] is inserted, then the resulting map will have [2, 3), [3, 7], and (7, 8]. That is, when a new interval is inserted, splitting is automatically done.

Can I turn off this feature? Or, Boost's interval_map is right for my purpose?

like image 499
Nullptr Avatar asked Nov 02 '11 02:11

Nullptr


2 Answers

You asked for a data structure that could find overlaps efficiently. This does so, by storing overlaps in the data structure. Now you seem to be complaining that it has done so.

This example explains the logic:

typedef std::set<string> guests;
interval_map<time, guests> party;
party += make_pair(interval<time>::right_open(time("20:00"), time("22:00")),
guests("Mary"));
party += make_pair(interval<time>::right_open(time("21:00"), time("23:00")),
guests("Harry")); 
// party now contains
[20:00, 21:00)->{"Mary"} 
[21:00, 22:00)->{"Harry","Mary"} //guest sets aggregated on overlap
[22:00, 23:00)->{"Harry"}

When you add two overlapping intervals, you actually create three intervals with distinct properties. The overlap is in both original intervals, make it a logically distinct interval from either of the original intervals. And the two original intervals now span times with different properties (some that overlap the original, some that don't). This splitting makes it efficient to find overlaps, since they are their own intervals in the map.

In any event, Boost does allow you to select the interval combining style. So if you want to force a structure that makes it harder to find overlaps, you can do so.

like image 51
David Schwartz Avatar answered Sep 21 '22 11:09

David Schwartz


I tried boost interval_map and interval_set. They are very inefficient. The setup cost is very high because the implementation basically maps each subinterval (intersection) to all the intervals that contain it.

I think the implementation in CLRS "introduction to algorithms" based on red-black tree is far better. It is strange there is no red-black tree implementation out there that allows augmentation, even though std::set and std::map are based on RB tree.

like image 31
user723145 Avatar answered Sep 20 '22 11:09

user723145