Let's say I have the following number ranges:
0-500 0-100 75-127 125-157 130-198 198-200
Now, let's say I need to be able to check any given number and see which ranges it is in. What sort of data structure would be used most efficiently to tell that, say, the number 100 belongs to the ranges 0-500, 0-100, and 75-127? Do I just want a binary tree holding the starting values? In that case, would it be feasible for each node in the tree to hold several objects containing each range at that starting point?
Note that I only need to retrieve for this particular application, I don't really see myself needing to modify it mid-process, so retrieval speed is by far my priority.
Thanks!
1) Sort all intervals in increasing order of start time. This step takes O(nLogn) time. 2) In the sorted array, if start time of an interval is less than end of previous interval, then there is an overlap.
In computer science, an interval tree is a tree data structure to hold intervals. Specifically, it allows one to efficiently find all intervals that overlap with any given interval or point.
In data structures, a range query consists of preprocessing some input data into a data structure to efficiently answer any number of queries on any subset of the input.
An interval tree has a leaf node for every elementary interval. On top of these leaves is built a complete binary tree. Each internal node of the tree stores, as its key, the integer that separates the elementary intervals in its left and right subrees.
What you need is interval tree. Interval trees are quite general concept and keep a slightly different data in nodes for every problem. In your case each node would keep a list of input intervals that cover the interval node is representing.
Let R
denote the number of ranges possible. (For your example R=6
)
Create R
hashtables such that each hash table can hold only one range. For your example you need to create 6 hashtables. The first hash table R1
will contain values from 0-500 only.
Populate hashtable.
Each number will go into appropriate hashtables. For your example number 100
will go into R1
, R2
, R3
. If R
is large, then you would need to create lot of hashtables. But then the total space is bounded by the actual data stored in all the hashtables put together.
Retrival:
For any given number check if it is present in each of the R
hashtables. You can further optimize by choosing which hashtables to look. For example for 100
you only need to look into 3 out of 6 hashtables.
Time complexity:
Searching for a single value in a hash table takes constant time
(on average). So
Amortized O(1)
to see if a number is present in a hashtable.
Amortized O(R)
to produce the output as we need to look at all the hashtables to produce the output
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With