I am trying to solve the following problem: given N time intervals, each specified as (start, end), non-overlapping, sorted based on start - find an interval that contains a given date. For instance given:
[1,4] [5,8] [9,10][11,20]
3 falls into the first interval, 15 into fourth etc.
So far I had the following base ideas:
I was thinking whether there is a scope to combine the two approaches. One other idea is to sort based on the duration and insert all the intervals into a tree, with comparison by start date, This, in the worst case when longest durations are in chronological order, this approach is equal in performance to 2.
The ideal solution I imagined would be to have a tree (or some similar data structure) that would contain the longest interval on the top, then the two branches would have the next two longest intervals etc. However, I see no way to branch in that tree i.e. since we make an explicit assumption that we insert based on the length, we cannot really discard left or right side of the tree.
Any comments would be greatly appreciated.
A naive combination of the two approaches offer a O(1)
in most cases, and O(logN)
worst case, but the hidden constant in the big O notation will be double:
intervals
lengths
intervals
using binary search, and in lengths
using linear search. The search will be done in parallel1, and the first that finishes up - will cause the other one to stop as well.Because in step 3, we may do up to c*log(n)
steps of binary search in intervals
, we will do up to 2*c*log(n)
steps overall. If we find the element faster in lengths
, it will cause the binary search to end in the middle, and we will get the reduced number of operations, (but with a double constant then the original approach).
(1) No parallel computer is needed, it can be achieved by searching one step in each, simulating parallel computing on a single thread, until an answer is found. (General information, not needed for understanding the answer: The concept of "searching in parallel" was introduced by S.Even, A.Itai and A.Shamir in their article On the Complexity of TimeTable and Multi Commodity Flow Problems)
You could prioritise certain nodes in binary search tree and still have it balanced:
Start with your combined approach (place longer intervals closer to the top) and employ a series of rotations to make the binary search tree balanced (while breaking some prioritisation, it should preserve high-priority nodes close to the root if they don't break the balance).
To balance a tree, try this strategy (untested):
This should produce an AVL-balanced tree that respects the original priorities where possible.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With