Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find interval containing given time instant

I am trying to solve the following problem: given N time intervals, each specified as (start, end), non-overlapping, sorted based on start - find an interval that contains a given date. For instance given:

[1,4] [5,8] [9,10][11,20]

3 falls into the first interval, 15 into fourth etc.

So far I had the following base ideas:

  1. We can use binary seach to find the corresponding interval (Log N)
  2. Since it might be the case that only a few intervals are big, and the rest small, it might be worthwhile sorting the itervals based on their duration. Then, statistically, most of the time we would 'hit' the longest intervals (O(1)), only sometimes this would result in the worst case complexity of N.

I was thinking whether there is a scope to combine the two approaches. One other idea is to sort based on the duration and insert all the intervals into a tree, with comparison by start date, This, in the worst case when longest durations are in chronological order, this approach is equal in performance to 2.

The ideal solution I imagined would be to have a tree (or some similar data structure) that would contain the longest interval on the top, then the two branches would have the next two longest intervals etc. However, I see no way to branch in that tree i.e. since we make an explicit assumption that we insert based on the length, we cannot really discard left or right side of the tree.

Any comments would be greatly appreciated.

like image 543
Bober02 Avatar asked Dec 24 '12 09:12

Bober02


2 Answers

A naive combination of the two approaches offer a O(1) in most cases, and O(logN) worst case, but the hidden constant in the big O notation will be double:

  1. Let the original array be intervals
  2. Create a second array, as suggested sorted by lengths of intervals descending. Let it be lengths
  3. For each query, search in both intervals using binary search, and in lengths using linear search. The search will be done in parallel1, and the first that finishes up - will cause the other one to stop as well.

Because in step 3, we may do up to c*log(n) steps of binary search in intervals, we will do up to 2*c*log(n) steps overall. If we find the element faster in lengths, it will cause the binary search to end in the middle, and we will get the reduced number of operations, (but with a double constant then the original approach).


(1) No parallel computer is needed, it can be achieved by searching one step in each, simulating parallel computing on a single thread, until an answer is found. (General information, not needed for understanding the answer: The concept of "searching in parallel" was introduced by S.Even, A.Itai and A.Shamir in their article On the Complexity of TimeTable and Multi Commodity Flow Problems)

like image 122
amit Avatar answered Sep 20 '22 10:09

amit


You could prioritise certain nodes in binary search tree and still have it balanced:

Start with your combined approach (place longer intervals closer to the top) and employ a series of rotations to make the binary search tree balanced (while breaking some prioritisation, it should preserve high-priority nodes close to the root if they don't break the balance).

To balance a tree, try this strategy (untested):

  • Balance the left half (subtree of the root)
  • Balance the right half
  • If the right half is higher than the left half by more than 1 (AVL balance condition
    • While this is true
      • If the right-left quarter (left half of right half) is higher than the right-right quarter, rotate the right subtree to the right (start an AVL double rotation)
      • Rotate the tree to the left (complete an AVL rotation)
      • Balance the left half (both left quarters are already balanced - start with #3)
  • Else if the left half is higher than the right half by more than 1
    • do the converse of the other case

This should produce an AVL-balanced tree that respects the original priorities where possible.

like image 36
John Dvorak Avatar answered Sep 19 '22 10:09

John Dvorak