
O(log n) algorithm to find the element having rank i in union of pre-sorted lists

Tags: c++, algorithm

Given two sorted lists, each containing n real numbers, is there an O(log n) time algorithm to compute the element of rank i (where i corresponds to the index in increasing order) in the union of the two lists, assuming the elements of the two lists are distinct?

EDIT: @Ben: This is what I have been doing, but I am still not getting it.

Here is an example:

List A: 1, 3, 5, 7
List B: 2, 4, 6, 8

Find the element of rank i = 4.

First step: i/2 = 2, so the lists are cut down to A: 1, 3 and B: 2, 4

         Compare A[i] to B[i]: A[i] is less, so the lists now become:

                   A: 3
                   B: 2, 4

Second step: i/2 = 1

         List A now contains A: 3
         List B now contains B: 2

         Now I have lost the value 4, which is actually the result.

I know I am missing something, but even after close to a day of thinking I just can't figure this one out.

Eternal Learner asked Apr 24 '10 02:04



1 Answer

Yes:

You know the element lies within either index [0,i] of the first list or [0,i] of the second list. Take element i/2 from each list and compare. Proceed by bisection.

I'm not including any code because this problem sounds a lot like homework.

EDIT: Bisection is the method behind binary search. It works like this:

Assume i = 10 (zero-based indexing, so we're looking for the 11th element overall).

On the first step, you know the answer is either in list1(0...10) or list2(0...10). Take a = list1(5) and b = list2(5).

If a > b, then there are 5 elements in list1 which come before a, and at least 6 elements in list2 which come before a. That puts at least 11 elements ahead of a, so a is an upper bound on the result. Likewise there are 5 elements in list2 which come before b and fewer than 6 elements in list1 which come before b, so at most 10 elements precede b and b is a lower bound on the result. Now we know that the result is either in list1(0..5) or list2(5..10). If a < b, then by symmetry the result is either in list1(5..10) or list2(0..5). And if a == b we have our answer (but the problem said the elements were distinct, therefore a != b).

We just repeat this process, cutting the size of the search space in half at each step. Bisection refers to the fact that we choose the middle element (bisector) out of the range we know includes the result.

So the only difference between this and binary search is that in binary search we compare to a value we're looking for, but here we compare to a value from the other list.
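
A minimal C++ sketch of this kind of bisection, assuming both lists are sorted `std::vector<double>` and i is zero-based. The name `kth_of_union` is made up for illustration, and the sketch uses the common "discard about half of the remaining rank at each step" formulation rather than tracking the exact index ranges from the walkthrough above:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the element with zero-based rank i in the union of two sorted vectors.
// Hypothetical helper for illustration; assumes a and b are sorted ascending.
double kth_of_union(const std::vector<double>& a,
                    const std::vector<double>& b,
                    std::size_t i) {
    assert(i < a.size() + b.size());
    std::size_t ia = 0, ib = 0;  // everything before these offsets precedes the answer
    std::size_t k = i + 1;       // the answer is the k-th smallest (1-based) of what remains
    while (true) {
        if (ia == a.size()) return b[ib + k - 1];  // a exhausted: index straight into b
        if (ib == b.size()) return a[ia + k - 1];  // b exhausted: index straight into a
        if (k == 1) return std::min(a[ia], b[ib]);

        // Probe about k/2 elements ahead in each list, clamped to the list length.
        std::size_t half = k / 2;
        std::size_t na = std::min(half, a.size() - ia);
        std::size_t nb = std::min(half, b.size() - ib);

        // The list with the smaller probe cannot hold the answer in its probed
        // prefix, so that prefix is discarded and k shrinks by its length.
        if (a[ia + na - 1] < b[ib + nb - 1]) {
            ia += na;
            k -= na;
        } else {
            ib += nb;
            k -= nb;
        }
    }
}
```

With the lists from the question (A: 1, 3, 5, 7 and B: 2, 4, 6, 8), the 4th smallest element corresponds to zero-based i = 3, and `kth_of_union(A, B, 3)` yields 4. The value is never lost because only the prefix with the smaller probe is thrown away; the other list is left untouched for that step.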

NOTE: this is actually O(log i), which is at least as good as O(log n). Furthermore, for small i (perhaps i < 100), it would actually be fewer operations to merge the first i elements (linear search instead of bisection) because that is so much simpler. When you add in cache behavior and data locality, the linear search may well be faster for i up to several thousand.
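
For comparison, the linear alternative could look roughly like this. It is only a sketch under the same assumptions (sorted `std::vector<double>` inputs, zero-based i), and `kth_by_merge` is a made-up name:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Walks the two sorted vectors as a merge would, without storing the merged
// output, and stops once i elements have been skipped. O(i) time.
// Hypothetical helper for illustration.
double kth_by_merge(const std::vector<double>& a,
                    const std::vector<double>& b,
                    std::size_t i) {
    assert(i < a.size() + b.size());
    std::size_t ia = 0, ib = 0;
    // Skip the i smallest elements, one per iteration.
    for (std::size_t skipped = 0; skipped < i; ++skipped) {
        if (ib == b.size() || (ia < a.size() && a[ia] < b[ib])) {
            ++ia;
        } else {
            ++ib;
        }
    }
    // The next element in merge order is the one with rank i.
    if (ib == b.size() || (ia < a.size() && a[ia] < b[ib])) return a[ia];
    return b[ib];
}
```

The loop touches memory sequentially and has no index arithmetic beyond two counters, which is the simplicity the note above refers to.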

Also, if i > n, rely on the fact that the result has to be toward the end of either list: your initial candidate range in each list runs from index i-n to the end of the list, i.e. (i-n)..(n-1) with zero-based indexing.
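
One way to see that clamped starting range in code is a variant that binary-searches on how many of the i elements preceding the answer come from the first list. This is a different formulation from the midpoint comparison described earlier, shown only as a sketch; `kth_by_partition` is a made-up name and the inputs are assumed to be sorted `std::vector<double>` with zero-based i:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Binary search on ca = how many of the i elements before the answer come from a.
// The other list then contributes cb = i - ca. Hypothetical helper for illustration.
double kth_by_partition(const std::vector<double>& a,
                        const std::vector<double>& b,
                        std::size_t i) {
    assert(i < a.size() + b.size());
    // b can supply at most b.size() of the i preceding elements, so a must
    // supply at least i - b.size(); this is the (i - n) lower end of the range.
    std::size_t lo = i > b.size() ? i - b.size() : 0;
    std::size_t hi = std::min(i, a.size());
    while (lo < hi) {
        std::size_t ca = lo + (hi - lo) / 2;
        std::size_t cb = i - ca;  // here 1 <= cb <= b.size(), so b[cb - 1] is valid
        if (a[ca] < b[cb - 1]) {
            lo = ca + 1;  // a[ca] also precedes the answer: take more from a
        } else {
            hi = ca;      // taking ca elements from a is enough
        }
    }
    std::size_t ca = lo, cb = i - lo;
    // The answer is the smaller of the first untaken element in each list.
    if (cb == b.size() || (ca < a.size() && a[ca] < b[cb])) return a[ca];
    return b[cb];
}
```

For two lists of 4 elements and i = 6, the search starts with `lo = 2` rather than 0, since at least two of the six preceding elements have to come from the first list.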

Ben Voigt answered Oct 20 '22 17:10
