Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to find kth largest number in pairwise sums like setA + setB?

Tags:

algorithm

Here are two integers set, say A and B and we can get another set C, in which every element is sum of element a in A and element b in B.

For example, A = {1,2}, B = {3,4} and we get C = {4, 5, 6} where 4=1+3, 5=1+4=2+3, 6=2+4

Now I want to find out which number is the kth largest one in set C, for example 5 is 2nd largest one in above example.

Is there a efficient solution?

I know that pairwise sums sorting is an open problem and has a n^2 lower time bound. But since only kth largest number is needed, maybe we can learn from the O(n) algorithm of finding median number in an unsorted array.

Thanks.

like image 626
ibread Avatar asked Sep 17 '09 04:09

ibread


2 Answers

If k is very close to 1 or N, any algorithm that generates the sorted sums lazily could simply be run until the kth or N-kth item pops out.

In particular, I'm thinking of best-first search of the following space: (a,b) means the ath item from A, the first list, added to the bth from B, the second.

Keep in a best=lowest priority queue pairs (a,b) with cost(a,b) = A[a]+B[b].

Start with just (1,1) in the priority queue, which is the minimum.

Repeat until k items popped: 
 pop the top (a,b)
 if a<|A|, push (a+1,b)
 if a=1 and b<|B|, push (a,b+1)

This gives you a saw-tooth comb connectivity and saves you from having to mark each (a,b) visited in an array. Note that cost(a+1,b)>=cost(a,b) and cost(a,b+1)>=cost(a,b) because A and B are sorted.

Here's a picture of a comb to show the successor generation rule above (you start in the upper left corner; a is the horizontal direction):

|-------
|-------
|-------

It's just best-first exploration of (up to) all |A|*|B| tuples and their sums.

Note that the most possible items pushed before popping k is 2*k, because each item has either 1 or 2 successors. Here's a possible queue state, where items pushed into the queue are marked *:

|--*----
|-*-----
*-------

Everything above and to the left of the * frontier has already been popped.

For the N-k<k case, do the same thing but with reversed priority queue order and exploration order (or, just negate and reverse the values, get the (N-k)th least, then negate and return the answer).

See also: sorted list of pairwise sums on SO, or the Open problems project.

like image 127
Jonathan Graehl Avatar answered Oct 27 '22 10:10

Jonathan Graehl


Sort arrays A & B : O(mlogm + nlogn) Apply a modified form of algorithm for merging 2 sorted arrays : O(m+n) i.e. at each point, u sum the the two elements. When u have got (m+n-k+1)th element in C, stop merging. That element is essentially kth largest. E.g. {1,2} & {3,4} : Sorted C: {1+3,(1+4)|(2+3),2+4}

like image 4
avd Avatar answered Oct 27 '22 10:10

avd