Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to find the kth smallest element in the union of two sorted arrays?

This is a homework question, binary search has already been introduced:

Given two arrays, respectively N and M elements in ascending order, not necessarily unique:
What is a time efficient algorithm to find the kth smallest element in the union of both arrays?

They say it takes O(logN + logM) where N and M are the arrays lengths.

Let's name the arrays a and b. Obviously we can ignore all a[i] and b[i] where i > k.
First let's compare a[k/2] and b[k/2]. Let b[k/2] > a[k/2]. Therefore we can discard also all b[i], where i > k/2.

Now we have all a[i], where i < k and all b[i], where i < k/2 to find the answer.

What is the next step?

like image 617
Michael Avatar asked Jan 05 '11 18:01

Michael


People also ask

How do you find KTH smallest or largest element in an unsorted array?

Approach: Using Max Heap Observe the following algorithm. Step 1: Using the first k elements of the input array (a[0], …, a[k - 1], create a Max-Heap. Step 2: Compare each element that is coming after the k'th element (a[k] to a[n - 1]) with the root element of the max-heap.


2 Answers

I hope I am not answering your homework, as it has been over a year since this question was asked. Here is a tail recursive solution that will take log(len(a)+len(b)) time.

Assumption: The inputs are correct, i.e., k is in the range [0, len(a)+len(b)].

Base cases:

  • If length of one of the arrays is 0, the answer is kth element of the second array.

Reduction steps:

  • If mid index of a + mid index of b is less than k:
    • If mid element of a is greater than mid element of b, we can ignore the first half of b, adjust k.
    • Otherwise, ignore the first half of a, adjust k.
  • If k is less than sum of mid indices of a and b:
    • If mid element of a is greater than mid element of b, we can safely ignore second half of a.
    • Otherwise, we can ignore second half of b.

Code:

def kthlargest(arr1, arr2, k):     if len(arr1) == 0:         return arr2[k]     elif len(arr2) == 0:         return arr1[k]      mida1 = len(arr1) // 2  # integer division     mida2 = len(arr2) // 2     if mida1 + mida2 < k:         if arr1[mida1] > arr2[mida2]:             return kthlargest(arr1, arr2[mida2+1:], k - mida2 - 1)         else:             return kthlargest(arr1[mida1+1:], arr2, k - mida1 - 1)     else:         if arr1[mida1] > arr2[mida2]:             return kthlargest(arr1[:mida1], arr2, k)         else:             return kthlargest(arr1, arr2[:mida2], k) 

Please note that my solution is creating new copies of smaller arrays in every call, this can be easily eliminated by only passing start and end indices on the original arrays.

like image 184
lambdapilgrim Avatar answered Sep 18 '22 00:09

lambdapilgrim


You've got it, just keep going! And be careful with the indexes...

To simplify a bit I'll assume that N and M are > k, so the complexity here is O(log k), which is O(log N + log M).

Pseudo-code:

i = k/2 j = k - i step = k/4 while step > 0     if a[i-1] > b[j-1]         i -= step         j += step     else         i += step         j -= step     step /= 2  if a[i-1] > b[j-1]     return a[i-1] else     return b[j-1] 

For the demonstration you can use the loop invariant i + j = k, but I won't do all your homework :)

like image 28
Jules Olléon Avatar answered Sep 20 '22 00:09

Jules Olléon