Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding the kth-smallest element in union of sorted arrays

I was studying the article on finding the kth-smallest element in the union of two sorted arrays at leetcode. I don't think that the algorithm is correct. There is this line: We make an observation that when Ai < Bj, then it must be true that Ai < Bj-1. On the other hand, if Bj < Ai, then Bj < Ai-1.. How can it be true for any i and j?

And secondly, this line also baffles me: We try to approach this tricky problem by comparing middle elements of A and B, which we identify as Ai and Bj. If Ai is between Bj and Bj-1, we have just found the i+j+1 smallest element, although it comes out to be true. Can anyone explain the reason? I really want to understand the algo, I have done it by merging the arrays, but that takes O(N) time, compared to O(log N) time here.

like image 654
SexyBeast Avatar asked Dec 20 '22 16:12

SexyBeast


1 Answers

You're interpreting these statements in isolation, but they build on one another. Here's the text that (I think) you're referring to:

Maintaining the invariant i + j = k – 1, If Bj-1 < Ai < Bj, then Ai must be the k-th smallest, or else if Ai-1 < Bj < Ai, then Bj must be the k-th smallest. If one of the above conditions are satisfied, we are done. If not, we will use i and j as the pivot index to subdivide the arrays. But how? Which portion should we discard? How about Ai and Bj itself?

We make an observation that when Ai < Bj, then it must be true that Ai < Bj-1. On the other hand, if Bj < Ai, then Bj < Ai-1. Why?

Breaking this down into sub-propositions yields the following interpretation (keeping in mind that indexing starts from 0, but that A0 is the first smallest item, and A1 is the second smallest item, and so on):

  1. i + j = k - 1 (invariant, by definition)
  2. Posit that Bj-1 < Ai < Bj. Then Ai must be the kth smallest. This is because Ai is greater than i items in A and is greater than j items in B. So it's greater than a total of i + j = k - 1 items. That means its index in a merged A|B list would be k - 1, and so it would be the kth item in that list.
  3. Posit that Ai-1 < Bj < Ai. Then Bj must be the kth smallest, by the same line of reasoning in 2.
  4. Now posit that both (a) Bj-1 < Ai < Bj and (b) Ai-1 < Bj < Ai are false. Then it follows quite obviously that if Ai < Bj then A1 < Bj-1, because otherwise (a) would be true. And likewise, if Bj < Ai then Bj < Ai-1, because otherwise, (b) would be true.

I'm taking you at your word that you want an explanation of these statements rather than of the algorithm as a whole. (But I'll say more if you like.)

Note also that, as Daniel Fischer's answer reminds me, the above reasoning only holds if there are no duplicates; call that proposition 0.

like image 171
senderle Avatar answered Mar 15 '23 22:03

senderle