Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

The intersection of two sorted arrays

Given two sorted arrays: A and B. The size of array A is La and the size of array B is Lb. How to find the intersection of A and B?

If La is much bigger than Lb, then will there be any difference for the intersection finding algorithm?

like image 922
user288609 Avatar asked Mar 08 '10 08:03

user288609


People also ask

What is the intersection of two arrays?

The intersection is a list of common elements present in both arrays. The elements in the output can be in any order.

How do you find the intersection of two arrays in C?

Here, we created two arrays arr1, arr2 with 5 integer elements. Then we find the intersection of both arrays using the intersection() function and assign the result into the arr3 array. The intersection() function is a user-defined function. After that, we printed the intersected elements on the console screen.

What is union and intersection of array?

The union of two arrays will contain all the elements of the two arrays, the common elements will appear only once instead of twice. The intersection of the two arrays will contain the common elements of the two arrays.


2 Answers

Since this looks like a HW...I'll give you the algorithm:

Let arr1,arr2 be the two sorted arrays of length La and Lb.
Let i be index into the array arr1.
Let j be index into the array arr2.
Initialize i and j to 0.

while(i < La and j < Lb) do

    if(arr1[i] == arr2[j]) { // found a common element.
        print arr[i] // print it.
        increment i // move on.
        increment j
    }
    else if(arr1[i] > arr2[j])
        increment j // don't change i, move j.
    else
        increment i // don't change j, move i.
end while
like image 116
codaddict Avatar answered Oct 12 '22 00:10

codaddict


I've been struggling with same problem for a while now, so far I came with:

  1. Linear matching which will yield O(m+n) in worst case. You basically keep two pointers (A and B) each pointing to beginning of each array. Then advance pointer which points to smaller value, until you reach end of one of arrays, that would indicate no intersection. If at any point you have *A == *B - here comes your intersection.

  2. Binary matching. Which yields ~ O(n*log(m)) in worst case. You basically pick smaller array and perform binary search in bigger array of all elements of the smaller array. If you want to be more fancy, you can even use last position where binary search failed and use it as starting point for next binary search. This way you marginally improve worst case but for some sets it might perform miracles :)

  3. Double binary matching. It's a variation of regular binary matching. Basically you get element from the middle of smaller array and do binary search in bigger array. If you find nothing then you cut smaller array in half (and yes you can toss element you already used) and cut bigger array in half (use binary search failure point). And then repeat for each pair. Results are better than O(n*log(m)) but I am too lazy to calculate what they are.

Those are two most basic ones. Both have merits. Linear is a bit easier to implement. Binary one is arguably faster (although there are plenty of cases when linear matching will outperform binary).

If anyone knows anything better than that I would love to hear it. Matching arrays is what I do these days.

P.S. don't quote me on terms "linear matching" and "binary matching" as I made them up myself and there are probably fancy name for it already.

like image 43
Nazar Avatar answered Oct 12 '22 00:10

Nazar