Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find common elements in N sorted arrays with no extra space

Given N arrays with sizeof N, and they are all sorted, if it does not allow you to use extra space, how will find their common datas efficiently or with less time complexity?

For ex:

 1. 10 160 200 500 500
 2. 4 150 160 170 500
 3. 2 160 200 202 203
 4. 3 150 155 160 300
 5. 3 150 155 160 301

This is an interview question, I found some questions which were similar but they didn't include the extra conditions of input being sorted or not being able to use extra memory.

I couldn't think of any solution less than O(n^2 lg n) complexity. In that case, I'd prefer to go with the simplest solution which gives me this complexity, which is:

  not_found_flag = false

  for each element 'v' in row-1
       for each row 'i' in the remaining set
           perform binary search for 'v' in 'i'
           if 'v' not found in row 'i'
                 not_found_flag = true
                 break
       if not not_found_flag 
           print element 'v' as it is one of the common element 

We could improve this by comparing the min and max of each row and decide based on that whether it is possible for a number 'num' to fall between 'min_num' and 'max_num' of that row.

Binary search -> O(log n) For searching 1 num in n-1 rows : O(nlogn) Binary search for each number in first row : O(n2logn)

I selected first row, we can pick any row and if a no element of the row picked is found in any of the (N-1) rows then we don't really have common data.

like image 393
user1071840 Avatar asked Feb 23 '13 03:02

user1071840


2 Answers

It seems this can be done in O(n^2); i.e., just looking at each element once. Note that if an element is common to all the arrays then it must exist in any one of them. Also for purposes of illustration (and since you used the for loop above) I will assume we can keep an index for each of the arrays, but I'll talk about how to get around this later.

Let's call the arrays A_1 through A_N, and use indices starting at 1. Pseudocode:

# Initial index values set to first element of each array
for i = 1 to N:
  x_i = 1 

for x_1 = 1 to N:
  val = A_1[x_1] 
  print_val = true
  for i = 2 to N:
    while A_i[x_i] < val:
      x_i = x_i + 1
    if A_i[x_i] != val:
      print_val = false
  if print_val:
    print val

Explanation of algorithm. We use the first array (or any arbitrary array) as the reference algorithm, and iterate through all the other arrays in parallel (kind of like the merge step of a merge sort, except with N arrays.) Every value of the reference array that is common to all the arrays must be present in all the other arrays. So for each other array (since they are sorted), we increase the index x_i until the value at that index A_i[x_i] is at least the value we are looking for (we don't care about lesser values; they can't be common.) We can do this since the arrays are sorted and thus monotonically nondecreasing. If all the arrays had this value, then we print it, otherwise we increment x_1 in the reference array and keep going. We have to do this even if we don't print the value.

By the end, we've printed all the values that are common to all the arrays, while only having examined each element once.

Getting around the extra storage requirement. There are many ways to do this, but I think the easiest way would be to check the first element of each array and take the max as the reference array A_1. If they are all the same, print that value, and then store the indices x_2 ... x_N as the first element of each array itself.

Java implementation (for brevity, without the extra hack), using your example input:

public static void main(String[] args) {
    int[][] a = {
            { 10, 160, 200, 500, 500, },
            { 4, 150, 160, 170, 500, },
            { 2, 160, 200, 202, 203, },
            { 3, 150, 155, 160, 300 },
            { 3, 150, 155, 160, 301 } };

    int n = a.length;
    int[] x = new int[n];

    for( ; x[0] < n; x[0]++ ) {
        int val = a[0][x[0]]; 
        boolean print = true;
        for( int i = 1; i < n; i++ ) {
            while (a[i][x[i]] < val && x[i] < n-1) x[i]++;              
            if (a[i][x[i]] != val) print = false;               
        }   
        if (print) System.out.println(val);
    }   
}

Output:

160
like image 158
2 revs Avatar answered Nov 05 '22 09:11

2 revs


This is a solution in python O(n^2), uses no extra space but destroys the lists:

def find_common(lists):
    num_lists = len(lists)
    first_list = lists[0]
    for j in first_list[::-1]:
        common_found = True
        for i in range(1,num_lists):
            curr_list = lists[i]
            while curr_list[len(curr_list)-1] > j:
                curr_list.pop()
            if curr_list[len(curr_list)-1] != j:
                common_found = False
                break
        if common_found:
            return j
like image 39
user2214455 Avatar answered Nov 05 '22 09:11

user2214455