Here is one way of doing this in O(m+n)
where m
and n
are lengths of two arrays:
import random
def comm_seq(arr_1, arr_2):
if len(arr_1) == 0 or len(arr_2) == 0:
return []
m = len(arr_1) - 1
n = len(arr_2) - 1
if arr_1[m] == arr_2[n]:
return comm_seq(arr_1[:-1], arr_2[:-1]) + [arr_1[m]]
elif arr_1[m] < arr_2[n]:
return comm_seq(arr_1, arr_2[:-1])
elif arr_1[m] > arr_2[n]:
return comm_seq(arr_1[:-1], arr_2)
if __name__ == "__main__":
arr_1 = [random.randrange(0,5) for _ in xrange(10)]
arr_2 = [random.randrange(0,5) for _ in xrange(10)]
arr_1.sort()
arr_2.sort()
print comm_seq(arr_1, arr_2)
Is there a technique that in some cases uses less than O(m+n)
comparisons? For example: arr_1=[1,2,2,2,2,2,2,2,2,2,2,100]
and arr_2=[1,3,100]
(Not looking for the hash table implementation)
A binary search algorithm requires O(logm)
time to find a number in an array with length m.
Therefore, if we search each number of an array with length n from an array with length m, its overall time complexity is O(nlogm)
. If m is much greater than n, O(nlogm)
is actually less than O(m+n)
. Therefore, we can implement a new and better solution based on binary search in such a situation. source
However, this does not necessarily means binary search is better in than O(m+n) case. In fact, binary search approach is only better when n << m (n is very small compared to m).
As far as I know, there are a few different ways to solve this problem, but none of them are better than O(m + n). I don't know how you can have an algorithm faster than that (barring weird quantum computing answers), because you have to compare all the elements in both arrays or you might miss a duplicate.
Brute Force Use two nested for loops. Take every element from the first array and linear search it in the second array. O(M*N) time, O(1) space
Map Lookup Use a lookup structure like a hashtable or a binary search tree. Put all of the first array into the map structure, then loop through all of the second array and look up each element in the map to see if it exists. This works whether the arrays are sorted or not. O(M*log(M) + N*log(M)) for Binary Search Tree time or O(M + N) time for Hashtable, both are O(M) space.
Binary Search Like brute force, but take every element from the first array and binary search it in the second array. O(m*log(N)) time, O(1) space
Parallel Walk Like the merge part of merge sort. Have two pointers start at the front of each of the arrays. Compare the two elements, if they're equal store the duplicate, otherwise advance the pointer to the smaller value by one spot and repeat until you hit the end of one of the arrays. O(M + N) time, O(1) space
Regardless, you must examine every element in both arrays or you won't know if you've found all the duplicates. You could argue fringe cases where one array is a lot bigger or a lot smaller, but that won't hold for an alogrithm where you're considering all ranges of input.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With