Is there a way to find all the duplicate elements in an array of N elements in O(N) time?
Example:
Input: 11, 29, 81, 14, 43, 43, 81, 29
Output: 29, 81, 43
Sorting the input and doing a linear scan to detect duplicates destroys the order and gives the output: 29,43,81.
Sorting-by-key another array of indices {0,1,...N-1}
according to the given array to get {1,4,2}
and then sorting the resultant set of indices to get {1,2,4}
will give us {29,81,43}
, but this takes O(N logN)
time.
Is there an O(N) algorithm to solve this problem?
P.S. I forgot to add: I dont want to use hash tables. I am looking for a non-hash solution.
Duplicate elements can be found using two loops. The outer loop will iterate through the array from 0 to length of the array. The outer loop will select an element. The inner loop will be used to compare the selected element with the rest of the elements of the array.
So the time complexity is O(n).
The two common approaches to solve the problem are: Using a HashSet - populate it while iterating and abort if you find a match - O(n) time on average and O(n) space. Sort and iterate, after the array is sorted, duplicates will be adjacent to each other and easy to detect.
I believe a good solution (decent memory usage, can be used to immediately determine if an entry has already been seen thus preserving order, and with a linear complexity) is a trie.
If you insert the elements into the trie as if they were a string with each digit (starting from the MSD) in each node, you can pull this off with a complexity of O(m N) where m is the average length of numbers in base-10 digits.
You'd just loop over all your entries and insert them into the trie. Each time an element already exists, you skip it and move on to the next. Duplicates in this (unlike in my previous answer of a Radix Sort) will be found immediately instead of in the last iteration or what not.
I'm not sure if you would benefit from using a suffix tree here, as the "base" of the characters being entered into the trie is only 10 (compared to the base-128 for ANSI strings), but it's possible.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With