We have been given an array of size N that contains integers in the range 0 to N-2, both inclusive.
The array can have multiple repeated entries. We need to find one of the duplicated entries in O(N) time and constant space.
I was thinking of taking the product and sum of all the entires in the array, and the product and sum of all the numbers in the range 0 to N-2.
Then, the difference of the sums and the division of the products would give us two equations. This approach would work if it were given that there are only two repeated entries, but since there can be more than two, I think my approach fails.
Any suggestions?
Edit: The array is immutable. I realize that this is an important piece of information and I apologize that I forgot to include this earlier.
Here's a nice treatment. It passes through some easier problems before addressing this one.
http://aperiodic.net/phil/archives/Geekery/find-duplicate-elements.html
It contains a solution for when you can modify the input array, and another for when you can't.
Brief summary in case the link ever dies: the array indexes run from 0 .. N-1, and the array values run 0 .. N-2. Each array element can therefore be considered as an index (or "pointer") into the array itself: element i
"points to" element ra[i]
, ra[i]
points to ra[ra[i]]
and so on. By repeatedly following these pointers, me must eventually enter a cycle, because we certainly can't go forever without revisiting some node or other.
Now, the very last element, N-1, isn't pointed to by any other element. So if we start there and eventually enter a cycle, somewhere along the way there must be an element which can be reached from two different places: the route we took the first time, and the route which is part of the cycle. Something like this:
N-1 -> a1 -> a2 -> a3
^ \
/ v
a6 <- a5 <- a4
In this case, a2 is reachable from two different places.
But a node which is reachable from two different places is precisely what we're looking for, a duplicate in the array (two different array elements containing the same value).
The question then is how to identify a2, and the answer is to use Floyd's cycle-finding algorithm. In particular it tells us the "start" of the loop in O(N) time and O(1) space.
Assuming we are allowed to change array in place swap each element as you go through the array with the element on that "position" (e.g. if current element is curr then swap it with a[curr]) but if a[curr] already has curr then you know curr is a duplicate.
a = array...
for i = 0; i < length(a); i++
curr = a[i]
if a[curr] == curr:
return duplicate curr
swap(a[i], a[curr])
# Now a[curr] == curr and so if it happens again we know it is a duplicate.
This would be O(n) and constant space.
Scan the array and add each element to a set. If the item already exists in the set - you have a dupe.
Initialize a bit array of size N-2 with all entries to 0. Each index will represent all of your items in the range 0 to N-2.
Loop through your array and add items to your bitarray by setting bitarray[number] == 1
. If element already contains a 1, then you've already added your element, return immediately.
If you get to the end of your array without finding a duplicate, return -1.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With