A couple of days ago, in an interview I was asked for a program that would find the single duplicate element in a consecutive integer array in O(log n)
time.
The case is somewhat specific, as there are a total of 11 integers (1 to 10, in that order) plus, a single duplicate of any of these numbers, inserted somewhere in between.
I was given a sample similar to this:
{1, 2, 3, 4, 5, 6, 2, 7, 8, 9, 10}
So, now I've come up with the following C code:
#include <stdio.h>
int main (void)
{
int a[11] = {1, 2, 3, 4, 5, 6, 2, 7, 8, 9, 10}, first=0, last=10, mid;
while (1)
{
mid = (first + last)/2;
if (mid+1 == a[mid])
first = mid+1;
else if ((mid == 0) || (mid == 10) || (a[mid+1] - a[mid-1] == 1)) /* Order is Important, (a[mid+1] - a[mid-1] == 1) must be last */
break;
else
last = mid-1;
}
printf("Duplicate element is %d, located at position no. %d.", a[mid], mid+1);
return 0;
}
Would this properly satisfy the O(log n)
criteria? And, are there any alternatives/improvements over this?
Yes, it has O(log n)
time complexity.
It is possible to make the code more clear using the following fact: you need to find the smallest i
such that a[i] != i + 1
, so it can be implemented in a more concise way:
//invariant: the [0...low] prefix does not contain a duplicate
// the [0...high] prefix contains a duplicate
low = 0 //[0...0] prefix obviously does not contain a duplicate
high = 10 //the entire array obviously contains a duplicate
while high - low > 1:
mid = (low + high) / 2
if a[mid] != mid + 1:
high = mid
else:
low = mid
print(a[high], high)
We can modify the binary search algorithm
to get the solution. In binary search
, we have a key and we used this key to find its position by bisecting the array size. Here, we don't have a key, instead we have to find it. But the behavior of duplicate element
can be used to bisect
the array size. How ? lets see:
On carefully looking the data, we can easily see that after inserting the duplicate element at random position (say index
) in consecutive element array the property of elements will change (a[i] == i+1
--> a[i] != i+1
) from the position index
(including the index
). Now this changed property can be used to bisect the array size. Hence, we can find out the duplicate in O(log(n))
.
For example, Consider your given array: {1, 2, 3, 4, 5, 6, 2, 7, 8, 9, 10}
{1, 2, 3, 4, 5, 6, 2, 7, 8, 9, 10}
||
from this position the the property of (a[i] == i+1) will no more satisfied.
This property can be model to bisect the array size in solution.
void binary_duplictae_finder(int a[], int low, int high) {
int mid=(low+high)/2;
if(high - low > 1){
if(a[mid]!=mid+1)
binary_duplictae_finder(a, low, mid);
else
binary_duplictae_finder(a, mid, high);
}
if(high==low+1)
printf("%d ", a[high]);
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With