Does anyone know a faster-than-linear algorithm for finding a duplicate in a sequential list of numbers? I'm working in Java now but any language or psuedo-code is fine.
For example, given this int[] input:
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 7 | 8 | 9
Output would be either index or value '7'.
I know the obvious traversal at O(n)
linear time, but I'm trying to see if this is possible via binary search at O(log n)
time.
Short answer: you don't, it is not its purpose. A binary search only gives you the position of the value you want, or the position of 1 of them if duplicated. To display all duplicates and indexes, you need to do a secondary search around the position returned by binary search routine.
Binary Search is a searching algorithm for finding an element's position in a sorted array. In this approach, the element is always searched in the middle of a portion of an array. Binary search can be implemented only on a sorted list of items. If the elements are not sorted already, we need to sort them first.
Binary search looks for a particular item by comparing the middle most item of the collection. If a match occurs, then the index of item is returned. If the middle item is greater than the item, then the item is searched in the sub-array to the left of the middle item.
If you assume the numbers must start at 0 and be increasing by 1 you can compare the middle to the index. If the middle is the same go higher, if the middle is not, go lower.
This will give you binary search time O(log2 N). The only difference is that you are comparing with the index, rather than a fixed value.
public static void main(String... args) {
int[] array = {0, 1, 2, 3, 4, 5, 6, 7, 7, 8, 9};
int duplicate = findDuplicate(array);
System.out.println(duplicate);
}
private static int findDuplicate(int[] array) {
int low = 0;
int high = array.length - 1;
while (low <= high) {
int mid = (low + high) >>> 1;
int midVal = array[mid];
if (midVal == mid)
low = mid + 1;
else
high = mid - 1;
}
return high;
}
Notice that binary search is meant to work on sorted lists. So if you have a sorted list with duplicates, binary search will only be useful if your duplicates are adjacent. The importance of being adjacent is so that you can test the presence of the key at the previous and next position of the found key. Any other way of trying to use binary search on unsorted lists will give incorrect results.
Here is a bit of code to show what I mean.
import java.util.Arrays;
public class Main {
public static void main(String[] args) {
int[] list = {1, 2, 3, 4, 5, 6, 7, 7, 8, 9 };
int key = 7;
int result = Arrays.binarySearch(list, key);
System.out.println(result);
if( list[result+1] == key || list[result-1] == key )
System.out.println("yes we have a duplicate.");
}
}
The comparison in the if
being O(1) we have remain with the O(logn) of binary search.
public class DuplicateNumber {
public int findDuplicateNumber(List<Integer> numbers){
int highestNumber = numbers.size() - 1;
int total = getSum(numbers);
int duplicate = total - (highestNumber*(highestNumber+1)/2);
return duplicate;
}
public int getSum(List<Integer> numbers){
int sum = 0;
for(int num:numbers){
sum += num;
}
return sum;
}
public static void main(String a[]){
List<Integer> numbers = new ArrayList<Integer>();
for(int i=1;i<30;i++){
numbers.add(i);
}
//add duplicate number into the list
numbers.add(22);
DuplicateNumber dn = new DuplicateNumber();
System.out.println("Duplicate Number: "+dn.findDuplicateNumber(numbers));
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With