Say I have an ordered array with lots of duplicates:
var array = [ 1, 1, 1, 1, 1,
2, 2, 2, 2, 2,
3, 3, 3, 3, 3,
4, 4, 4, 4, 4,
5, 5, 5, 5, 5, ];
I also have code to perform a binary search for the index of the closest value within a sorted array:
function binaryClosestIndexOf(array, value) {
var mid,
lo = 0,
hi = array.length - 1;
while (hi - lo > 1) {
mid = (lo + hi) >>> 1;
if (array[mid] > value)
hi = mid;
else
lo = mid;
}
if (value - array[lo] <= array[hi] - value)
return lo;
else
return hi;
}
Performing a few example searches unveils my issue:
binaryClosestIndexOf(array, 3.5);
> 14 // array[14] = 3
binaryClosestIndexOf(array, 3.50001);
> 15 // array[15] = 4
binaryClosestIndexOf(array, 3.9);
> 15 // array[15] = 4
binaryClosestIndexOf(array, 4);
> 19 // array[19] = 4
binaryClosestIndexOf(array, 4.49999);
> 19 // array[19] = 4
As we can see, there is no issue with the algorithm, it does return the closest value. But it returns an interesting mixture of indices, from the leftest to the rightest.
I want to get the leftest duplicate index. I could introduce an O(n) search after the binary search, iterating through each value in the array after until a value is found that is smaller than the current value. I don't want to do this.
Is there a way to elegantly perform a binary search that will end up with the leftest duplicate value? Bonus points for an algorithm for the rightest value, too!
Being a binary search, if you search for an exact value, you are not promised any location (rightest or leftest), it could be in the middle.
Since binary search works by having a sorted list and reducing by factors of two finding an edge index could be difficult.
I can think of two approaches
If your list is long, I think the expected run time for option 2 would actually be longer than option 1 because 2*log(n) would be > log(n) + a constant unless you know there will be lots of duplicates.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With