Find leftest duplicate of binary search result

Question

Say I have an ordered array with lots of duplicates:

var array = [ 1, 1, 1, 1, 1,
              2, 2, 2, 2, 2,
              3, 3, 3, 3, 3,
              4, 4, 4, 4, 4,
              5, 5, 5, 5, 5, ];

I also have code to perform a binary search for the index of the closest value within a sorted array:

function binaryClosestIndexOf(array, value) {
  var mid,
    lo = 0,
    hi = array.length - 1;

  while (hi - lo > 1) {
    mid = (lo + hi) >>> 1;

    if (array[mid] > value)
      hi = mid;
    else
      lo = mid;
  }

  if (value - array[lo] <= array[hi] - value)
    return lo;
  else 
    return hi;
}

Performing a few example searches unveils my issue:

binaryClosestIndexOf(array, 3.5);
> 14 // array[14] = 3
binaryClosestIndexOf(array, 3.50001);
> 15 // array[15] = 4
binaryClosestIndexOf(array, 3.9);
> 15 // array[15] = 4
binaryClosestIndexOf(array, 4);
> 19 // array[19] = 4
binaryClosestIndexOf(array, 4.49999);
> 19 // array[19] = 4

As we can see, there is no issue with the algorithm, it does return the closest value. But it returns an interesting mixture of indices, from the leftest to the rightest.

I want to get the leftest duplicate index. I could introduce an O(n) search after the binary search, iterating through each value in the array after until a value is found that is smaller than the current value. I don't want to do this.

Is there a way to elegantly perform a binary search that will end up with the leftest duplicate value? Bonus points for an algorithm for the rightest value, too!

Loren · Accepted Answer

Being a binary search, if you search for an exact value, you are not promised any location (rightest or leftest), it could be in the middle.

Since binary search works by having a sorted list and reducing by factors of two finding an edge index could be difficult.

I can think of two approaches

use a loop afterwards, I think you could make that to be expected O(log(n)) using randomness as you could say the final loop would be expected constant time O(1).
Use a second binary search (once you know the value) for the index closest to that number minus 0.000001 (in your list 4 cases this would always result in the second run searching for 3.99999, which would yield 15. Note: You should check in case the number (3.999999) was in the list and move right one place to get your value unless you can ensure a certain degree of rounding in the list. This would be 2*log(n) or O(log(n)).

If your list is long, I think the expected run time for option 2 would actually be longer than option 1 because 2*log(n) would be > log(n) + a constant unless you know there will be lots of duplicates.

Find leftest duplicate of binary search result

Tags:

javascript

arrays

algorithm

binary-search

Nick Bull

1 Answers

Loren

Recent Activity

Donate For Us

Find leftest duplicate of binary search result

Tags:

javascript

arrays

algorithm

binary-search

Nick Bull

1 Answers

Loren

Related questions

Recent Activity

Donate For Us