Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I optimize binary search?

I have been thinking on how my binary search can be optimized. The code follows. What I have done so far:

All I could think of was in terms of handling different inputs.

Optimized worst case(one of the worst cases) when element being searched is out of bounds, i.e. searching a number lower than the lowest or higher than the highest. This saves O(logn) comparisions when it is a guarantee it won't be found in the input.

int input[15] = {1,2,2,3,4,5,5,5,5,6,7,8,8,9,10};
/*
 * Returns index p if a[p] = value else -ve int.
 * value is value being searched in input.
 */
int
binary_search (int * input, int low, int high, int value)
{
    int mid = 0;
    if (!input) return -1;
    if (low > high) return -1;

    /* optimize worst case: value not in input */
    if ((value < input[low]) || (value > input[high]))
    { return -2; }

    mid = (low + high)/2;

    if (input[mid] == value) {
        return mid;
    }
    if (input[mid] > value) {
        return binary_search(input, low, mid -1, value);
    } else {
        return binary_search(input, mid+1, high, value);
    }
}

Another worst case I can think of is when value being searched is next to the mid of the input or the first element. I think more generalized is the lower / higher bounds of input to each call of binary_search. This also requires the algorithm to take exact logn comparisons.

Any other suggestions on what other areas I can focus on improving. I don't need the code but a direction would be helpful. Thanks.

like image 261
SeattleOrBayArea Avatar asked Jun 11 '26 14:06

SeattleOrBayArea


2 Answers

Jon Bentley's Programming Pearls has a nice chapter on optimizing binary search. See Chapter 4 in http://www.it.iitb.ac.in/~deepak/deepak/placement/Programming_pearls.pdf

One of the variants is amazingly efficient (see page 87 in the chapter on "Code Tuning"):

# Search a 1000-element array
l = 0
if x[512] < t: l = 1000 + 1 - 512
if x[l+256] < t: l += 256
if x[l+128] < t: l += 128
if x[l+64] < t: l += 64
if x[l+32] < t: l += 32
if x[l+16] < t: l += 16
if x[l+8] < t: l += 8
if x[l+4] < t: l += 4
if x[l+2] < t: l += 2
if x[l+1] < t: l += 1
p = l + 1
if p > 1000 or x[p] != t:
    p = 0                   # Not Found
like image 111
Raymond Hettinger Avatar answered Jun 13 '26 15:06

Raymond Hettinger


An optimization of the sort you're considering -- handling a special case -- will inevitably make you spend more time in the OTHER cases. Your "worst case" optimizations have made them into the best cases, but at the cost of creating other worst cases. And in this instance you've made two cases into "best cases", and n/2 cases into "worst cases" which previously were not. You've slowed everything else down.

(Especially in this instance, because you're checking for too low / too high on every single recursion.)

If you actually expect -- in your particular use case -- that the search will mostly be searching for values that are too low or too high, this might be a good idea. As a general rule of thumb, though, the fastest implementation is the simplest one.

like image 22
Sneftel Avatar answered Jun 13 '26 16:06

Sneftel