Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Binary Search Middle Value Logic

I've written binary search recursively with my own effort 💪💪. While calculating middle index, I did int mid = (low + high) / 2. I googled recursive version of binary search and found that the calculation is a bug. Because it may overflow if (low + high) is greater than maximum value of integer. Two types solutions there are.

  • int mid = low + (high - low) / 2
  • int mid = (high + low) >>> 1 (unsigned right shift, it does division by 2)

Both of them work like a charm. However, I wonder for first one how it works. Yes, it works but what the logic that the people think to find it is. For the second, how it works. Isn't there possibility to overflow for (high + low) still? I mean my attempt is called as buggy because of overflow but the second one also does (high + low). 🤔🤔🤔

like image 821
snr Avatar asked Mar 06 '23 05:03

snr


2 Answers

In (high + low) >>> 1, there can still be signed wraparound, but the >>> operator treats its left operand as unsigned so signed wraparound is not relevant. This approach works because there won't be unsigned wraparound: high and low are (as signed integers) non-negative, so as unsigned integers they cannot be large enough to cause unsigned wraparound.

That adding two signed-non-negative integers never suffers from unsigned wraparound can be seen by adding the largest possible numbers: 2k-1-1 + 2k-1-1 (where k is the size in bits of the integer type, commonly 32), which adds up to 2k-2. That doesn't wrap yet: the highest representable number is 2k-1.

So it is not the "overflow" in high + low that really causes the problem, at least in a reasonable language where signed arithmetic is safe, such as Java. It is the subsequent treatment of the result of high + low as a signed integer by a hypothetical / 2 that follows it that causes trouble.

like image 144
harold Avatar answered Mar 17 '23 03:03

harold


The reason the first one works is

  • it is mathematically identical: l + (h - l)/2 = l + h/2 - l/2 = l/2 + h/2 = (l + h)/2, and
  • because of the precedence of operations: first the parenthesis, then the division by 2, then the addition.

Funny enough though, this "solution" may still overflow on a large negative low so it is not really a solution to your original problem.

You could do something along the lines of low/2 + high/2 to avoid overflow, but you will need to explicitly write out computations for even-even (the above), even-odd, odd-even and odd-odd to remain in the int space. While doing so you can utilize >>> to avoid the actual division.

like image 22
Oleg Sklyar Avatar answered Mar 17 '23 03:03

Oleg Sklyar