Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do we use mid = low + (high – low)/2; but not mid = (low/2) +(high /2)?

In binary search, we use mid = low + (high – low)/2 instead of (low + high)/2 to avoid overflow, however, can't calculate low/2 and high/2 separately and then sum them up rather than low+(( high-low)/2)?

P.S. If low + (high – low)/2 is more efficient, then why is it so?

like image 970
ishan pandey Avatar asked Oct 27 '25 03:10

ishan pandey


2 Answers

Let's say both low and high are 3; then middle = 3/2 + 3/2 = 1+1 = 2, which actually is quite bad. :-)

The reason we don't use middle=high/2+low/2 is that it gives us the wrong result

like image 148
Dan Byström Avatar answered Oct 29 '25 10:10

Dan Byström


If you do (low + high)/2, there is a possibility of integer overflow when low and high are large integers. This can lead to incorrect results or undefined behavior. By subtracting low from high first, we prevent overflow since the difference is likely to be smaller.

Suppose low = 2,147,483,640 and high = 2,147,483,645 (just below the maximum value of a 32-bit signed integer, which is 2,147,483,647).

If we use the expression (low + high)/2 directly, it would look like this:

mid = (2147483640 + 2147483645)/2
    = 4294967285/2
    = 2147483642

This result seems correct. However, let's see what happens when low and high are increased by 6:

Suppose low = 2,147,483,646 and high = 2,147,483,651.

If we use the same expression (low + high)/2, it would be:

mid = (2147483646 + 2147483651)/2
    = 4294967297/2

Now, 4294967297 exceeds the maximum value that can be stored in a 32-bit signed integer, causing an integer overflow. This results in undefined behavior. We won't get the expected midpoint value.

using (high - low)/2 to calculate the midpoint helps prevent such overflow because the difference between high and low is much smaller

like image 42
Ishita Pathak Avatar answered Oct 29 '25 08:10

Ishita Pathak