I'm learning algorithms/big o and i was just curious about this.
The use of
mid = (low+high)/2;
in order to get the midpoint for a binary search algorithm is generally discouraged because of the possibility of an overflow error. Why would this cause an overflow error to occur, and how does
mid = low + (high-low)/2;
prevent this error?
Thanks.
Instead we can use the below formula to calculate the mid index: mid = start + (end-start)/2; Clearly “(end-start)/2” is less than “end” and this formula will not cause integer overflow for large values of start and end.
An integer overflow can cause the value to wrap and become negative, which violates the program's assumption and may lead to unexpected behavior (for example, 8-bit integer addition of 127 + 1 results in −128, a two's complement of 128).
So, the correct way of calculating mid in a binary search is mid = low + ((high - low) / 2) in order to handle overflow errors.
The midpoint is found by adding the lowest position to the highest position and dividing by 2. NOTE - if the answer is a decimal, round up. For example, 3.5 becomes 4. An alternative is to round down, but be consistent.
In the first case you calculate the value (low+high) which might be too huge to fit into an int, if low and high are both huge enough (say if both are equal to 2^30+1 /or even greater/). In the second case, you don't calculate (low+high), you do a small trick and you go through the expression (high-low) and that expression is much safer with respect to int overflow.
Still, if you don't have an array whose size is greater than 2^30 (which is quite a huge array anyway), I don't see how you can run into an int overflow even when using the first expression. So I would just use the first one in most cases and not worry.
Lets assume the worst situation for integer. We are sure that mid = (low + high)/2 can never give a overflow if we look at the bigger picture. Assuming that low = 2ˆ31 - 1 and high = 2ˆ31 - 1 ( the highest integer values) the total calculation will be [2*(2ˆ31-1)] / 2 = 2ˆ31 - 1. ( the biggest number int can hold)
However the way we calculate it matters.
If it is calculated using (low + high)/2 it will simply give an overflow in our situation because first it will try to sum low( 2ˆ31 -1 ) and high ( 2ˆ31 -1 ) and BAM. 2*(2ˆ31-1) is an overflow. It can not divide it by two because it can not store this value initially in an int.
If it is calculated using low + (high-low)/2 it can not give an overflow . Think it that way, low + (high - low)/2 = (low + x) where x = (high - low)/2. In order to create an overflow we should pick big (in our situation biggest for int) numbers for both sides. However if we pick low the biggest, then x will be the smallest( because we are subtracting low from high in x) and vice versa.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With