There are three reasons.
First of all, start + (end - start) / 2 works even if you are using pointers, as long as end - start doesn't overflow [1].
int *start = ..., *end = ...;
int *mid = start + (end - start) / 2; // works as expected
int *mid = (start + end) / 2; // type error, won't compile
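To make the pointer case concrete, here is a minimal sketch of a binary search that works directly on pointers. The function name find_sorted and the half-open range convention [start, end) are my own choices for illustration, not something from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: binary search over the half-open range [start, end)
   of a sorted int array. Returns a pointer to a matching element, or NULL
   if the key is not present. Both pointers must refer to the same array. */
const int *find_sorted(const int *start, const int *end, int key)
{
    assert(start <= end);
    while (start < end) {
        /* end - start is a ptrdiff_t, so this is pointer + integer,
           which is well defined. (start + end) would not compile. */
        const int *mid = start + (end - start) / 2;
        if (*mid < key)
            start = mid + 1;   /* key can only be to the right of mid */
        else if (*mid > key)
            end = mid;         /* key can only be to the left of mid */
        else
            return mid;
    }
    return NULL;
}
```

For an array int a[] = {1, 3, 5, 7, 9}, calling find_sorted(a, a + 5, 7) returns a + 3, and searching for 4 returns NULL.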
Second of all, start + (end - start) / 2 won't overflow if start and end are large positive numbers. With signed operands, overflow is undefined:
int start = 0x7ffffffe, end = 0x7fffffff;
int mid = start + (end - start) / 2; // works as expected
int mid = (start + end) / 2; // overflow... undefined
(Note that end - start may overflow, but only if start < 0 or end < 0.)
Or with unsigned arithmetic, overflow is defined but gives you the wrong answer. However, for unsigned operands, start + (end - start) / 2 will never overflow as long as end >= start.
unsigned start = 0xfffffffeu, end = 0xffffffffu;
unsigned mid = start + (end - start) / 2; // works as expected
unsigned mid = (start + end) / 2; // mid = 0x7ffffffe
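The unsigned case can be checked safely, since wraparound is defined there. The sketch below uses uint32_t so the width is pinned to 32 bits regardless of platform; the helper names mid_naive and mid_safe are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the naive formula. The cast forces the sum to be
   reduced modulo 2^32 even on a platform where int is wider than 32 bits. */
uint32_t mid_naive(uint32_t start, uint32_t end)
{
    return (uint32_t)(start + end) / 2;
}

/* Hypothetical helper: the overflow-free formula. Requires start <= end,
   otherwise end - start itself wraps around. */
uint32_t mid_safe(uint32_t start, uint32_t end)
{
    assert(start <= end);
    return start + (end - start) / 2;
}
```

With start = 0xfffffffeu and end = 0xffffffffu, mid_naive returns 0x7ffffffe (outside the range), while mid_safe returns 0xfffffffe.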
Finally, you often want to round towards the start element.
int start = -3, end = 0;
int mid = start + (end - start) / 2; // -2, closer to start
int mid = (start + end) / 2; // -1, surprise!
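The rounding difference matters most for half-open ranges [start, end), where mid must never equal end. A small sketch, with function names that are purely illustrative, and assuming C99 or later, where integer division truncates toward zero:

```c
#include <assert.h>

/* Hypothetical helper: rounds toward start. For start < end the result
   always lies in [start, end). Requires start <= end and that
   end - start does not overflow. */
int mid_toward_start(int start, int end)
{
    assert(start <= end);
    return start + (end - start) / 2;
}

/* Hypothetical helper: rounds toward zero, because that is how C
   truncates the division. For negative ranges this means rounding
   toward end. Only safe when start + end does not overflow. */
int mid_toward_zero(int start, int end)
{
    return (start + end) / 2;
}
```

For the range [-3, -2), mid_toward_start(-3, -2) gives -3, which is inside the range, while mid_toward_zero(-3, -2) gives -2, which is end itself. For non-negative inputs the two agree.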
[1] According to the C standard, if the result of pointer subtraction is not representable as a ptrdiff_t, then the behavior is undefined. However, in practice, this requires allocating a char array using at least half the entire address space.
We can take a simple example to demonstrate this. Suppose that in a certain large array, we are trying to find the midpoint of the range [1000, INT_MAX]. INT_MAX is the largest value the int data type can store. Adding even 1 to it overflows: formally the behavior is undefined in C, but on typical two's-complement machines the result wraps around and becomes negative.
Here, start = 1000 and end = INT_MAX.
Using the formula (start + end)/2, the midpoint will be
(1000 + INT_MAX)/2
= -(INT_MAX - 998)/2 (once the sum has wrapped around)
which is negative and may cause a segmentation fault if we try to index using this value.
But using the formula start + (end - start)/2, we get:
(1000 + (INT_MAX - 1000)/2)
= (1000 + INT_MAX/2 - 500)
= (INT_MAX/2 + 500)
which will not overflow.
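The arithmetic above can be checked without triggering the overflow itself. In this sketch, midpoint is the safe formula, and wrapped_sum computes in long long what a wrapping int addition would produce. Both names are invented for this example, and the wraparound model assumes a two's-complement int.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: safe midpoint for non-negative indices. */
int midpoint(int start, int end)
{
    assert(0 <= start && start <= end);
    return start + (end - start) / 2;
}

/* Hypothetical helper: the value start + end would wrap to on a
   two's-complement machine, for non-negative inputs. Computed in
   long long so this function itself never overflows. */
long long wrapped_sum(int start, int end)
{
    long long wide = (long long)start + end;
    if (wide > INT_MAX)
        wide -= 2 * ((long long)INT_MAX + 1);  /* subtract 2^N */
    return wide;
}
```

Here midpoint(1000, INT_MAX) equals INT_MAX / 2 + 500, a valid positive index, whereas wrapped_sum(1000, INT_MAX) equals 998 - INT_MAX, which is negative before it is even halved.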
To add to what others have already said, the first one explains its meaning more clearly to those less mathematically minded:
mid = start + (end - start) / 2
reads as:
mid equals start plus half of the length.
whereas:
mid = (start + end) / 2
reads as:
mid equals half of start plus end
Which does not seem as clear as the first, at least when expressed like that.
As Kos pointed out, it can also be read as:
mid equals the average of start and end
Which is clearer but still not, at least in my opinion, as clear as the first.