I was reading this question and it's accepted answer. I read the comments but I couldn't figure out the reason for the optimization produced.
Why does branching occur in assembly code when using the following code?
x >= start && x <= end
EDIT:
For clarity, I want to understand the reason of optimization produced by the accepted answer. That as I understand is the branching present in the assembly code produced by the compiler. I want to understand why is there a branch in the produced code.
Note that the linked question has a fundamentally different expression
x >= start && x++ <= end
It is fundamentally different because the second sub-expression here has a side effect. I'll explain.
Note that &&
is a short-circuit operator. This means that if x >= start
evaluates to false
, the machine can branch over evaluation of x <= end
.
More precisely, when the compiler emits instructions for x >= start && x <= end
, it can emit instructions to branch over x <= end
when x >= start
evaluates to false.
However, I stress the use of the word can in the above statements. The reason for this is because x <= end
has no side-effects and therefore it doesn't matter if the compiler branches over evaluation of it or not.
But, in the case that the second expression does have side effects the compiler must branch over it. Since &&
is a short-circuit operator, in a && b
, if b
has any side effects they must not be observed if a
evaluates to false
; this is a requirement of short-circuiting in C and most (if not all C-like languages).
So, in particular, when you look at
define POINT_IN_RANGE_AND_INCREMENT(p, range)
(p <= range.end && p++ >= range.start)
note that the second expression p++ >= range.start
has a side effect. Namely, (post-)incrementing p
by 1
. But that side effect can only be observed if p <= range.end
evaluates to true
. Thus, the compiler must branch over evaluation of p++ >= range.start
if p <= range.end
evaluates to false.
The reason this results in a branch is because for machine to evaluate that expression, it uses the fact that if p <= range.end
evaluates to false
, then it automatically knows the entire expression evaluates to false
and therefore it should not evaluate p++ >= range.start
because it has a side-effect. Thus, it must branch over evaluating the second part of the expression. So in the assembly:
Ltmp1301:
ldr r1, [sp, #172] @ 4-byte Reload
ldr r1, [r1]
cmp r0, r1
bls LBB44_32
mov r6, r0 @ if the result of the compare is false
b LBB44_33 @ branch over evaluating the second expression
@ to avoid the side effects of p++
LBB44_32:
ldr r1, [sp, #188] @ 4-byte Reload
adds r6, r0, #1
Ltmp1302:
ldr r1, [r1]
cmp r0, r1
bhs LBB44_36
Deep indebtedness to Oli Charlesworth for insightful comments.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With