I have the following C/C++ code snippet:
#define ARRAY_LENGTH 666
int g_sum = 0;
extern int *g_ptrArray[ ARRAY_LENGTH ];
void test()
{
    unsigned int idx = 0;
    // either enable or disable the check "idx < ARRAY_LENGTH" in the while loop
    while( g_ptrArray[ idx ] != nullptr /* && idx < ARRAY_LENGTH */ )
    {
        g_sum += *g_ptrArray[ idx ];
        ++idx;
    }
    return;
}
When I compile the above code with GCC 12.2.0 and the option -Os for both cases:
    g_ptrArray[ idx ] != nullptr
    g_ptrArray[ idx ] != nullptr && idx < ARRAY_LENGTH
I get the following assembly:
test():
        ldr r2, .L4
        ldr r1, .L4+4
.L2:
        ldr r3, [r2], #4
        cbnz r3, .L3
        bx lr
.L3:
        ldr r3, [r3]
        ldr r0, [r1]
        add r3, r3, r0
        str r3, [r1]
        b .L2
.L4:
        .word g_ptrArray
        .word .LANCHOR0
g_sum:
        .space 4
As you can see, the assembly does NOT check the variable idx against the value ARRAY_LENGTH at all.
How is that possible?
How can the compiler generate exactly the same assembly for both cases and ignore the idx < ARRAY_LENGTH condition when it is present in the code? Please explain the rule or procedure by which the compiler concludes that it can remove the condition completely.
The output assembly shown in Compiler Explorer is identical in both cases, that is, when the while condition is g_ptrArray[ idx ] != nullptr and when it is g_ptrArray[ idx ] != nullptr && idx < ARRAY_LENGTH.
NOTE: If I swap the order of the conditions to idx < ARRAY_LENGTH && g_ptrArray[ idx ] != nullptr, the output assembly does contain the check of the value of idx, as you can see here: https://godbolt.org/z/fvbsTfr9P.
Accessing an array out of bounds is undefined behavior, so the compiler may assume that it never happens in the LHS of the && expression. The LHS, g_ptrArray[ idx ], is evaluated first, so by the time the RHS is reached the compiler can deduce that idx is less than ARRAY_LENGTH, the length of the array (otherwise the LHS would already have invoked UB). The RHS condition is therefore always true, and the optimizer removes it. Hence the result you see.
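The reasoning above can be sketched as two functions that the compiler is allowed to treat as equivalent. This is only an illustration of the as-if argument, not a description of GCC's actual optimization passes; the names kLength, sum_as_written and sum_as_optimized are hypothetical, and the array is shortened to 4 elements for readability.

```cpp
#include <cstddef>

// Hypothetical small length for illustration; the question uses 666.
constexpr std::size_t kLength = 4;

// The loop as written in the question: the element is read BEFORE the
// index is compared against the array length.
int sum_as_written(int *(&arr)[kLength])
{
    int sum = 0;
    unsigned int idx = 0;
    // Reading arr[idx] is only defined when idx < kLength. Once that read
    // has happened, the compiler may assume idx < kLength, so the second
    // operand of && is always true at that point.
    while (arr[idx] != nullptr && idx < kLength)
    {
        sum += *arr[idx];
        ++idx;
    }
    return sum;
}

// What the optimizer may reduce it to: the always-true operand is dropped.
int sum_as_optimized(int *(&arr)[kLength])
{
    int sum = 0;
    unsigned int idx = 0;
    while (arr[idx] != nullptr)
    {
        sum += *arr[idx];
        ++idx;
    }
    return sum;
}
```

For every input on which the first function has defined behavior (a null pointer appears before the end of the array), both functions return the same value. For any other input the first function already has undefined behavior, so the compiler owes it no particular result.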
The correct check would be idx < ARRAY_LENGTH && g_ptrArray[idx] != nullptr. This avoids any possibility of undefined behavior on the RHS, since the LHS has to be evaluated first and the RHS is not evaluated unless the LHS is true (in C and C++ the && operator is guaranteed to behave this way).
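A minimal sketch of the corrected loop, with the bounds check first. Here g_ptrArray is defined locally (the question only declares it extern) so that the snippet is self-contained, and the function name sum_checked is hypothetical.

```cpp
#define ARRAY_LENGTH 666

int g_sum = 0;
// Defined here for the sketch; in the question it is only declared extern.
int *g_ptrArray[ ARRAY_LENGTH ];

void sum_checked()
{
    unsigned int idx = 0;
    // The index is validated first. Because && short-circuits,
    // g_ptrArray[ idx ] is never read once idx reaches ARRAY_LENGTH.
    while( idx < ARRAY_LENGTH && g_ptrArray[ idx ] != nullptr )
    {
        g_sum += *g_ptrArray[ idx ];
        ++idx;
    }
}
```

With this ordering the loop stays within bounds even when the array contains no null terminator at all, which is the case the original ordering could not handle.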
Even potential undefined behavior can do nasty things like that!