I have a simple function testing if two arrays are each others inverse. They are seemingly identical, except for a tmp
variable. One works the other doesn't. I can't for the life of me figure out why the compiler would optimize this out - if it indeed is an optimization problem (my compiler is IAR Workbench v4.30.1). Here's my code:
// this works as expected uint8 verifyInverseBuffer(uint8 *buf, uint8 *bufi, uint32 len) { uint8 tmp; for (uint32 i = 0; i < len; i++) { tmp = ~bufi[i]; if (buf[i] != tmp) { return 0; } } return 1; } // this does NOT work as expected (I only removed the tmp!) uint8 verifyInverseBuffer(uint8 *buf, uint8 *bufi, uint32 len) { for (uint32 i = 0; i < len; i++) { if (buf[i] != (~bufi[i])) { return 0; } } return 1; }
The first version of the code works, the second does not. Can anyone figure out why? Or come with some tests to probe what is wrong?
On simple low-cost processors, typically, bitwise operations are substantially faster than division, several times faster than multiplication, and sometimes significantly faster than addition.
This means they look directly at the binary digits or bits of an integer. This all sounds scary, but in truth bitwise operators are quite easy to use. They are really fast as compared to other programming techniques because these operations are carried out directly by the processor.
In the multiplication case, the normal version actually performs about 20% faster than the bitwise equivalent. On the other hand, division is nearly twice as fast with the bitwise shift and the bitwise modulus (really just an & ) is more than three times faster!
Compiler optimization is generally implemented using a sequence of optimizing transformations, algorithms which take a program and transform it to produce a semantically equivalent output program that uses fewer resources or executes faster.
What you see happening is a result of the rules of integer promotions. Anytime a variable smaller than an int
is used in an expression the value is promoted to type int
.
Suppose bufi[i]
contains the value 255. The hex representation of this is 0xFF
. This value is then operand of the ~
operator. So the value will first be promoted to int
which (assuming it is 32 bit) will have the value 0x000000FF
, and applying ~
to this gives you 0xFFFFFF00
. You then compare this value with buf[i]
which is of type uint8_t
. The value 0xFFFFFF00
is outside of this range so the comparison will always be false.
If you assign the result of the ~
back to a variable of type uint8_t
, the value 0xFFFFFF00
is converted to 0x00
. It is this converted value that is then compared against buf[i]
.
So the behavior you see is not the result of an optimization but the rules of the language. Using a temp variable as you are is one way to address this issue. You could also cast the result to uint8
:
if(buf[i] != (uint8)(~bufi[i]))
Or mask out all but the lowest order byte:
if(buf[i] != (~bufi[i] & 0xff))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With