I have some code like the following code block (I am not allowed to post the original code) inside a .cpp
file that I think is being compiled by clang++
(Ubuntu clang version 3.5.2-3ubuntu1 (tags/RELEASE_352/final) (based on LLVM 3.5.2)
).
It looks like C
code, because we are using GoogleTest
for testing our C
code. Anyways:
size_t const SHIFT = 4;
uint8_t var, var2;
/* Omitted: Code that sets var to, say 00011000 (base 2) */
var2 = var;
var = var << SHIFT >> SHIFT; // [1] result is 00011000 (base 2) (tested with printf)
var2 = var2 << SHIFT;
var2 = var2 >> SHIFT; // [2] result is 00001000 (base 2) (tested with printf)
Now, why does comment [1]
hold true? I was assuming that the corresponding line would result in the top 4 bits being zeroed out. But I found out that this isn't true; the program simply restores the original value.
Is this some language defined behavior, or is
clang
compiling out a supposedly useless bit shift?
(I checked associativity (using this table on cppreference.com, assuming that associativity/precedence of basic operators won't differ between versions of C++
and probably not between C++
and C
either, at least not in 'current versions') and it seems like the RHS expression at [1]
should really yield the same result as the two following statements)
Compiler optimization is generally implemented using a sequence of optimizing transformations, algorithms which take a program and transform it to produce a semantically equivalent output program that uses fewer resources or executes faster.
Compilers are free to optimize code so long as they can guarantee the semantics of the code are not changed. I would suggestion starting at the Compiler optimization wikipedia page as there are many different kinds of optimization that are performed at many different stages.
Yes, compilers generate the most optimal code for such simplistic calculations.
The code optimization in the synthesis phase is a program transformation technique, which tries to improve the intermediate code by making it consume fewer resources (i.e. CPU, Memory) so that faster-running machine code will result.
What you are seeing is a result of integer promotions. Any value with a type of lower rank than int
, when used in an expression, is promoted to int
.
This is detail in section 6.3.1.1 of the C standard
2 The following may be used in an expression wherever an
int
orunsigned int
may be used:
- An object or expression with an integer type (other than
int
orunsigned int
) whose integer conversion rank is less than or equal to the rank ofint
andunsigned int
.- A bit-field of type
_Bool
,int
,signed int
, orunsigned int
.If an
int
can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to anint
; otherwise, it is converted to anunsigned int
. These are called the integer promotions. All other types are unchanged by the integer promotions.
In this case:
var = var << SHIFT >> SHIFT;
var
is first promoted to int
. This type is at least 16 bits wide, and most likely 32 bits wide. So the value being operated on is 0x00000018
. A left shift of 4 results in 0x00000180
, and subsequently right shifting results in 0x00000018
.
The result is then stored in a uint_8
. Since the value fits in a variable of this type, no conversion is required and 0x18
is stored.
The expression:
var = var << SHIFT >> SHIFT;
is not semantically equivelent to
var = var << SHIFT ;
var = var >> SHIFT ;
That would require:
var = (uint8_t)(var << SHIFT) >> SHIFT ;
which illustrates the effective difference between the two - the observed behaviour does not require optimisation, rather it is required by language definition with respect to type promotion rules in expressions.
That said, it is also entirely possible that a compiler could optimise out the shifts. Optimisation may not change the result where the behaviour is defined as it is in this case.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With