On ARM Cortex-M3, for example, a CLZ instruction is present; it counts the leading zeros of a 32-bit integer. If the integer is zero, the result is 32.
In gcc, on the other hand, I can use a __builtin_clz function. However, according to gcc documentation:
Built-in Function: int __builtin_clz (unsigned int x).
Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.
So if I use this builtin, should I handle zero manually? Or is it guaranteed to be compiled to the CLZ instruction if such an instruction is present on the target machine?
Quotes from the gcc documentation are highly appreciated!
The builtins provide the functions described for them. They are not guaranteed to compile to specific instructions.
Note that the GCC documentation says these are functions: “GCC provides a large number of built-in functions.” It does not tell you they generate specific instructions. For __builtin_clz, it says “Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.” The documentation here simply means what it says: __builtin_clz is a function that returns the number of leading 0-bits in x, starting at the most significant bit position, and, if x is 0, the result is undefined. There is no statement in the documentation that __builtin_clz provides a count-leading-zeros instruction, so you cannot expect that it provides a count-leading-zeros instruction.
The compiler is free to implement builtins in any way that provides the specified functions. Since the specified function has undefined behavior for zero, you cannot expect the compiler will provide your desired behavior for zero, via a clz instruction or otherwise.
We can expect that optimization will generally use the obvious instructions when suitable. But the compiler may also combine the builtin functions with other code (possibly resulting in a code sequence in which the usual instruction is not needed or another instruction is better), evaluate constant expressions at compile-time, and make other non-obvious optimizations. Consider that, if the compiler recognizes a code path in which the argument to __builtin_clz is zero, it may replace that code path with anything, including deleting it entirely, since its behavior is undefined.
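To illustrate that last point, here is a hedged sketch with two hypothetical functions (both names are made up for this example). In the first, the zero check comes after the builtin call, so the compiler is allowed to assume x is nonzero by then; whether a given GCC version actually drops the check is not something I can promise. In the second, the check comes first, so zero never reaches the builtin.

```c
/* Hypothetical example: the zero check is too late. Because
 * __builtin_clz(0) is undefined, the compiler may assume x != 0
 * after the call and is permitted to remove the branch below. */
int clz_check_after(unsigned int x)
{
    int n = __builtin_clz(x);
    if (x == 0)
        return -1;   /* may be optimized away */
    return n;
}

/* Hypothetical example: the zero check comes first, so the builtin
 * is only ever evaluated with a nonzero argument. */
int clz_check_before(unsigned int x)
{
    if (x == 0)
        return -1;   /* sentinel chosen for this sketch */
    return __builtin_clz(x);
}
```

Only the second form has defined behavior for every input.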