Using unsigned int
s it's possible to round to a power of 2 like this:
unsigned int power_of_2_max_u(unsigned int x)
{
x -= 1;
x = x | (x >> 1);
x = x | (x >> 2);
x = x | (x >> 4);
x = x | (x >> 8);
x = x | (x >>16);
return x + 1;
}
or…
int is_power_of_2_i(int n)
{
return (n & (n - 1)) == 0;
}
int power_of_2_max_i(int n)
{
if (is_power_of_2_i(n))
return n;
do {
n = n & (n - 1);
} while (!is_power_of_2_i(n));
return n * 2;
}
However when the value is a constant, it should be possible to use the preprocessor to avoid having to calculate the value every time. e. g.:
i = power_of_2_max_u(sizeof(SomeStruct));
This is a macro which is equivalent to power_of_2_max_u
.
#define POW2_CEIL(v) (1 + \
(((((((((v) - 1) | (((v) - 1) >> 0x10) | \
(((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) | \
((((v) - 1) | (((v) - 1) >> 0x10) | \
(((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) >> 0x04))) | \
((((((v) - 1) | (((v) - 1) >> 0x10) | \
(((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) | \
((((v) - 1) | (((v) - 1) >> 0x10) | \
(((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) >> 0x04))) >> 0x02))) | \
((((((((v) - 1) | (((v) - 1) >> 0x10) | \
(((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) | \
((((v) - 1) | (((v) - 1) >> 0x10) | \
(((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) >> 0x04))) | \
((((((v) - 1) | (((v) - 1) >> 0x10) | \
(((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) | \
((((v) - 1) | (((v) - 1) >> 0x10) | \
(((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) >> 0x04))) >> 0x02))) >> 0x01))))
Generated by this Python-3 script.
import math
BITS = 32
data = (["(v - 1)"] +
["(v | (v >> 0x%02x))" % i
for i in reversed([2 ** j
for j in range(int(math.log2(BITS)))])])
macro = "1 + v"
for l in reversed(data):
macro = macro.replace('v', '(%s)' % l)
print("#define POW2_CEIL(v)", macro)
Using statement expressions it can be made a little nicer, but this relies on a GCC extension.
#define POW2_CEIL(x_) ({ \
unsigned int x = x_; \
x -= 1; \
x = x | (x >> 1); \
x = x | (x >> 2); \
x = x | (x >> 4); \
x = x | (x >> 8); \
x = x | (x >>16); \
x + 1; })
While this is a fairly ugly macro, I double checked and the compiler does reduce this correctly to a constant (as it should) so unsigned int a = 8;
is exactly equivalent to unsigned int a = POW2_CEIL(7);
,
however I was interested to know if there was a better way to do this (possibly not limited to an int range).
However when the value is a constant, it should be possible to use the preprocessor to avoid having to calculate the value every time.
Readability >>> performance. First of all, ask yourself the question" "is this a real bottleneck in my code?"
If the answer is "no", then don't do anything; move on without thinking about micro-"optimizing" your code prematurely.
Even if this is an important part of your code, consider that your functions are pure. Their return value only depends on their arguments, since they perform simple computations (a mapping, basically). Chances are, any decent optimizing compiler will constant-fold your functions when called with expressions known at compile-time.
For example, the following code…
#include <stdio.h>
static unsigned power_of_2_max_u(unsigned x)
{
x -= 1;
x = x | (x >> 1);
x = x | (x >> 2);
x = x | (x >> 4);
x = x | (x >> 8);
x = x | (x >>16);
return x + 1;
}
struct Foo {
int x;
char c;
int i;
};
int main()
{
printf("%zu %u\n", sizeof(struct Foo), power_of_2_max_u(sizeof(struct Foo)));
return 0;
}
…is translated to the following assembly by clang -O2 -flto
:
/tmp/constfold:
(__TEXT,__text) section
_main:
0000000100000f50 pushq %rbp
0000000100000f51 movq %rsp, %rbp
0000000100000f54 leaq 0x37(%rip), %rdi ## literal pool for: "%zu %u\n"
0000000100000f5b movl $0xc, %esi ## *** sizeof(struct Foo) == 12
0000000100000f60 movl $0x10, %edx ## *** rounded up == 16
0000000100000f65 xorl %eax, %eax
0000000100000f67 callq 0x100000f70 ## symbol stub for: _printf
0000000100000f6c xorl %eax, %eax
0000000100000f6e popq %rbp
0000000100000f6f retq
So, you didn't need to do anything for the compiler to constant-fold your function.
Only consider re-writing your functions as one of the macros above (or using non-portable compiler intrinsics) if you can observe that they are not being constant-folded.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With