Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Rounding up to powers of 2 (with preprocessor constants)

Tags:

c

macros

Using unsigned ints it's possible to round to a power of 2 like this:

unsigned int power_of_2_max_u(unsigned int x)
{
    x -= 1;
    x = x | (x >> 1);
    x = x | (x >> 2);
    x = x | (x >> 4);
    x = x | (x >> 8);
    x = x | (x >>16);
    return x + 1;
}

or…

int is_power_of_2_i(int n)
{
    return (n & (n - 1)) == 0;
}

int power_of_2_max_i(int n)
{
    if (is_power_of_2_i(n))
        return n;

    do {
        n = n & (n - 1);
    } while (!is_power_of_2_i(n));

    return n * 2;
}

However when the value is a constant, it should be possible to use the preprocessor to avoid having to calculate the value every time. e. g.:

i = power_of_2_max_u(sizeof(SomeStruct));

This is a macro which is equivalent to power_of_2_max_u.

#define POW2_CEIL(v) (1 + \
(((((((((v) - 1) | (((v) - 1) >> 0x10) | \
      (((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) | \
     ((((v) - 1) | (((v) - 1) >> 0x10) | \
      (((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) >> 0x04))) | \
   ((((((v) - 1) | (((v) - 1) >> 0x10) | \
      (((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) | \
     ((((v) - 1) | (((v) - 1) >> 0x10) | \
      (((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) >> 0x04))) >> 0x02))) | \
 ((((((((v) - 1) | (((v) - 1) >> 0x10) | \
      (((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) | \
     ((((v) - 1) | (((v) - 1) >> 0x10) | \
      (((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) >> 0x04))) | \
   ((((((v) - 1) | (((v) - 1) >> 0x10) | \
      (((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) | \
     ((((v) - 1) | (((v) - 1) >> 0x10) | \
      (((v) - 1) | (((v) - 1) >> 0x10) >> 0x08)) >> 0x04))) >> 0x02))) >> 0x01))))

Generated by this Python-3 script.

import math
BITS = 32
data = (["(v - 1)"] +
        ["(v | (v >> 0x%02x))" % i
         for i in reversed([2 ** j
         for j in range(int(math.log2(BITS)))])])
macro = "1 + v"
for l in reversed(data):
    macro = macro.replace('v', '(%s)' % l)
print("#define POW2_CEIL(v)", macro) 

Using statement expressions it can be made a little nicer, but this relies on a GCC extension.

#define POW2_CEIL(x_) ({      \
    unsigned int x = x_; \
    x -= 1;              \
    x = x | (x >> 1);    \
    x = x | (x >> 2);    \
    x = x | (x >> 4);    \
    x = x | (x >> 8);    \
    x = x | (x >>16);    \
    x + 1; })

While this is a fairly ugly macro, I double checked and the compiler does reduce this correctly to a constant (as it should) so unsigned int a = 8; is exactly equivalent to unsigned int a = POW2_CEIL(7);, however I was interested to know if there was a better way to do this (possibly not limited to an int range).

like image 790
ideasman42 Avatar asked Oct 20 '22 10:10

ideasman42


1 Answers

However when the value is a constant, it should be possible to use the preprocessor to avoid having to calculate the value every time.

Readability >>> performance. First of all, ask yourself the question" "is this a real bottleneck in my code?"

If the answer is "no", then don't do anything; move on without thinking about micro-"optimizing" your code prematurely.

Even if this is an important part of your code, consider that your functions are pure. Their return value only depends on their arguments, since they perform simple computations (a mapping, basically). Chances are, any decent optimizing compiler will constant-fold your functions when called with expressions known at compile-time.

For example, the following code…

#include <stdio.h>

static unsigned power_of_2_max_u(unsigned x)
{
    x -= 1;
    x = x | (x >> 1);
    x = x | (x >> 2);
    x = x | (x >> 4);
    x = x | (x >> 8);
    x = x | (x >>16);
    return x + 1;
}

struct Foo {
    int x;
    char c;
    int i;
};

int main()
{
    printf("%zu %u\n", sizeof(struct Foo), power_of_2_max_u(sizeof(struct Foo)));

    return 0;
}

…is translated to the following assembly by clang -O2 -flto:

/tmp/constfold:
(__TEXT,__text) section
_main:
0000000100000f50    pushq   %rbp
0000000100000f51    movq    %rsp, %rbp
0000000100000f54    leaq    0x37(%rip), %rdi  ## literal pool for: "%zu %u\n"
0000000100000f5b    movl    $0xc, %esi        ## *** sizeof(struct Foo) == 12
0000000100000f60    movl    $0x10, %edx       ## *** rounded up == 16
0000000100000f65    xorl    %eax, %eax
0000000100000f67    callq   0x100000f70       ## symbol stub for: _printf
0000000100000f6c    xorl    %eax, %eax
0000000100000f6e    popq    %rbp
0000000100000f6f    retq

So, you didn't need to do anything for the compiler to constant-fold your function.

Only consider re-writing your functions as one of the macros above (or using non-portable compiler intrinsics) if you can observe that they are not being constant-folded.

like image 153
The Paramagnetic Croissant Avatar answered Oct 23 '22 02:10

The Paramagnetic Croissant