Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Intrinsic to count trailing zero bits in 64-bit integers?

this is sort of a follow up on some previous questions on bit manipulation. I modified the code from this site to enumerate strings with K of N bits set (x is the current int64_t with K bits set, and at the end of this code it is the lexicographically next integer with K bits set):

int64_t b, t, c, m, r,z;
b = x & -x;
t = x + b;
c = x^t;
// was m = (c >> 2)/b per link
z = __builtin_ctz(x);
m = c >> 2+z;
x = t|m;

The modification using __builtin_ctz() works fine as long as the least significant one bit is in the lower DWORD of x, but if is not, it totally breaks. This can be seen with the following code:

for(int i=0; i<64; i++) printf("i=%i, ctz=%i\n", i, __builtin_ctz(1UL << i));

which prints for GCC version 4.4.7:

i=0, ctz=0
i=1, ctz=1
i=2, ctz=2

...

i=30, ctz=30
i=31, ctz=31
i=32, ctz=0

or for icc version 14.0.0 something similar (except i>32 gives random results, not zero). Using division instead of shifting by 2+z works in both cases, but it's about 5x slower on my Sandy Bridge Xeon. Are there other intrinsics I should be using for 64-bit, or will I have to do some inline assembler?

Thanks!

like image 354
Andrew Avatar asked Dec 23 '13 16:12

Andrew


1 Answers

__builtin_ctz takes arguments of type unsigned int, which is 32-bits on most platforms.

If long is 64 bits, you can use __builtin_ctzl which takes unsigned long. Or you can use __builtin_ctzll which takes unsigned long long - In this case you should use 1ULL << i instead of 1UL << i.

like image 128
interjay Avatar answered Sep 29 '22 06:09

interjay