this is sort of a follow up on some previous questions on bit manipulation. I modified the code from this site to enumerate strings with K of N bits set (x is the current int64_t
with K bits set, and at the end of this code it is the lexicographically next integer with K bits set):
int64_t b, t, c, m, r,z;
b = x & -x;
t = x + b;
c = x^t;
// was m = (c >> 2)/b per link
z = __builtin_ctz(x);
m = c >> 2+z;
x = t|m;
The modification using __builtin_ctz()
works fine as long as the least significant one bit is in the lower DWORD of x, but if is not, it totally breaks. This can be seen with the following code:
for(int i=0; i<64; i++) printf("i=%i, ctz=%i\n", i, __builtin_ctz(1UL << i));
which prints for GCC version 4.4.7:
i=0, ctz=0
i=1, ctz=1
i=2, ctz=2
...
i=30, ctz=30
i=31, ctz=31
i=32, ctz=0
or for icc version 14.0.0 something similar (except i>32 gives random results, not zero). Using division instead of shifting by 2+z works in both cases, but it's about 5x slower on my Sandy Bridge Xeon. Are there other intrinsics I should be using for 64-bit, or will I have to do some inline assembler?
Thanks!
__builtin_ctz
takes arguments of type unsigned int
, which is 32-bits on most platforms.
If long
is 64 bits, you can use __builtin_ctzl
which takes unsigned long
. Or you can use __builtin_ctzll
which takes unsigned long long
- In this case you should use 1ULL << i
instead of 1UL << i
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With