I have a 64-bit unsigned integer with exactly 1 bit set. I’d like to assign a value to each of the possible 64 values (in this case, the odd primes, so 0x1 corresponds to 3, 0x2 corresponds to 5, …, 0x8000000000000000 corresponds to 313). It seems like the best way would be to convert 1 → 0, 2 → 1, 4 → 2, 8 → 3, …, 263 → 63 and look up the values in an array. But even if that’s so, I’m not sure what the fastest way to get at the binary exponent is. And there may be more efficient ways, still. This operation will be used 1014 to 1016 times, so performance is a serious issue.

Finally an optimal solution. See the end of this section for what to do when the input is guaranteed to have exactly one non-zero bit: http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogDeBruijn Here's the code: <pre class="prettyprint"><code>static const int MultiplyDeBruijnBitPosition2[32] = { 0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9 }; r = MultiplyDeBruijnBitPosition2[(uint32_t)(v * 0x077CB531U) >> 27]; </code></pre> You may be able to adapt this to a direct multiplication-based algorithm for 64-bit inputs; otherwise, simply add one conditional to see if the bit is in the upper 32 positions or the lower 32 positions, then use the 32-bit algorithm here. Update: Here's at least one 64-bit version I just developed myself, but it uses division (actually modulo). <pre class="prettyprint"><code>r = Table[v%67]; </code></pre> For each power of 2, <code>v%67</code> has a distinct value, so just put your odd primes (or bit indices if you don't want the odd-prime thing) at the right positions in the table. 3 positions (0, 17, and 34) are not used, which might be convenient if you also want to accept all-bits-zero as an input. Update 2: 64-bit version. <pre class="prettyprint"><code>r = Table[(uint64_t)(val * 0x022fdd63cc95386dull) >> 58]; </code></pre> This is my original work, but I got the <code>B(2,6)</code> De Bruijn sequence from this chess site so I can't take credit for anything but figuring out what a De Bruijn sequence is and using Google. ;-) Some additional remarks on how this works: The magic number is a <code>B(2,6)</code> De Bruijn sequence. It has the property that, if you look at a 6-consecutive-bit window, you can obtain any six-bit value in that window by rotating the number appropriately, and that each possible six-bit value is obtained by exactly one rotation. We fix the window in question to be the top 6 bit positions, and choose a De Bruijn sequence with 0's in the top 6 bits. This makes it so we never have to deal with bit rotations, only shifts, since 0's will come into the bottom bits naturally (and we could never end up looking at more than 5 bits from the bottom in the top-6-bits window). Now, the input value of this function is a power of 2. So multiplying the De Bruijn sequence by the input value performs a bitshift by <code>log2(value)</code> bits. We now have in the upper 6 bits a number which uniquely determines how many bits we shifted by, and can use that as an index into a table to get the actual length of the shift. This same approach can be used for arbitrarily-large or arbitrarily-small integers, as long as you're willing to implement the multiplication. You simply have to find a <code>B(2,k)</code> De Bruijn sequence where <code>k</code> is the number of bits. The chess wiki link I provided above has De Bruijn sequences for values of <code>k</code> ranging from 1 to 6, and some quick Googling shows there are a few papers on optimal algorithms for generating them in the general case.

If performance is a serious issue, then you should use intrinsics/builtins to use CPU specific instructions, such as the ones found here for GCC: http://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Other-Builtins.html <ul> <li> Built-in function <code>int __builtin_ffs(unsigned int x)</code>. Returns one plus the index of the least significant 1-bit of x, or if x is zero, returns zero. </li> <li> Built-in function <code>int __builtin_clz(unsigned int x)</code>. Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined. </li> <li> Built-in function <code>int __builtin_ctz(unsigned int x)</code>. Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined. </li> </ul> Things like this are the core of many O(1) algorithms, such as kernel schedulers which need to find the first non-empty queue signified by an array of bits. Note: I’ve listed the <code>unsigned int</code> versions, but GCC has <code>unsigned long long</code> versions, as well.

Bit twiddling: which bit is set?

Q: What is bit twiddling function?

Bit-twiddling version: boolean isPowerOf2(int x) { return x > 0 && (x & (x - 1)) == 0; } This function returns true if x == 1, 2, 4, 8, 16, ..., 1073741824 but false otherwise. To start, the condition x > 0 excludes numbers that cannot possibly be a positive integral power of 2.

Tags:

c

bit-manipulation

I have a 64-bit unsigned integer with exactly 1 bit set. I’d like to assign a value to each of the possible 64 values (in this case, the odd primes, so 0x1 corresponds to 3, 0x2 corresponds to 5, …, 0x8000000000000000 corresponds to 313).

It seems like the best way would be to convert 1 → 0, 2 → 1, 4 → 2, 8 → 3, …, 2⁶³ → 63 and look up the values in an array. But even if that’s so, I’m not sure what the fastest way to get at the binary exponent is. And there may be more efficient ways, still.

This operation will be used 10¹⁴ to 10¹⁶ times, so performance is a serious issue.

682

asked Aug 12 '10 06:08

Charles

2 Answers

Finally an optimal solution. See the end of this section for what to do when the input is guaranteed to have exactly one non-zero bit: http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogDeBruijn

Here's the code:

static const int MultiplyDeBruijnBitPosition2[32] =  {   0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9 }; r = MultiplyDeBruijnBitPosition2[(uint32_t)(v * 0x077CB531U) >> 27];

You may be able to adapt this to a direct multiplication-based algorithm for 64-bit inputs; otherwise, simply add one conditional to see if the bit is in the upper 32 positions or the lower 32 positions, then use the 32-bit algorithm here.

Update: Here's at least one 64-bit version I just developed myself, but it uses division (actually modulo).

r = Table[v%67];

For each power of 2, v%67 has a distinct value, so just put your odd primes (or bit indices if you don't want the odd-prime thing) at the right positions in the table. 3 positions (0, 17, and 34) are not used, which might be convenient if you also want to accept all-bits-zero as an input.

Update 2: 64-bit version.

r = Table[(uint64_t)(val * 0x022fdd63cc95386dull) >> 58];

This is my original work, but I got the B(2,6) De Bruijn sequence from this chess site so I can't take credit for anything but figuring out what a De Bruijn sequence is and using Google. ;-)

Some additional remarks on how this works:

The magic number is a B(2,6) De Bruijn sequence. It has the property that, if you look at a 6-consecutive-bit window, you can obtain any six-bit value in that window by rotating the number appropriately, and that each possible six-bit value is obtained by exactly one rotation.

We fix the window in question to be the top 6 bit positions, and choose a De Bruijn sequence with 0's in the top 6 bits. This makes it so we never have to deal with bit rotations, only shifts, since 0's will come into the bottom bits naturally (and we could never end up looking at more than 5 bits from the bottom in the top-6-bits window).

Now, the input value of this function is a power of 2. So multiplying the De Bruijn sequence by the input value performs a bitshift by log2(value) bits. We now have in the upper 6 bits a number which uniquely determines how many bits we shifted by, and can use that as an index into a table to get the actual length of the shift.

This same approach can be used for arbitrarily-large or arbitrarily-small integers, as long as you're willing to implement the multiplication. You simply have to find a B(2,k) De Bruijn sequence where k is the number of bits. The chess wiki link I provided above has De Bruijn sequences for values of k ranging from 1 to 6, and some quick Googling shows there are a few papers on optimal algorithms for generating them in the general case.

153

answered Sep 28 '22 16:09

R.. GitHub STOP HELPING ICE

If performance is a serious issue, then you should use intrinsics/builtins to use CPU specific instructions, such as the ones found here for GCC:

http://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Other-Builtins.html

Built-in function int __builtin_ffs(unsigned int x).

Returns one plus the index of the least significant 1-bit of x, or if x is zero, returns zero.
Built-in function int __builtin_clz(unsigned int x).

Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.
Built-in function int __builtin_ctz(unsigned int x).

Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined.

Things like this are the core of many O(1) algorithms, such as kernel schedulers which need to find the first non-empty queue signified by an array of bits.

Note: I’ve listed the unsigned int versions, but GCC has unsigned long long versions, as well.

answered Sep 28 '22 15:09

Evan Teran

Related questions
                            
                                When to use extern "C" in simple words? [duplicate]
                            
                                Socket programming - What's the difference between listen() and accept()?
                            
                                What API do I call to get the system uptime?
                            
                                Fixed Point Arithmetic in C Programming
                            
                                How can I access a shadowed global variable in C?
                            
                                Determining the complexities given codes
                            
                                Ctrl + C interrupt event handling in Linux
                            
                                Bit manipulations good practices
                            
                                Pointers as function arguments in C
                            
                                Do jagged arrays exist in C/C++?
                            
                                What does static mean in ANSI-C [duplicate]
                            
                                Maximum number of parameters in function declaration
                            
                                K&R Exercise 1-9: output the input, replacing multiple blanks by a single blank
                            
                                How to overwrite stdout in C
                            
                                Why can't we use double pointer to represent two dimensional arrays?
                            
                                passing string array from java to C with JNI
                            
                                C puts() without newline
                            
                                Is there any alternative to using % (modulus) in C/C++?
                            
                                x86_64 ASM - maximum bytes for an instruction?
                            
                                When should we use asserts in C?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With