Creating a mask with N least significant bits set

I would like to create a macro or function¹ mask(n) which, given a number n, returns an unsigned integer with its n least significant bits set. Although this seems like it should be a basic primitive with heavily discussed implementations which compile efficiently, this doesn't seem to be the case.

Of course, various implementations may have different sizes for the primitive integral types like unsigned int, so let's assume for the sake of concreteness that we are talking about returning a uint64_t specifically, although of course an acceptable solution would work (with different definitions) for any unsigned integral type. In particular, the solution should be efficient when the type returned is equal to or smaller than the platform's native width.

Critically, this must work for all n in [0, 64]. In particular mask(0) == 0 and mask(64) == (uint64_t)-1. Many "obvious" solutions don't work for one of these two cases.
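For instance, (1ULL << n) - 1 shifts by 64 when n == 64, and ~0ULL >> (64 - n) shifts by 64 when n == 0; both shifts are undefined for a 64-bit operand. As a baseline for checking candidate solutions, here is a minimal sketch of a slow but well-defined reference. The name mask_ref is hypothetical and used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference implementation: slow, but free of undefined
   behavior because every shift count is exactly 1. */
uint64_t mask_ref(unsigned n)
{
    assert(n <= 64);
    uint64_t m = 0;
    for (unsigned i = 0; i < n; ++i)
        m = (m << 1) | 1u;   /* append one more set bit at the bottom */
    return m;
}
```

Any faster candidate can be compared against mask_ref for every n from 0 to 64.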

The most important criterion is correctness: only correct solutions which don't rely on undefined behavior are interesting.

The second most important criterion is performance: the idiom should ideally compile to approximately the most efficient platform-specific way to do this on common platforms.

A solution that sacrifices simplicity in the name of performance, e.g., that uses different implementations on different platforms, is fine.

¹ The most general case is a function, but ideally it would also work as a macro, without re-evaluating any of its arguments more than once.

Another solution without branching

unsigned long long mask(unsigned n)
{
    return ((1ULL << (n & 0x3F)) & -(n != 64)) - 1;
}

n & 0x3F keeps the shift amount at 63 or less in order to avoid UB. In fact, most modern architectures use only the low bits of the shift amount, so the compiler does not need to emit an and instruction for this.

The checking condition for 64 can be changed to -(n < 64) to make it return all ones for n ⩾ 64, which is equivalent to _bzhi_u64(-1ULL, (uint8_t)n) if your CPU supports BMI2.
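A minimal sketch of that saturating variant, written out as a full function. The name mask_sat and the use of uint64_t are illustrative choices, not part of the original answer:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the saturating variant: all ones for every n >= 64. */
uint64_t mask_sat(unsigned n)
{
    /* All ones when n < 64, zero otherwise. */
    uint64_t keep = -(uint64_t)(n < 64);
    /* For n >= 64 the AND yields 0, and 0 - 1 wraps around to all ones. */
    return ((UINT64_C(1) << (n & 0x3F)) & keep) - 1;
}
```

Unlike the n != 64 version, this one needs no precondition on n, since every out-of-range value maps to the full mask.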

The output from Clang looks better than that from gcc. As it happens, gcc emits conditional instructions for MIPS64 and ARM64 but not for x86-64, resulting in longer output.

Since n is in [0, 64], the condition can also be simplified to n >> 6, which is one only if n = 64. We can then subtract that from the result instead of creating a mask as above:

return (1ULL << (n & 0x3F)) - (n == 64) - 1; // or n >= 64
return (1ULL << (n & 0x3F)) - (n >> 6) - 1;

gcc compiles the latter to

mov     eax, 1
shlx    rax, rax, rdi
shr     edi, 6
dec     rax
sub     rax, rdi
ret
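To see why the subtraction form works at the boundary, here is a sketch of the n >> 6 version with the intermediate values named. The name mask_sub and the local variable names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the subtraction form; valid only for n in [0, 64]. */
uint64_t mask_sub(unsigned n)
{
    assert(n <= 64);
    uint64_t bit   = UINT64_C(1) << (n & 0x3F); /* n == 64 gives 1 << 0 == 1 */
    uint64_t carry = n >> 6;                    /* 1 only when n == 64       */
    /* n < 64:  2^n - 0 - 1 -> low n bits set
       n == 64: 1 - 1 - 1   -> wraps modulo 2^64 to all ones */
    return bit - carry - 1;
}
```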

Some more alternatives:

return ~((~0ULL << (n & 0x3F)) & -(n != 64));
return ((1ULL << (n & 0x3F)) - 1) | -((uint64_t)n >> 6);
return (uint64_t)(((__uint128_t)1 << n) - 1); // if a 128-bit type is available

A similar question for 32 bits: Set last `n` bits in unsigned int

Try

unsigned long long mask(const unsigned n)
{
  assert(n <= 64);
  return (n == 64) ? 0xFFFFFFFFFFFFFFFFULL :
     (1ULL << n) - 1ULL;
}

There are several great, clever answers that avoid conditionals, but a modern compiler can generate code for this that doesn’t branch.

Your compiler will probably inline this on its own, but you can give it a hint with inline or, in C++, constexpr.

The unsigned long long int type is guaranteed to be at least 64 bits wide and present on every implementation, which uint64_t is not.

If you need a macro (because you need something that works as a compile-time constant), that might be:

#define mask(n) ((64U == (n)) ? 0xFFFFFFFFFFFFFFFFULL : (1ULL << (unsigned)(n)) - 1ULL)
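As a sketch of what "works as a compile-time constant" means here, the macro can be checked with C11 _Static_assert and used in a file-scope initializer. The spelling MASK and the constant LOW_BYTE are illustrative names, and a C11 compiler is assumed:

```c
#include <assert.h>

/* Same definition as the macro above; spelled MASK here only to make it
   visually distinct as a macro (the name is an illustrative choice). */
#define MASK(n) ((64U == (n)) ? 0xFFFFFFFFFFFFFFFFULL : (1ULL << (unsigned)(n)) - 1ULL)

/* Checked at compile time (C11 _Static_assert). */
_Static_assert(MASK(0) == 0ULL, "MASK(0) must be zero");
_Static_assert(MASK(64) == 0xFFFFFFFFFFFFFFFFULL, "MASK(64) must be all ones");

/* Usable where a constant expression is required, such as a
   file-scope initializer. */
static const unsigned long long LOW_BYTE = MASK(8);
```

Note that the argument appears twice in the expansion, so an argument with side effects, such as MASK(i++), would be evaluated twice; the function form does not have that problem.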

As several people correctly reminded me in the comments, 1ULL << 64U is potential undefined behavior! So, insert a check for that special case.

You could replace 64U with CHAR_BIT*sizeof(unsigned long long) (and the all-ones literal with ~0ULL) if it is important to you to support the full range of that type on an implementation where it is wider than 64 bits.

You could similarly generate this from an unsigned right shift, such as ~0ULL >> (64 - n), but you would still need a special case, this time for n == 0, since right-shifting by the width of the type is also undefined behavior.
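One possible right-shift form is sketched below, assuming the formulation ~0ULL >> (width - n). In that formulation the shift count equals the type width when n is 0, so that is the case that needs guarding. The name mask_shr is hypothetical:

```c
#include <assert.h>
#include <limits.h>

/* Sketch: build the mask by shifting all ones to the right. */
unsigned long long mask_shr(unsigned n)
{
    const unsigned width = CHAR_BIT * sizeof(unsigned long long);
    assert(n <= width);
    /* n == 0 would shift by the full width, which is undefined,
       so it is handled separately. */
    return (n == 0) ? 0ULL : ~0ULL >> (width - n);
}
```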

ETA:

The relevant portion of the (N1570 Draft) standard says, of both left and right bit shifts:

If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

This tripped me up. Thanks again to everyone in the comments who reviewed my code and pointed the bug out to me.

Here's one that is portable and conditional-free:

unsigned long long mask(unsigned n)
{
    assert (n <= sizeof(unsigned long long) * CHAR_BIT);
    return (1ULL << (n/2) << (n-(n/2))) - 1;
}
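The reason this is safe can be sketched by naming the two shift counts: each is at most half the width, so neither shift is undefined. The name mask_split and the local names lo and hi are illustrative, and the comments assume a 64-bit unsigned long long:

```c
#include <assert.h>
#include <limits.h>

/* Same expression as above, with the two shift counts named. */
unsigned long long mask_split(unsigned n)
{
    assert(n <= sizeof(unsigned long long) * CHAR_BIT);
    unsigned lo = n / 2;   /* at most 32 when n <= 64 */
    unsigned hi = n - lo;  /* at most 32 when n <= 64 */
    /* n == 64: 1 << 32 << 32 shifts the only set bit out, leaving 0,
       and 0 - 1 wraps to all ones in unsigned arithmetic. */
    return (1ULL << lo << hi) - 1;
}
```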
