I would like to create a macro or function1mask(n)
which given a number n
returns an unsigned integer with its n
least significant bits set. Although this seems like it should be a basic primitive with heavily discussed implementations which compile efficiently - this doesn't seem to be the case.
Of course, various implementations may have different sizes for the primitive integral types like unsigned int
, so let's assume for the sake of concreteness that we are talking returning a uint64_t
specifically although of course an acceptable solutions would work (with different definitions) for any unsigned integral type. In particular, the solution should be efficient when the type returned is equal to or smaller than the platform's native width.
Critically, this must work for all n
in [0, 64]. In particular mask(0) == 0
and mask(64) == (uint64_t)-1
. Many "obvious" solutions don't work for one of these two cases.
The most important criteria is correctness: only correct solutions which don't rely on undefined behavior are interesting.
The second most important criteria is performance: the idiom should ideally compile to approximately the most efficient platform-specific way to do this on common platforms.
A solution that sacrifices simplicity in the name of performance, e.g., that uses different implementations on different platforms, is fine.
1 The most general case is a function, but ideally it would also work as a macro, without re-evaluating any of its arguments more than once.
To be sure you get the right bit/value: The value at the least significant bit position = x & 1. The value of the isolated least significant 1 = x & -x. The zero-based index of the isolated least significant 1 = log2(x & -x)
This is done by considering a value 'x'. We can perform x|=x<<i for setting a bit. We shift 'a' in the left direction bit by bit, then perform the bitwise operation. To unset the bit, there must be a bit that is already set by the user or default.
range = (((1 << (l - 1)) - 1) ^ ((1 << (r)) - 1)); 2. Now, perform "n = n | range". This will set the bits in the range from l to r in n.
Another solution without branching
unsigned long long mask(unsigned n)
{
return ((1ULL << (n & 0x3F)) & -(n != 64)) - 1;
}
n & 0x3F
keeps the shift amount to maximum 63 in order to avoid UB. In fact most modern architectures will just grab the lower bits of the shift amount, so no and
instruction is needed for this.
The checking condition for 64 can be changed to -(n < 64)
to make it return all ones for n ⩾ 64, which is equivalent to _bzhi_u64(-1ULL, (uint8_t)n)
if your CPU supports BMI2.
The output from Clang looks better than gcc. As it happens gcc emits conditional instructions for MIPS64 and ARM64 but not for x86-64, resulting in longer output
The condition can also be simplified to n >> 6
, utilizing the fact that it'll be one if n = 64. And we can subtract that from the result instead of creating a mask like above
return (1ULL << (n & 0x3F)) - (n == 64) - 1; // or n >= 64
return (1ULL << (n & 0x3F)) - (n >> 6) - 1;
gcc compiles the latter to
mov eax, 1
shlx rax, rax, rdi
shr edi, 6
dec rax
sub rax, rdi
ret
Some more alternatives
return ~((~0ULL << (n & 0x3F)) << (n == 64));
return ((1ULL << (n & 0x3F)) - 1) | (((uint64_t)n >> 6) << 63);
return (uint64_t)(((__uint128_t)1 << n) - 1); // if a 128-bit type is available
A similar question for 32 bits: Set last `n` bits in unsigned int
Try
unsigned long long mask(const unsigned n)
{
assert(n <= 64);
return (n == 64) ? 0xFFFFFFFFFFFFFFFFULL :
(1ULL << n) - 1ULL;
}
There are several great, clever answers that avoid conditionals, but a modern compiler can generate code for this that doesn’t branch.
Your compiler can probably figure out to inline this, but you might be able to give it a hint with inline
or, in C++, constexpr
.
The unsigned long long int
type is guaranteed to be at least 64 bits wide and present on every implementation, which uint64_t
is not.
If you need a macro (because you need something that works as a compile-time constant), that might be:
#define mask(n) ((64U == (n)) ? 0xFFFFFFFFFFFFFFFFULL : (1ULL << (unsigned)(n)) - 1ULL)
As several people correctly reminded me in the comments, 1ULL << 64U
is potential undefined behavior! So, insert a check for that special case.
You could replace 64U
with CHAR_BITS*sizeof(unsigned long long)
if it is important to you to support the full range of that type on an implementation where it is wider than 64 bits.
You could similarly generate this from an unsigned right shift, but you would still need to check n == 64
as a special case, since right-shifting by the width of the type is undefined behavior.
The relevant portion of the (N1570 Draft) standard says, of both left and right bit shifts:
If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
This tripped me up. Thanks again to everyone in the comments who reviewed my code and pointed the bug out to me.
Here's one that is portable and conditional-free:
unsigned long long mask(unsigned n)
{
assert (n <= sizeof(unsigned long long) * CHAR_BIT);
return (1ULL << (n/2) << (n-(n/2))) - 1;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With