I'm working with some bigint public-key cryptography code. Is it safe to use bitwise masking to ensure that the calculation timing and memory addresses accessed are independent of the data values?
Is this technique vulnerable to side-channel attacks based on instruction timing, power, RF emissions, or other things I'm unaware of? (For reference, I'm aware of techniques like RSA blinding, EC Montgomery ladder, cache flushing, and such.)
Example of straightforward code (C/C++):
uint a = (...), b = (...);
if (a < b)
    a += b;
Now translated to use constant-time masking:
uint a = (...), b = (...);
uint mask = -(uint)(a < b);
a = ((a + b) & mask) | (a & ~mask);
Note that a < b evaluates to 0 or 1, so mask is either 0x00000000 or 0xFFFFFFFF.
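For illustration, the same idiom can be wrapped in a small branch-free select helper (ct_select is just my own name for it, not from any library):

// Sketch: returns x if flag == 1, or y if flag == 0, without branching.
// Assumes flag is exactly 0 or 1.
uint32_t ct_select(uint32_t flag, uint32_t x, uint32_t y) {
    uint32_t mask = -flag;   // 0x00000000 or 0xFFFFFFFF
    return (x & mask) | (y & ~mask);
}

The example above then becomes a = ct_select((uint32_t)(a < b), a + b, a);.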
Similarly, for a high-level operation (C++):
Integer x = (...);
if (x.isFoo())
    x.doBar();
Is the following an acceptable safe translation?
Integer x = (...);
uint mask = -(uint)x.isFoo(); // Assume this is constant-time
Integer y(x); // Copy constructor
y.doBar(); // Assume this is constant-time
x.replace(y, mask); // Assume this uses masking
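For concreteness, I imagine replace doing something like the following word-by-word masked copy (a sketch only; the members n and words are my assumptions about Integer's internals):

// Sketch: overwrite *this with src when mask is all-ones, keep *this
// when mask is zero. The sequence of memory accesses is identical in
// both cases; only the values written differ.
void Integer::replace(const Integer &src, uint mask) {
    for (size_t i = 0; i < n; i++)
        words[i] = (src.words[i] & mask) | (words[i] & ~mask);
}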
This technique may be safe... if the operations we assume to take constant time really do, and if the compiler doesn't change the code to do something else instead.
In particular, let's take a look at your first example:
uint a = (...), b = (...);
uint mask = -(uint)(a < b);
a = ((a + b) & mask) | (a & ~mask);
I see two somewhat plausible ways in which this could fail to run in constant time:
The comparison a < b might or might not take constant time, depending on the compiler (and CPU). If it's compiled to simple bit manipulation, it may be constant-time; if it's compiled to use a conditional jump, it may well not be.
At high optimization levels, it's possible that a too-clever compiler might detect what's happening (say, by splitting the code into two paths based on the comparison, and optimizing them separately before merging them back) and "optimize" it back into the non-constant time code we were trying to avoid.
(Of course, it's also possible that a sufficiently clever compiler could optimize the naïve, seemingly non-constant time code into a constant-time operation, if it thought that would be more efficient!)
One possible way to avoid the first issue would be to replace the comparison with explicit bit manipulation, as in:
uint32_t a = (...), b = (...);
uint32_t mask = -((a - b) >> 31);
a = ((a + b) & mask) | (a & ~mask);
However, note that this is only equivalent to your original code if we can be sure that a and b differ by less than 2^31. If that is not guaranteed, we'd have to cast the variables to a wider type before the subtraction, e.g.:
uint32_t mask = (uint32_t)(( (uint64_t)a - (uint64_t)b ) >> 32);
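Put together, the full masked addition would then look something like this (a sketch, keeping your (...) placeholders; note that no negation is needed here, since the high 32 bits of the 64-bit difference are already all-ones exactly when a < b):

uint32_t a = (...), b = (...);
// Widen before subtracting so the borrow lands in the high word.
uint32_t mask = (uint32_t)(((uint64_t)a - (uint64_t)b) >> 32);
a = ((a + b) & mask) | (a & ~mask);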
All that said, even this is not foolproof, as the compiler could still decide to turn this code into something that is not constant-time. (For instance, 64-bit subtraction on a 32-bit CPU could potentially take variable time depending on whether there's a borrow, which is precisely what we're trying to hide here.)
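One common (if compiler-specific and not guaranteed by any standard) defense against such interference is an optimization barrier: passing the mask through an empty inline-assembly statement, which GCC and Clang must treat as potentially modifying the value, so they can no longer trace it back to the comparison and reintroduce a branch. A minimal sketch:

uint32_t mask = -(uint32_t)(a < b);
// Empty asm that claims to read and write mask: the compiler must now
// compute mask exactly as written and cannot "see through" it.
__asm__ volatile ("" : "+r" (mask));
a = ((a + b) & mask) | (a & ~mask);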
In general, the only way to make sure that such timing leaks don't occur is to:
inspect the generated assembly code manually (e.g. looking for jump instructions where you didn't expect any), and
actually benchmark the code to verify that it does, indeed, take the same time to run regardless of the inputs.
Obviously, you'll also need to do this separately for each combination of compiler and target platform that you wish to support.
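For the benchmarking step, a minimal harness might look like the following (a sketch using POSIX clock_gettime; a serious measurement needs many samples, a quiet machine, and a statistical comparison, along the lines of the dudect tool):

#include <stdint.h>
#include <stdio.h>
#include <time.h>

// Sketch: time many iterations of the masked operation for one input
// class, then compare the results for inputs with a < b vs. a >= b.
static uint64_t time_many(uint32_t a, uint32_t b, int iters) {
    struct timespec t0, t1;
    volatile uint32_t sink = 0;   // keeps the loop from being optimized away
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++) {
        uint32_t mask = -(uint32_t)(a < b);
        sink ^= ((a + b) & mask) | (a & ~mask);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (uint64_t)(t1.tv_sec - t0.tv_sec) * 1000000000u
         + (uint64_t)(t1.tv_nsec - t0.tv_nsec);
}

int main(void) {
    printf("a <  b: %llu ns\n", (unsigned long long)time_many(1, 2, 1000000));
    printf("a >= b: %llu ns\n", (unsigned long long)time_many(2, 1, 1000000));
    return 0;
}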