Today I thought: well, even if there is great suspicion on RDRAND implementation of NIST SP 800-90A, it is still a hardware implementation of pseudo-random number generator (PRNG) that must be good enough for non-sensitive applications. So I thought of using it on my game instead of Mersenne Twister.
So, to see if there was any performance gain on using the instruction, I compared the time of the two following codes:
// test.cpp
#include <cstdio>
int main()
{
unsigned int rnd = 0;
for(int i = 0; i < 10000000; ++i) {
__builtin_ia32_rdrand32_step(&rnd);
}
printf("%x\n", rnd);
}
and
//test2.cpp
#include <cstdio>
#include <random>
int main()
{
unsigned int rnd = 0;
__builtin_ia32_rdrand32_step(&rnd);
std::mt19937 gen(rnd);
for(int i = 0; i < 10000000; ++i) {
rnd ^= gen();
}
printf("%x\n", rnd);
}
and by running the two I get:
$ time ./test
d230449a
real 0m0.361s
user 0m0.358s
sys 0m0.002s
$ time ./test2
bfc4e472
real 0m0.051s
user 0m0.050s
sys 0m0.002s
So, Mersenne Twister is much faster than RDRAND on my CPU. Well, I was disappointed, ruled out from my game. But RDRAND is a cryptographically secure PRNG (CSPRNG), so it does much behind the scenes... more fair would be compare it to other CSPRNG. So I took my Rabbit implementation (plain translation of the RFC to C, no fancy tricks for performance), and wrote the following test:
// test3.cpp
#include <cstdio>
extern "C"
{
#include "rabbit.h"
}
int main()
{
rabbit_state s;
unsigned long long buf[2];
__builtin_ia32_rdrand64_step(&buf[0]);
__builtin_ia32_rdrand64_step(&buf[1]);
rabbit_init_key(&s, (uint8_t*)&buf[0]);
for(int i = 0; i < 10000000; ++i) {
rabbit_extract(&s, (uint8_t*)&buf[0]);
}
printf("%llx\n", buf[0]);
}
And for my surprise, generating twice as much pseudo-random data as the first two of them, I got a better time than RDRAND:
$ time ./test3
8ef9772277b70aba
real 0m0.344s
user 0m0.341s
sys 0m0.002s
All three were compiled with optimization enabled.
So, we have a widespread paranoia that RDRAND was made to embed NSA backdoors into everybody's software cryptography. Also we have at least one software CSPRNG faster than RDRAND, and the most widely used decent PRNG, Mersenne Twister, is much faster than RDRAND. Finally, we have open-source auditable software entropy pools, like /dev/random
and /dev/urandom
, that are not hidden behind twofold scrambler layers of AES, like RDRAND.
So, the question: should people be using RDRAND? Is there any legitimate use for it? Or should we stop using it altogether?
( RDRAND is available in Ivy Bridge processors and is part of the Intel 64 and IA-32 instruction set architectures.) The random number generator is compliant with security and cryptographic standards such as NIST SP 800-90A, FIPS 140-2, and ANSI X9.
Intel RDRAND, a silicon-based TRNG, is supported by wolfSSL. Additionally, wolfSSL supports the following hardware systems involving TRNGs: Espressif ESP32-WROOM-32. Intel SGX.
Yes, RDSEED is a true random number generator. From Wikipedia: The entropy source for the RDSEED instruction runs asynchronously on a self-timed circuit and uses thermal noise within the silicon to output a random stream of bits at the rate of 3 GHz [...].
RDSEED outputs "true" random bits generated from entropy gathered from a sensor on the chip. RDRAND outputs bits generated from a pseudorandom number generator seeded by the true random number generator.
As pointed out in the other answer, RDRAND is seeded with true randomness. In particular, it frequently reseeds its internal CSPRNG with 128 bits of hardware-generated randomness, guaranteeing a reseed at least once every 511 * 128 bits. See section 4.2.5 of this doc:
https://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide
So in your examples, you used a single 128-bit seed to generate 10 million random draws from rabbit_extract. In the RDRAND version, you had the equivalent of 2.5 million 128-bit draws, meaning that the CSPRING was reseeded at least 2,500,000/511 = 4,892 times.
So instead of 128 bits of entropy going into your rabbit example, there were at least 4,892*128 = 626,176 bits of entropy going into the RDRAND example.
That's much, much more entropy than you're going to get in 0.361 seconds without hardware support. That could matter if you're doing stuff where lots of real randomness is important. One example is Shamir secret sharing of large quantities of data -- not sure if there are others.
So in conclusion -- it's not for speed, it's for high security. The question of whether it's backdoored is troubling, of course, but you can always XOR it with other sources, and at the very least it's not hurting you.
RDRAND
is not just a PRNG. It is a whitened TRNG that is FIPS compliant. The difference is that you can rely on RDRAND
to contain quite a lot of actual entropy directly retrieved from the CPU. So the main use of RDRAND
is to supply entropy to OS/libraries/applications.
The only other good way for applications to retrieve entropy is usually using an OS supplied entropy source such as /dev/random
or /dev/urandom
(which usually draws the entropy from /dev/random
). However, that OS also requires to find the entropy somewhere. Usually tiny differences in disk and network access times are used for this (+ other semi-random input). These devices are not always present, and are not designed as sources of entropy; they are often not very good sources, nor are they very fast. So on systems that support it, RDRAND
is often used as an entropy source for the cryptographically secure random number generator of the operating system.
With regards to speed, especially for games, it is completely valid to use a (non-secure) PRNG. If you want to have a reasonable random seed then seeding it with the result of RDRAND
may be a good idea, although seeding it from the OS supplied RNG may be a more portable and even a more secure option (in case you don't fully trust Intel or the US).
Note that currently RDRAND is implemented using (AES) CTR_DRBG instead of a (less well analysed) stream cipher that was created for speed such as Rabbit, so it should come as no surprise that Rabbit is faster. Even more importantly, it also has to retrieve the entropy from the entropy source within the CPU before it can run.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With