Recent Intel chips have instructions for generating (pseudo)random bits: RDRAND (Ivy Bridge and later) and RDSEED (Broadwell and later). RDSEED outputs "true" random bits generated from entropy gathered by an on-chip entropy source. RDRAND outputs bits generated from a pseudorandom number generator seeded by the true random number generator. According to Intel's documentation, RDSEED is slower, since gathering entropy is costly. Thus, RDRAND is offered as a cheaper alternative, and its output is sufficiently secure for most cryptographic applications. (This is analogous to /dev/random versus /dev/urandom on Unix systems.)
I was curious about the performance difference between the two instructions, so I wrote some code to compare them. To my surprise, I found virtually no difference in performance. Could anyone provide an explanation? Code and system details follow.
/* Compare the performance of RDSEED and RDRAND.
 *
 * Compute the CPU time used to fill a buffer with (pseudo) random bits
 * using each instruction.
 *
 * Compile with: gcc -mrdrnd -mrdseed
 */
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <x86intrin.h>
#define BUFSIZE (1<<24)
int main() {
    unsigned int ok, i;
    unsigned long long *rand = malloc(BUFSIZE*sizeof(unsigned long long)),
                       *seed = malloc(BUFSIZE*sizeof(unsigned long long));
    clock_t start, end, bm;
    // RDRAND (the benchmark)
    start = clock();
    for (i = 0; i < BUFSIZE; i++) {
        ok = _rdrand64_step(&rand[i]);
    }
    bm = clock() - start;
    printf("RDRAND: %li\n", bm);
    // RDSEED
    start = clock();
    for (i = 0; i < BUFSIZE; i++) {
        ok = _rdseed64_step(&seed[i]);
    }
    end = clock();
    printf("RDSEED: %li, %.2lf\n", end - start, (double)(end-start)/bm);
    free(rand);
    free(seed);
    return 0;
}
You aren't checking the return value, so you don't know how many actual random numbers you have generated. With retry, as Florian suggested, the RDSEED version is more than 3 times slower:
RDRAND: 1989817
RDSEED: 6636792, 3.34
Under the covers, the hardware entropy source probably generates only at a limited rate, and this causes RDSEED to fail when called at a rate faster than the entropy can regenerate. RDRAND, on the other hand, is only generating a pseudo-random sequence based on periodic re-seeding, so it is unlikely to fail.
Here is the modified code excerpt:
// RDRAND (the benchmark)
start = clock();
for (i = 0; i < BUFSIZE; i++) {
    while (!_rdrand64_step(&rand[i]))
        ;
}
bm = clock() - start;
printf("RDRAND: %li\n", bm);
// RDSEED
start = clock();
for (i = 0; i < BUFSIZE; i++) {
    while (!_rdseed64_step(&seed[i]))
        ;
}
end = clock();
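To see how many values the original single-pass loop really produced, one option is to count the successful calls instead of discarding the return value. The sketch below is only illustrative: `step_fn`, `fake_rdseed_step`, and `count_successes` are hypothetical names, and `fake_rdseed_step` is a simulated source (it succeeds on every third call) so that the example runs on any machine. On real hardware you would pass a thin wrapper around `_rdseed64_step` or `_rdrand64_step`, which use the same "1 on success, 0 on failure" convention.

```c
#include <stddef.h>

/* Same shape as _rdrand64_step / _rdseed64_step:
 * returns 1 and stores a value on success, 0 on failure. */
typedef int (*step_fn)(unsigned long long *);

/* HYPOTHETICAL stand-in for _rdseed64_step. It succeeds only on every
 * third call, mimicking an entropy source that cannot keep up with
 * back-to-back reads. The stored value is a placeholder, not random. */
static int fake_rdseed_step(unsigned long long *out) {
    static unsigned long long calls = 0;
    calls++;
    if (calls % 3 != 0)
        return 0;              /* "entropy not ready" */
    *out = calls;
    return 1;
}

/* Call step exactly once per slot (as the original benchmark does) and
 * report how many slots actually received a value. */
static size_t count_successes(step_fn step, unsigned long long *buf, size_t n) {
    size_t ok = 0;
    for (size_t i = 0; i < n; i++) {
        buf[i] = 0;            /* make unfilled slots easy to spot */
        if (step(&buf[i]))
            ok++;
    }
    return ok;
}
```

If the count is well below the buffer size for RDSEED but equal to it for RDRAND, the two timings in the original program are not measuring the same amount of work.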
For me, on a Core m7-6Y75, the RDSEED in your test program occasionally fails (I added two assert(ok); statements, and the second one fails occasionally). Correct code would retry, resulting in a performance difference in favor of RDRAND. (Retrying is required for RDRAND as well, but it does not seem to happen in practice, so RDRAND is faster.)
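An unbounded retry loop will spin forever if the instruction never succeeds, so a bounded retry is a common alternative; Intel's DRNG software implementation guide suggests a small retry limit (10) for RDRAND. The following is a minimal sketch under stated assumptions, not a definitive implementation: `step_fn`, `step_with_retry`, `fill_with_retry`, `flaky_step`, and `dead_step` are hypothetical names, and the two step functions are simulated sources standing in for `_rdseed64_step` so the code runs without the hardware.

```c
#include <stddef.h>

/* Same shape as _rdrand64_step / _rdseed64_step. */
typedef int (*step_fn)(unsigned long long *);

/* HYPOTHETICAL stand-in: fails twice, then succeeds, repeatedly. */
static int flaky_step(unsigned long long *out) {
    static unsigned calls = 0;
    if (++calls % 3 != 0)
        return 0;
    *out = 0x1234u;            /* placeholder value, not random */
    return 1;
}

/* HYPOTHETICAL stand-in for a source that never succeeds. */
static int dead_step(unsigned long long *out) {
    (void)out;
    return 0;
}

/* Try step up to max_tries times.
 * Returns 1 on success, 0 if every attempt failed. */
static int step_with_retry(step_fn step, unsigned long long *out,
                           unsigned max_tries) {
    for (unsigned t = 0; t < max_tries; t++) {
        if (step(out))
            return 1;
    }
    return 0;
}

/* Fill buf with n values, retrying each slot up to max_tries times.
 * Returns the number of slots filled; stops at the first slot that
 * could not be filled, so a short count signals failure. */
static size_t fill_with_retry(step_fn step, unsigned long long *buf,
                              size_t n, unsigned max_tries) {
    size_t i;
    for (i = 0; i < n; i++) {
        if (!step_with_retry(step, &buf[i], max_tries))
            break;
    }
    return i;
}
```

A caller would compare the return value of `fill_with_retry` with `n` and treat a short count as an error rather than using the partially filled buffer.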