 

How to speed up tricky random number generation

I have some code which performs many log, tan and cos operations on doubles. I need this to be as fast as possible. Currently I use code such as

#include <stdio.h>
#include <stdlib.h>
#include "mtwist.h"
#include <math.h>


int main(void) {
   int i;
   double x;
   mt_seed();
   double u1;
   double u2;
   double w1;
   double w2;
   x = 0;
   for(i = 0; i < 100000000; ++i) {
     u1 = mt_drand();
     u2 = mt_drand();
     w1 = M_PI*(u1-1/2.0);
     w2 = -log(u2);
     x += tan(w1)*(M_PI_2-w1)+log(w2*cos(w1)/(M_PI_2-w1));
   }
   printf("%f\n",x); 

   return EXIT_SUCCESS;
}

I am using gcc.

There are two obvious ways to speed this up. The first is to choose a faster RNG. The second is to speed up the transcendental functions.
To do this I would like to know:

  1. How are tan and cos implemented in assembly on x86? My CPU is an AMD FX-8350, if that makes a difference. (Answer: fcos for cos and fptan for tan.)
  2. How can you use a lookup table to speed up the calculations? I only need 32 bits of accuracy. Could you use a table of size 2^16, for example, to speed up the tan and cos operations?

The Intel optimization manual says

If there is no critical need to evaluate the transcendental functions using the extended precision of 80 bits, applications should consider an alternate, software-based approach, such as a look-up-table-based algorithm using interpolation techniques. It is possible to improve transcendental performance with these techniques by choosing the desired numeric precision and the size of the look-up table, and by taking advantage of the parallelism of the SSE and the SSE2 instructions.

According to this very helpful table, fcos has latency 154 and fptan has latency 166-231.


You can compile my code using

gcc -O3 -Wall random.c mtwist-1.5/mtwist.c -lm -o random

My C code uses the Mersenne Twister C implementation from here. You should be able to run my code to test it. If you can't, please let me know.


Update: @rhashimoto has sped up my code from 20 seconds to 6 seconds!

The RNG seems like it should be possible to speed up. However, in my tests http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html#dSFMT takes exactly the same amount of time (does anyone see anything different?). If anyone can find a faster RNG (that passes all the diehard tests) I would be very grateful.

Please show real timings for any improvement you suggest as that really helps work out what does or does not work.

graffe asked May 23 '14 16:05



1 Answer

You can rewrite

tan(w1)*(M_PI_2-w1)+log(w2*cos(w1)/(M_PI_2-w1))

as

tan(w1)*(M_PI_2-w1) + log(cos(w1)/(M_PI_2-w1)) + log(w2).

You can probably horse around with minimax polynomials for the stuff depending on w1 here. Make up 64 of them or so, each one for 1/64th of the range, and you probably only need degree 3 or 4.

You computed w2 as

w2 = -log(u2);

for a uniform u2 in (0,1). So you're really computing log(log(1/u2)). I bet you can use a similar trick to get piecewise polynomial approximations to log(log(1/x)) on chunks of (0,1). (The function acts scary near 0 and 1, so you might need to do something fancy there instead.)

tmyklebu answered Sep 21 '22 03:09