Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++ super fast thread-safe rand function

void NetClass::Modulate(vector <synapse> & synapses )
{
    int size = synapses.size();
    int split = 200 * 0.5;

    for(int w=0; w < size; w++)
        if(synapses[w].active)
            synapses[w].rmod = ((rand_r(seedp) % 200 - split ) / 1000.0);
}

The function rand_r(seedp) is seriously bottle-necking my program. Specifically, its slowing me by 3X when run serialy, and 4.4X when run on 16 cores. rand() is not an option because its even worse. Is there anything I can do to streamline this? If it will make a difference, I think I can sustain a loss in terms of statistical randomness. Would pre-generating (before execution) a list of random numbers and then loading to the thread stacks be an option?

like image 867
Matt Munson Avatar asked Nov 27 '11 10:11

Matt Munson


2 Answers

Problem is that seedp variable (and its memory location) is shared among several threads. Processor cores must synchronize their caches each time they access this ever changing value, which hampers performance. The solution is that all threads work with their own seedp, and so avoid cache synchronization.

like image 125
Dialecticus Avatar answered Oct 04 '22 18:10

Dialecticus


It depends on how good the statistical randomness needs to be. For high quality, the Mersenne twister, or its SIMD variant, is a good choice. You can generate and buffer a large block of pseudo-random numbers at a time, and each thread can have its own state vector. The Park-Miller-Carta PRNG is extremely simple - these guys even implemented it as a CUDA kernel.

like image 23
Brett Hale Avatar answered Oct 04 '22 19:10

Brett Hale