Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Entropy and parallel random number generator seeding

I have a loop where I am adding noise to some points; these are being later used as the basis for some statistical tests.

The datasets involved are quite large, so I would like to parallelise it using openMP to speed things up. The issue comes up when I want to have multiple PRNGs. I have my own PRNG class based upon NR's modulo method (rand4 I think), but I am unsure how to seed the PRNGs correctly to ensure appropriate entropy

Normalliy I would do something like this

prng.initTimer();

But if I have an array of prngs, one per worker thread, then I cannot simply call initTimer on each instance -- the timer may not change, and the timers being close may introduce correlation.

I need to protect against natural correlations, not against malicious attackers (this is experimental data), so I need to have a safe way of seeding the rng array.

I thought of simply using

prng[0].initTimer()
for(int i=1; i<numRNGs; i++)
     prng[i].init(prng[0].getRandNum());

Then calling my loop, but am unsure if this will introduce correlations in the modulo method.

like image 702
CppUser123 Avatar asked Apr 23 '11 15:04

CppUser123


2 Answers

Seeding PRNGs doesn't necessary create independent streams. You should seed only the the first instance (call it reference) and initialise the remaining instances by fast forwarding the reference instance. This only works if you know how many random numbers each thread will consume and the fast forwarding algorithm is available.

I don't know much about your rand4 (googled it, but nothing specific came up), but you shouldn't assume that it is possible to create independent streams just by seeding. You probably want to use different (a better) PRNG. Take a look at WELL. It is fast, has good statistical properties and developed by well know experts. WELL 512 and 1024 are one of the fastest PRNGs available and both have huge periods. You can initialise several WELL instances with distinct seeds in order to create independent streams. Thanks to huge period there is almost zero chance that your PRNGs will generate overlapping streams of random numbers.

If your PRNGs are called frequently, beware of false sharing. This Herb Sutter's article explains how false sharing can kill multi-core performance. Packing multiple PRNGs into a contiguous array is almost a perfect recipe for false sharing. In order to avoid false sharing either add padding between PRNGs or allocate PRNGs on heap/free store. In the later case each RNG should be allocated individually using some sort of aligned allocator. Your compiler should provide a version of aligned malloc. Check the docs (well, googling is actually faster than reading manuals). Visual C++ has _aligned_malloc, GCC has memalign and posix_memalign. The aliment value must be a multiple of CPU's cache line size. The common practice is to align along 128 byte boundaries. For portable solution you can use TBB's cache aligned allocator.

like image 176
pic11 Avatar answered Oct 27 '22 03:10

pic11


I think it depends on the properties of your PRNG. Usual PRNGs weaknesses are lower entropy in the lower bits and lower entropy for the first n values. So I think you should check your PRNG for such weaknesses and change your code accordingly.

Perhaps some of the diehard tests give useful information, but you can also check the first n values and their statistical properties like sum and variance yourself and compare them to the expected values.

For example, seed the PRNG and sum up the first 100 values modulo 11 of your PRNG, repeat this R times. If the total sum is very different from the expected (5*100*R), your PRNG suffers from one or both weaknesses mentioned above.

Knowing nothing about the PRNG, I'd feel safer using something like this:

prng[0].initTimer();
// Throw the first 100 values away
for(int i=1; i < 100; i++)
   prng[0].getRandNum();
// Use only higher bits for seed values (assuming 32 bit size)
for(int i=1; i<numRNGs; i++)
   prng[i].init(((prng[0].getRandNum() >> 16) << 16)
               + (prng[0].getRandNum() >> 16));

But of course, these are speculations about the PRNG. With an ideal PRNG, your approach should work fine as it is.

like image 42
schnaader Avatar answered Oct 27 '22 04:10

schnaader