How to generate random numbers in parallel?

Tags:

I want to generate pseudorandom numbers in parallel using openMP, something like this:

int i;
#pragma omp parallel for
for (i=0;i<100;i++)
{
    printf("%d %d %d\n",i,omp_get_thread_num(),rand());
} 
return 0;

I've tested it on windows and I got huge speedup, but each thread generated exactly the same numbers. I've tested it also on Linux and I got huge slowdown, parallel version on 8core processor was about 10 time slower than sequential, but each thread generated different numbers.

Is there any way to have both speedup and different numbers?

Edit 27.11.2010
I think I've solved it using an idea from Jonathan Dursi post. It seems that following code works fast on both linux and windows. Numbers are also pseudorandom. What do You think about it?

int seed[10];

int main(int argc, char **argv) 
{
int i,s;
for (i=0;i<10;i++)
    seed[i] = rand();

#pragma omp parallel private(s)
{
    s = seed[omp_get_thread_num()];
    #pragma omp for
    for (i=0;i<1000;i++)
    {
        printf("%d %d %d\n",i,omp_get_thread_num(),s);
        s=(s*17931+7391); // those numbers should be choosen more carefully
    }
    seed[omp_get_thread_num()] = s;
}
return 0; 
}

PS.: I haven't accepted any answer yet, because I need to be sure that this idea is good.

240

asked Nov 26 '10 17:11

Tomek Tarczynski

2 Answers

I'll post here what I posted to Concurrent random number generation :

I think you're looking for rand_r(), which explicitly takes the current RNG state as a parameter. Then each thread should have its own copy of seed data (whether you want each thread to start off with the same seed or different ones depends on what you're doing, here you want them to be different or you'd get the same row again and again). There's some discussion of rand_r() and thread-safety here: whether rand_r is real thread safe? .

So say you wanted each thread to have its seed start off with its thread number (which is probably not what you want, as it would give the same results every time you ran with the same number of threads, but just as an example):

#pragma omp parallel default(none)
{
    int i;
    unsigned int myseed = omp_get_thread_num();
    #pragma omp for
    for(i=0; i<100; i++)
            printf("%d %d %d\n",i,omp_get_thread_num(),rand_r(&myseed));
}

Edit: Just on a lark, checked to see if the above would get any speedup. Full code was

#define NRANDS 1000000
int main(int argc, char **argv) {

    struct timeval t;
    int a[NRANDS];

    tick(&t);
    #pragma omp parallel default(none) shared(a)
    {
        int i;
        unsigned int myseed = omp_get_thread_num();
        #pragma omp for
        for(i=0; i<NRANDS; i++)
                a[i] = rand_r(&myseed);
    }
    double sum = 0.;
    double time=tock(&t);
    for (long int i=0; i<NRANDS; i++) {
        sum += a[i];
    }
    printf("Time = %lf, sum = %lf\n", time, sum);

    return 0;
}

where tick and tock are just wrappers to gettimeofday(), and tock() returns the difference in seconds. Sum is printed just to make sure that nothing gets optimized away, and to demonstrate a small point; you will get different numbers with different numbers of threads because each thread gets its own threadnum as a seed; if you run the same code again and again with the same number of threads you'll get the same sum, for the same reason. Anyway, timing (running on a 8-core nehalem box with no other users):

$ export OMP_NUM_THREADS=1
$ ./rand
Time = 0.008639, sum = 1074808568711883.000000

$ export OMP_NUM_THREADS=2
$ ./rand
Time = 0.006274, sum = 1074093295878604.000000

$ export OMP_NUM_THREADS=4
$ ./rand
Time = 0.005335, sum = 1073422298606608.000000

$ export OMP_NUM_THREADS=8
$ ./rand
Time = 0.004163, sum = 1073971133482410.000000

So speedup, if not great; as @ruslik points out, this is not really a compute-intensive process, and other issues like memory bandwidth start playing a role. Thus, only a shade over 2x speedup on 8 cores.

answered Sep 20 '22 17:09

Jonathan Dursi

You cannot use the C rand() function from multiple threads; this results in undefined behavior. Some implementations might give you locking (which will make it slow); others might allow threads to clobber each other's state, possibly crashing your program or just giving "bad" random numbers.

To solve the problem, either write your own PRNG implementation or use an existing one that allows the caller to store and pass the state to the PRNG iterator function.

answered Sep 20 '22 17:09

R.. GitHub STOP HELPING ICE

Related questions
                            
                                Is it possible to cast floats directly to __m128 if they are 16 byte aligned?
                            
                                how does a C-like compiler interpret the if statement
                            
                                Understanding char array[] and string
                            
                                Decay of multidimensional arrays as function parameters
                            
                                Why can't I use sizeof in a preprocessor condition?
                            
                                Genericity vs type-safety? Using void* in C
                            
                                Static variables in C and C++
                            
                                How to deprecate a C pre-processor macro in GCC?
                            
                                How is Assembly used in the modern day (with C/C++ for example)?
                            
                                Pass extra parameter to comparator for qsort
                            
                                What does this cryptic C code snippet do?
                            
                                Universally compiler independent way of implementing an UNUSED macro in C/C++
                            
                                Determine `sizeof float` without compilation
                            
                                How can I add a new argument to the existing variable argument list?
                            
                                Adding opencv processing to gstreamer application
                            
                                Volatile function
                            
                                Import constants from .h file into python
                            
                                Given an encoded message, count the number of ways it can be decoded
                            
                                Consecutive arrays
                            
                                CPU and Data alignment

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How to generate random numbers in parallel?

Tags:

c

random

openmp

Tomek Tarczynski

People also ask

2 Answers

Jonathan Dursi

R.. GitHub STOP HELPING ICE

Recent Activity

Donate For Us