Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parallel loops and Random produce odd results

I just started playing with the Task Parallel Library, and ran into interesting issues; I have a general idea of what is going on, but would like to hear comments from people more competent than me to help understand what is happening. My apologies for the somewhat lengthy code.

I started with a non-parallel simulation of a random walk:

 var random = new Random();
 Stopwatch stopwatch = new Stopwatch();

 stopwatch.Start();

 var simulations = new List<int>();
 for (var run = 0; run < 20; run++)
 {
    var position = 0;
    for (var step = 0; step < 10000000; step++)
    {
       if (random.Next(0, 2) == 0)
       {
          position--;
       }
       else
       {
          position++;
       }
    }

    Console.WriteLine(string.Format("Terminated run {0} at position {1}.", run, position));
    simulations.Add(position);
 }

 Console.WriteLine(string.Format("Average position: {0} .", simulations.Average()));
 stopwatch.Stop();

 Console.WriteLine(string.Format("Time elapsed: {0}", stopwatch.ElapsedMilliseconds));
 Console.ReadLine();

I then wrote my first attempt at a parallel loop:

 var localRandom = new Random();

 stopwatch.Reset();
 stopwatch.Start();

 var parallelSimulations = new List<int>();
 Parallel.For(0, 20, run =>
 {
    var position = 0;
    for (var step = 0; step < 10000000; step++)
    {
       if (localRandom.Next(0, 2) == 0)
       {
          position--;
       }
       else
       {
          position++;
       }
    }

    Console.WriteLine(string.Format("Terminated run {0} at position {1}.", run, position));
    parallelSimulations.Add(position);
 });


 Console.WriteLine(string.Format("Average position: {0} .", parallelSimulations.Average()));
 stopwatch.Stop();

 Console.WriteLine(string.Format("Time elapsed: {0}", stopwatch.ElapsedMilliseconds));

 Console.ReadLine();

When I ran it on a virtual machine set to use 1 core only, I observed a similar duration, but the runs are no longer processed in order - no surprise.

When I ran it on a dual-core machine, things went odd. I saw no improvement in time, and observed some very weird results for each run. Most runs end up with results of -1,000,000, (or very close), which indicates that Random.Next is returning 0 quasi all the time.

When I make the random local to each loop, everything works just fine, and I get the expected duration improvement:

Parallel.For(0, 20, run =>
         {
            var localRandom = new Random();
            var position = 0; 

My guess is that the problem has to do with the fact that the Random object is shared between the loops, and has some state. The lack of improvement in duration in the "failing parallel" version is I assume due to that fact that the calls to Random are not processed in parallel (even though I see that the parallel version uses both cores, whereas the original doesn't). The piece I really don't get is why the simulation results are what they are.

One separate worry I have is that if I use Random instances local to each loop, I may run into the problem of having multiple loops starting with the same seed (the issue you get when you generate multiple Randoms too close in time, resulting in identical sequences).

Any insight in what is going on would be very valuable to me!

like image 873
Mathias Avatar asked May 27 '10 20:05

Mathias


3 Answers

Neither of these approaches will give you really good random numbers.

This blog post covers a lot of approaches for getting better random numbers with Random

Link

These may be fine for many day to day applications.

However if you use the same random number generator on multiple threads even with different seeds you will still impact the quality of your random numbers. This is because you are generating sequences of pseudo-random numbers which may overlap.

This video explains why in a bit more detail:

http://software.intel.com/en-us/videos/tim-mattson-use-and-abuse-of-random-numbers/

If you want really random numbers then you really need to use the crypto random number generator System.Security.Cryptography.RNGCryptoServiceProvider. This is threadsafe.

like image 164
Ade Miller Avatar answered Nov 07 '22 06:11

Ade Miller


The Random class is not thread-safe; if you use it on multiple threads, it can get messed up.

You should make a separate Random instance on each thread, and make sure that they don't end up using the same seed. (eg, Environment.TickCount * Thread.CurrentThread.ManagedThreadId)

like image 38
SLaks Avatar answered Nov 07 '22 06:11

SLaks


One core problem:

  • random.Next is not thread safe.

Two ramifications:

  1. Quality of the randomness is destroyed by race conditions.
  2. False sharing destroys scalability on multicores.

Several possible solutions:

  • Make random.Next thread safe: solves quality issue but not scalability.
  • Use multiple PRNGs: solves scalability issue but may degrade quality.
  • ...
like image 39
J D Avatar answered Nov 07 '22 06:11

J D