Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do distributions of C++11 class <random> transform the underlying generator?

Tags:

c++

random

c++11

The following code does not seem to behave intuitively:

#include <random>
#include <iostream>
using namespace std;

int main()
{

  mt19937 MyGenerator(40);
  auto gauss = normal_distribution<double>(0,1);
  auto linear = uniform_real_distribution<double>(0,1);
  cout << gauss(MyGenerator) << endl; //line a
  cout << linear(MyGenerator) << endl; //line b
  cout << gauss(MyGenerator) << endl;
}

Running this code gives the output

-0.816097
 0.705030
 0.303032.

If now the order of lines a and b is swapped, the output changes to

 0.644008
 0.338080
-0.639501.

It is completely clear that the first two numbers are different now, as they are produced by different distributions. Nevertheless, why is the third number different? In my intuition, the distribution should grab a number c = MyGenerator() which is then mapped to the random number in the specific range. The random number generator would point to the next number in the sequence of numbers after the distribution call. So, shouldn't the outcome of the third call be the same in both cases?

Another observation: Adding a forth call to either of the distributions does in fact seem to reproduce the same numbers.

like image 830
user3058865 Avatar asked Sep 05 '14 10:09

user3058865


2 Answers

libstdc++'s implementation of normal_distribution uses the Marsaglia polar method. The interesting thing about this method is that each pass uses two random numbers from the URNG to generate two results.

That is, the first call to the distribution calls the URNG twice (possibly more times, as it uses rejection sampling, but an even number of times) and returns one result; the following call to the distribution will not call the URNG but will return the saved second result.

Here's an extract from the source code, slightly reformatted:

if (_M_saved_available)
{
    _M_saved_available = false;
    ret = _M_saved;
}
else
{
    result_type x, y, r2;
    do
    {
        x = result_type(2.0) * aurng() - 1.0;
        y = result_type(2.0) * aurng() - 1.0;
        r2 = x * x + y * y;
    }
    while (r2 > 1.0 || r2 == 0.0);

    const result_type mult = std::sqrt(-2 * std::log(r2) / r2);
    _M_saved = x * mult;
    _M_saved_available = true;
    ret = y * mult;
}
like image 53
ecatmur Avatar answered Nov 17 '22 01:11

ecatmur


There's no requirement that the distribution call the underlying generator once for each value. Some distributions are best calculated by combining multiple random values.

For example, in the GNU implementation, the implemantation of the uniform distribution is

return (__aurng() * (__p.b() - __p.a())) + __p.a();

calling the generator __aurng once; while the core of the normal distribution is:

do
{
    __x = result_type(2.0) * __aurng() - 1.0;
    __y = result_type(2.0) * __aurng() - 1.0;
    __r2 = __x * __x + __y * __y;
}
while (__r2 > 1.0 || __r2 == 0.0);

calling it at least twice.

like image 28
Mike Seymour Avatar answered Nov 17 '22 00:11

Mike Seymour