The following code does not seem to behave intuitively:
#include <random>
#include <iostream>
using namespace std;
int main()
{
  mt19937 MyGenerator(40);
  auto gauss = normal_distribution<double>(0,1);
  auto linear = uniform_real_distribution<double>(0,1);
  cout << gauss(MyGenerator) << endl; //line a
  cout << linear(MyGenerator) << endl; //line b
  cout << gauss(MyGenerator) << endl;
}
Running this code gives the output
-0.816097
 0.705030
 0.303032.
If now the order of lines a and b is swapped, the output changes to
 0.644008
 0.338080
-0.639501.
It is completely clear that the first two numbers are different now, as they are produced by different distributions. Nevertheless, why is the third number different? In my intuition, the distribution should grab a number c = MyGenerator() which is then mapped to the random number in the specific range. The random number generator would point to the next number in the sequence of numbers after the distribution call. So, shouldn't the outcome of the third call be the same in both cases?
Another observation: Adding a forth call to either of the distributions does in fact seem to reproduce the same numbers.
libstdc++'s implementation of normal_distribution uses the Marsaglia polar method. The interesting thing about this method is that each pass uses two random numbers from the URNG to generate two results.
That is, the first call to the distribution calls the URNG twice (possibly more times, as it uses rejection sampling, but an even number of times) and returns one result; the following call to the distribution will not call the URNG but will return the saved second result.
Here's an extract from the source code, slightly reformatted:
if (_M_saved_available)
{
    _M_saved_available = false;
    ret = _M_saved;
}
else
{
    result_type x, y, r2;
    do
    {
        x = result_type(2.0) * aurng() - 1.0;
        y = result_type(2.0) * aurng() - 1.0;
        r2 = x * x + y * y;
    }
    while (r2 > 1.0 || r2 == 0.0);
    const result_type mult = std::sqrt(-2 * std::log(r2) / r2);
    _M_saved = x * mult;
    _M_saved_available = true;
    ret = y * mult;
}
There's no requirement that the distribution call the underlying generator once for each value. Some distributions are best calculated by combining multiple random values.
For example, in the GNU implementation, the implemantation of the uniform distribution is
return (__aurng() * (__p.b() - __p.a())) + __p.a();
calling the generator __aurng once; while the core of the normal distribution is:
do
{
    __x = result_type(2.0) * __aurng() - 1.0;
    __y = result_type(2.0) * __aurng() - 1.0;
    __r2 = __x * __x + __y * __y;
}
while (__r2 > 1.0 || __r2 == 0.0);
calling it at least twice.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With