
Non-reproducible random numbers using `<random>`

Tags: c++, random

I am trying to create a class that produces random numbers for multiple distributions, while keeping them reproducible (by setting an initial seed).

The code seems to work until I start using the normal distribution, at which point odd behavior surfaces. Mainly:

  • If I uncomment the line double a = rnd.rnorm(0.0, 1.0); (line 40), that is, if I call rnorm before setting a seed, the first normal random number no longer matches; the numbers after it match again.
  • If I draw an odd number of values from the normal distribution (for example by setting line 39 to int n = 3;), the normal random numbers are shifted by one.
  • If I do both at once, the normal random numbers are shifted by one in the other direction (a lead instead of a lag).

What causes this behavior? Have I implemented the RNG class incorrectly? And most importantly, how can I fix it?

Code

If you want to test the results yourself you can use this http://cpp.sh/9phre

or this

#include <stdio.h>
#include <random>
#include <time.h>   // time(), used for the default seed
// Class to create random numbers 
// Main functions to set the seed: setseed()
// create uniformly distributed values: runif()
// and normally distributed values: rnorm()
class RNG {
public:
    RNG(int seed = (int) time(0)) {
        setseed(seed);
    };
    ~RNG() {};
    void setseed(int newSeed) {
        re.seed(newSeed);
    };

    double runif(double minNum, double maxNum) {
        return dud(re, distUnifDbl::param_type{minNum, maxNum});
    };
    double rnorm(double mu, double sd) {
        return dnd(re, distNormDbl::param_type{mu, sd});
    };

private:
    // take the Mersenne-Twister Engine
    std::mt19937 re {};
    // create the uniform distribution
    using distUnifDbl = std::uniform_real_distribution<double>;
    distUnifDbl dud {};
    // create the normal distribution
    using distNormDbl = std::normal_distribution<double>;
    distNormDbl dnd {};

};

int main(int argc, char const *argv[]) {
    RNG rnd;
    int n = 4; // setting n to an odd number, makes _all_ normal numbers non-reproducible
    //double a = rnd.rnorm(0.0, 1.0); // uncommenting this, makes the _first_ normal number non-reproducible

    printf("Testing some Uniform Numbers\n");
    rnd.setseed(123);
    for (int i = 0; i < n; ++i) {
        printf("% 13.10f ", rnd.runif(0.0, 1.0));
    }
    rnd.setseed(123);
    printf("\n");
    for (int i = 0; i < n; ++i) {
        printf("% 13.10f ", rnd.runif(0.0, 1.0));
    }
    printf("\n");

    printf("\nTesting some Normal Numbers\n");
    rnd.setseed(123);
    for (int i = 0; i < n; ++i) {
        printf("% 13.10f ", rnd.rnorm(0.0, 1.0));
    }
    rnd.setseed(123);
    printf("\n");
    for (int i = 0; i < n; ++i) {
        printf("% 13.10f ", rnd.rnorm(0.0, 1.0));
    }
    printf("\n");
    return 0;
}

Results

Base-case

When setting n = 4 and leaving a commented, I receive the following (which is exactly what I want/need; reproducible "random" numbers):

Testing some Uniform Numbers
 0.7129553216  0.4284709250  0.6908848514  0.7191503089 
 0.7129553216  0.4284709250  0.6908848514  0.7191503089 

Testing some Normal Numbers
-0.5696096995  1.6958337120  1.1108714913  0.9675940713 
-0.5696096995  1.6958337120  1.1108714913  0.9675940713 

Error 1

Now for the errors. Setting n = 5 (or any odd number), I receive:

Testing some Uniform Numbers
 0.7129553216  0.4284709250  0.6908848514  0.7191503089  0.4911189328 
 0.7129553216  0.4284709250  0.6908848514  0.7191503089  0.4911189328 

Testing some Normal Numbers
-0.5696096995  1.6958337120  1.1108714913  0.9675940713  1.5213608069 
-0.0482498863 -0.5696096995  1.6958337120  1.1108714913  0.9675940713 

This shifts all normal numbers in the second run by one position. The uniform numbers stay intact (which is good, I guess).

Error 2

Uncommenting the one line (i.e., calling rnd.rnorm(0.0, 1.0) once before setting the seeds) leads to the following output (with n = 4 or any other even number):

Testing some Uniform Numbers
 0.7129553216  0.4284709250  0.6908848514  0.7191503089 
 0.7129553216  0.4284709250  0.6908848514  0.7191503089 

Testing some Normal Numbers
 0.9761557076 -0.5696096995  1.6958337120  1.1108714913 
 0.9675940713 -0.5696096995  1.6958337120  1.1108714913 

This breaks only the first normal random number, again leaving the uniform numbers intact.

Error 3

Combining the two (leaving the line uncommented and setting n to an odd number), I get this:

Testing some Uniform Numbers
 0.7129553216  0.4284709250  0.6908848514  0.7191503089  0.4911189328 
 0.7129553216  0.4284709250  0.6908848514  0.7191503089  0.4911189328 

Testing some Normal Numbers
-0.4553400276 -0.5696096995  1.6958337120  1.1108714913  0.9675940713 
-0.5696096995  1.6958337120  1.1108714913  0.9675940713  1.5213608069 

Now the second row of normal random numbers is shifted by one position in the other direction (it leads instead of lagging).

System spec

I am running this on Ubuntu 16.04. The output of g++ --version is: g++ (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

Updates

It doesn't seem to be connected to the specific generator: replacing std::mt19937 re {}; with std::linear_congruential_engine<std::uint_fast32_t, 48271, 0, 2147483647> re {}; or with std::subtract_with_carry_engine<std::uint_fast64_t, 48, 5, 12> re {}; results in the same behavior (but obviously with different numbers).

asked Aug 29 '17 10:08 by David


2 Answers

void setseed(int newSeed) {
    re.seed(newSeed);
    dud.reset(); // <---- clear any state cached by the distributions
    dnd.reset();
}

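Distributions have internal state. You need to reset it in order to get the same sequence again. In libstdc++, for example, normal_distribution generates values in pairs and caches the second one, so after an odd number of draws a leftover value survives re-seeding the engine and is returned first. That matches the one-position shifts seen in the question.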

answered Nov 15 '22 12:11 by n. 1.8e9-where's-my-share m.


If reproducible "random" numbers are something you care about, you should avoid C++ distributions, including uniform_real_distribution and normal_distribution, and instead rely on your own way to transform the pseudorandom numbers from mt19937 into the numbers you desire. (For many ways to do so, see my page on sampling methods. Note that there are other things to consider when reproducibility is important.)

C++ distribution classes, such as uniform_real_distribution, have no standard implementation. As a result, even if the same seed is passed to the engine feeding these distributions, the sequence of numbers they deliver can vary depending on how the distributions are implemented. Note that it's not the "compiler", the "operating system", or the "architecture" that decides which algorithm is used; it is the C++ standard library implementation that decides. See also this question.

On the other hand, random engines such as mt19937 do have a guaranteed implementation; they will return the same pseudorandom numbers for the same seed, even across runs, in all compliant C++ library implementations (including those of different "architectures"). The exception is default_random_engine, whose underlying engine is implementation-defined.

See also this question: Generate the same sequence of random numbers in C++ from a given seed.

answered Nov 15 '22 12:11 by Peter O.