Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Same random numbers in C++ as computed by Python3 numpy.random.rand

Tags:

c++

python

random

I would like to duplicate in C++ the testing for some code that has already been implemented in Python3 which relies on numpy.random.rand and randn values and a specific seed (e.g., seed = 1).

I understand that Python's random implementation is based on a Mersenne twister. The C++ standard library also supplies this in std::mersenne_twister_engine.

The C++ version returns an unsigned int, whereas Python rand is a floating point value.

Is there a way to obtain the same values in C++ as are generated in Python, and be sure that they are the same? And the same for an array generated by randn ?

like image 295
Madeleine P. Vincent Avatar asked Jan 24 '21 14:01

Madeleine P. Vincent


2 Answers

You can do it this way for integer values:

import numpy as np

np.random.seed(12345)
print(np.random.randint(256**4, dtype='<u4', size=1)[0])
#include <iostream>
#include <random>

int main()
{
    std::mt19937 e2(12345);
    std::cout << e2() << std::endl;
}

The result of both snippets is 3992670690


By looking at source code of rand you can implement it in your C++ code this way:

import numpy as np

np.random.seed(12345)
print(np.random.rand())
#include <iostream>
#include <iomanip>
#include <random>

int main()
{
    std::mt19937 e2(12345);
    int a = e2() >> 5;
    int b = e2() >> 6;
    double value = (a * 67108864.0 + b) / 9007199254740992.0;
    std::cout << std::fixed << std::setprecision(16) << value << std::endl;
}

Both random values are 0.9296160928171479


It would be convenient to use std::generate_canonical, but it uses another method to convert the output of Mersenne twister to double. The reason they differ is likely that generate_canonical is more optimized than the random generator used in NumPy, as it avoids costly floating point operations, especially multiplication and division, as seen in source code. However it seems to be implementation dependent, while NumPy produces the same result on all platforms.

double value = std::generate_canonical<double, std::numeric_limits<double>::digits>(e2);

This doesn't work and produces result 0.8901547132827379, which differs from the output of Python code.

like image 149
fdermishin Avatar answered Oct 02 '22 14:10

fdermishin


For completeness and to avoid re-inventing the wheel, here is an implementation for both numpy.rand and numpy.randn in C++

The header file:

#ifndef RANDOMNUMGEN_NUMPYCOMPATIBLE_H
#define RANDOMNUMGEN_NUMPYCOMPATIBLE_H

#include "RandomNumGenerator.h"
    
//Uniform distribution - numpy.rand
class RandomNumGen_NumpyCompatible {
public:
    RandomNumGen_NumpyCompatible();
    RandomNumGen_NumpyCompatible(std::uint_fast32_t newSeed);

    std::uint_fast32_t min() const { return m_mersenneEngine.min(); }
    std::uint_fast32_t max() const { return m_mersenneEngine.max(); }
    void seed(std::uint_fast32_t seed);
    void discard(unsigned long long);      // NOTE!!  Advances and discards twice as many values as passed in to keep tracking with Numpy order
    uint_fast32_t operator()();            //Simply returns the next Mersenne value from the engine
    double getDouble();                    //Calculates the next uniformly random double as numpy.rand does

    std::string getGeneratorType() const { return "RandomNumGen_NumpyCompatible"; }

private:
    std::mt19937 m_mersenneEngine;
};

///////////////////

//Gaussian distribution - numpy.randn
class GaussianRandomNumGen_NumpyCompatible {
public:
    GaussianRandomNumGen_NumpyCompatible();
    GaussianRandomNumGen_NumpyCompatible(std::uint_fast32_t newSeed);

    std::uint_fast32_t min() const { return m_mersenneEngine.min(); }
    std::uint_fast32_t max() const { return m_mersenneEngine.max(); }
    void seed(std::uint_fast32_t seed);
    void discard(unsigned long long);      // NOTE!!  Advances and discards twice as many values as passed in to keep tracking with Numpy order
    uint_fast32_t operator()();            //Simply returns the next Mersenne value from the engine
    double getDouble();                    //Calculates the next normally (Gaussian) distrubuted random double as numpy.randn does

    std::string getGeneratorType() const { return "GaussianRandomNumGen_NumpyCompatible"; }

private:
    bool m_haveNextVal;
    double m_nextVal;
    std::mt19937 m_mersenneEngine;
};

#endif

And the implementation:

#include "RandomNumGen_NumpyCompatible.h"

RandomNumGen_NumpyCompatible::RandomNumGen_NumpyCompatible()
{
}

RandomNumGen_NumpyCompatible::RandomNumGen_NumpyCompatible(std::uint_fast32_t seed)
: m_mersenneEngine(seed)
{
}

void RandomNumGen_NumpyCompatible::seed(std::uint_fast32_t newSeed)
{
    m_mersenneEngine.seed(newSeed);
}

void RandomNumGen_NumpyCompatible::discard(unsigned long long z)
{
    //Advances and discards TWICE as many values to keep with Numpy order
    m_mersenneEngine.discard(2*z);
}

std::uint_fast32_t RandomNumGen_NumpyCompatible::operator()()
{
    return m_mersenneEngine();
}

double RandomNumGen_NumpyCompatible::getDouble()
{
    int a = m_mersenneEngine() >> 5;
    int b = m_mersenneEngine() >> 6;
    return (a * 67108864.0 + b) / 9007199254740992.0;
}

///////////////////

GaussianRandomNumGen_NumpyCompatible::GaussianRandomNumGen_NumpyCompatible()
: m_haveNextVal(false)
{
}

GaussianRandomNumGen_NumpyCompatible::GaussianRandomNumGen_NumpyCompatible(std::uint_fast32_t seed)
: m_haveNextVal(false), m_mersenneEngine(seed)
{
}

void GaussianRandomNumGen_NumpyCompatible::seed(std::uint_fast32_t newSeed)
{
    m_mersenneEngine.seed(newSeed);
}

void GaussianRandomNumGen_NumpyCompatible::discard(unsigned long long z)
{
    //Burn some CPU cyles here
    for (unsigned i = 0; i < z; ++i)
        getDouble();
}

std::uint_fast32_t GaussianRandomNumGen_NumpyCompatible::operator()()
{
    return m_mersenneEngine();
}

double GaussianRandomNumGen_NumpyCompatible::getDouble()
{
    if (m_haveNextVal) {
        m_haveNextVal = false;
        return m_nextVal;
    }

    double f, x1, x2, r2;
    do {
        int a1 = m_mersenneEngine() >> 5;
        int b1 = m_mersenneEngine() >> 6;
        int a2 = m_mersenneEngine() >> 5;
        int b2 = m_mersenneEngine() >> 6;
        x1 = 2.0 * ((a1 * 67108864.0 + b1) / 9007199254740992.0) - 1.0;
        x2 = 2.0 * ((a2 * 67108864.0 + b2) / 9007199254740992.0) - 1.0;
        r2 = x1 * x1 + x2 * x2;
    } while (r2 >= 1.0 || r2 == 0.0);

    /* Box-Muller transform */
    f = sqrt(-2.0 * log(r2) / r2);
    m_haveNextVal = true;
    m_nextVal = f * x1;
    return f * x2;
}
like image 26
Madeleine P. Vincent Avatar answered Oct 02 '22 14:10

Madeleine P. Vincent