Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generating pseudo-random 16-bit integers

I need to generate 16-bit pseudo-random integers and I am wondering what the best choice is.

The obvious way that comes in my mind is something as follows:

std::random_device rd;
auto seed_data = std::array<int, std::mt19937::state_size> {};
std::generate(std::begin(seed_data), std::end(seed_data), std::ref(rd));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
std::mt19937 generator(seq);
std::uniform_int_distribution<short> dis(std::numeric_limits<short>::min(), 
                                         std::numeric_limits<short>::max());

short n = dis(generator);

The problem I see here is that std::mt19937 produces 32-bit unsigned integers since it's defined as this:

using mt19937 = mersenne_twister_engine<unsigned int, 
                                        32, 624, 397, 
                                        31, 0x9908b0df,
                                        11, 0xffffffff, 
                                        7, 0x9d2c5680, 
                                        15, 0xefc60000, 
                                        18, 1812433253>;

That means static casting is done and only the least significant part of these 32-bit integers is used by the distribution. So I am wondering how good are these series of pseudo-random shorts and I don't have the mathematical expertise to answer that.

I expect that a better solution would be to use your own defined mersenne_twister_engine engine for 16-bit integers. However, I haven't found any mentioned set for the template arguments (requirements can be found here for instance). Are there any?

UPDATE: I updated the code sample with proper initialization for the distribution.

like image 241
Marius Bancila Avatar asked Jan 09 '19 13:01

Marius Bancila


People also ask

What generates pseudorandom integer?

The seed() function will seed the pseudorandom number generator, taking an integer value as an argument, such as 1 or 7. If the seed() function is not called prior to using randomness, the default is to use the current system time in milliseconds from epoch (1970).

What is pseudocode random number generator?

A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers.

How do you generate pseudo random numbers in C++?

The srand() function in C++ can perform pseudo-random number calculation. This function requires a seed value which forms the basis of computation of random numbers. With the help of the seed value, srand() sets the stage for the generation of pseudo-random numbers by the rand() function. And Voila!


2 Answers

Your way is indeed the correct way.

The mathematical arguments are complex (I'll try to dig out a paper), but taking the least significant bits of the Mersenne Twister, as implemented by the C++ standard library, is the correct thing to do.

If you're in any doubt as to the quality of the sequence, then run it through the diehard tests.

like image 162
Bathsheba Avatar answered Oct 12 '22 11:10

Bathsheba


There may be a misconception, considering this quote from OP's question (emphasis mine):

The problem I see here is that std::mt19937 produces 32-bit unsigned integers […]. That means static casting is done and only the least significant part of these 32-bit integers is used by the distribution.

That's not how it works.

The following are quotes from https://en.cppreference.com/w/cpp/numeric/random

The random number library provides classes that generate random and pseudo-random numbers. These classes include:

  • Uniform random bit generators (URBGs), […];
  • Random number distributions (e.g. uniform, normal, or poisson distributions) which convert the output of URBGs into various statistical distributions

URBGs and distributions are designed to be used together to produce random values.

So a uniform random bit generator, like mt19937 or random_device

is a function object returning unsigned integer values such that each value in the range of possible results has (ideally) equal probability of being returned.

While a random number distribution, like uniform_int_distribution

post-processes the output of a URBG in such a way that resulting output is distributed according to a defined statistical probability density function.

The way it's done uses all the bits from the source to produce an output. As an example, we can look at the implementation of std::uniform_distribution in libstdc++ (starting at line 824), which can be roughly simplified as

template <typename Type>
class uniform_distribution
{
    Type a_ = 0, b_ = std::numeric_limits<Type>::max();
public:
    uniform_distribution(Type a, Type b) : a_{a}, b_{b} {}
    template<typename URBG>
    Type operator() (URBG &gen)
    {
        using urbg_type = std::make_unsigned_t<typename URBG::result_type>;
        using u_type    = std::make_unsigned_t<Type>;
        using max_type  = std::conditional_t<(sizeof(urbg_type) > sizeof(u_type))
                                            , urbg_type, u_type>;

        urbg_type urbg_min = gen.min();
        urbg_type urbg_max = gen.max();
        urbg_type urbg_range = urbg_max - urbg_min;

        max_type urange = b_ - a_;
        max_type udenom = urbg_range <= urange ? 1 : urbg_range / (urange + 1);

        Type ret;
        // Note that the calculation may require more than one call to the generator
        do
            ret = (urbg_type(gen()) - urbg_min ) / udenom;
            // which is 'ret = gen / 65535' with OP's parameters
            // not a simple cast or bit shift
        while (ret > b_ - a_);
        return ret + a_;
    }
};

This could be tested HERE.

like image 36
Bob__ Avatar answered Oct 12 '22 10:10

Bob__