
Generating pseudo-random 16-bit integers

Tags: c++, random, c++11, mersenne-twister

I need to generate 16-bit pseudo-random integers and I am wondering what the best choice is.

The obvious approach that comes to mind is something like the following:

std::random_device rd;
auto seed_data = std::array<int, std::mt19937::state_size> {};
std::generate(std::begin(seed_data), std::end(seed_data), std::ref(rd));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
std::mt19937 generator(seq);
std::uniform_int_distribution<short> dis(std::numeric_limits<short>::min(), 
                                         std::numeric_limits<short>::max());

short n = dis(generator);

The problem I see here is that std::mt19937 produces 32-bit unsigned integers, since it is defined like this:

using mt19937 = mersenne_twister_engine<unsigned int, 
                                        32, 624, 397, 
                                        31, 0x9908b0df,
                                        11, 0xffffffff, 
                                        7, 0x9d2c5680, 
                                        15, 0xefc60000, 
                                        18, 1812433253>;

That means static casting is done and only the least significant part of these 32-bit integers is used by the distribution. So I am wondering how good this series of pseudo-random shorts is, and I don't have the mathematical expertise to answer that.

I expect that a better solution would be to define a custom mersenne_twister_engine instantiation for 16-bit integers. However, I haven't found any recommended set of template arguments (requirements can be found here for instance). Are there any?

UPDATE: I updated the code sample with proper initialization for the distribution.

241

asked Jan 09 '19 13:01

Marius Bancila

2 Answers

Your way is indeed the correct way.

The mathematical arguments are complex (I'll try to dig out a paper), but taking the least significant bits of the Mersenne Twister, as implemented by the C++ standard library, is the correct thing to do.
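As a minimal sketch of taking 16 bits per draw explicitly, the standard adaptor std::independent_bits_engine can wrap std::mt19937 so that every call yields a 16-bit unsigned value. The alias mt19937_16 and the helper make_seeded_engine are hypothetical names chosen for illustration, and the sketch assumes std::uint16_t is available as unsigned short. It is one possible approach, not something this answer prescribes.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <functional>
#include <random>

// Hypothetical alias: a 16-bit view of the standard 32-bit Mersenne Twister.
using mt19937_16 = std::independent_bits_engine<std::mt19937, 16, std::uint16_t>;

// Hypothetical helper: seeds the whole mt19937 state from std::random_device,
// in the same way as the code in the question.
inline mt19937_16 make_seeded_engine()
{
    std::random_device rd;
    std::array<std::random_device::result_type, std::mt19937::state_size> seed_data{};
    std::generate(seed_data.begin(), seed_data.end(), std::ref(rd));
    std::seed_seq seq(seed_data.begin(), seed_data.end());
    return mt19937_16(seq);
}
```

An engine built this way reports min() == 0 and max() == 65535, so it can be passed to any distribution just like std::mt19937 itself.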

If you're in any doubt as to the quality of the sequence, then run it through the Diehard tests.
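Before reaching for a full test battery, a quick frequency check can catch gross problems. The sketch below is only a rough sanity check and in no way a substitute for Diehard. The helper name chi_square_top_nibble is hypothetical, and the code assumes short is 16 bits wide. It computes a chi-square statistic over the top 4 bits of each draw.

```cpp
#include <array>
#include <cstddef>
#include <limits>
#include <random>

// Hypothetical helper: chi-square statistic over 16 equally likely buckets,
// where the bucket is the top 4 bits of each 16-bit draw.
// The statistic has 15 degrees of freedom, so for a good generator it
// should usually stay below about 25 (the 5% critical value).
inline double chi_square_top_nibble(std::mt19937& gen, std::size_t samples)
{
    std::uniform_int_distribution<short> dis(std::numeric_limits<short>::min(),
                                             std::numeric_limits<short>::max());
    std::array<std::size_t, 16> counts{};
    for (std::size_t i = 0; i < samples; ++i) {
        // Shift the value into [0, 65535] before extracting the top 4 bits.
        const unsigned shifted = static_cast<unsigned>(dis(gen) + 32768);
        ++counts[shifted >> 12];
    }
    const double expected = static_cast<double>(samples) / counts.size();
    double chi2 = 0.0;
    for (std::size_t c : counts) {
        const double diff = static_cast<double>(c) - expected;
        chi2 += diff * diff / expected;
    }
    return chi2;
}
```

A single large value does not prove the generator is bad, and a small one does not prove it is good. The statistic is only meaningful over many runs with different seeds.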

162

answered Oct 12 '22 11:10

Bathsheba

There may be a misconception, considering this quote from OP's question (emphasis mine):

The problem I see here is that std::mt19937 produces 32-bit unsigned integers […]. That means static casting is done and only the least significant part of these 32-bit integers is used by the distribution.

That's not how it works.

The following are quotes from https://en.cppreference.com/w/cpp/numeric/random

The random number library provides classes that generate random and pseudo-random numbers. These classes include:

Uniform random bit generators (URBGs), […];

Random number distributions (e.g. uniform, normal, or poisson distributions) which convert the output of URBGs into various statistical distributions

URBGs and distributions are designed to be used together to produce random values.

So a uniform random bit generator, like mt19937 or random_device

is a function object returning unsigned integer values such that each value in the range of possible results has (ideally) equal probability of being returned.

While a random number distribution, like uniform_int_distribution

post-processes the output of a URBG in such a way that resulting output is distributed according to a defined statistical probability density function.
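The difference between a distribution and a plain cast can be observed directly. The sketch below runs two identically seeded generators side by side and counts how often std::uniform_int_distribution<short> returns something other than the truncated raw output. The helper name count_mismatches is hypothetical. The standard does not specify the distribution's algorithm, so the exact count is expected to vary between library implementations.

```cpp
#include <cstddef>
#include <limits>
#include <random>

// Hypothetical helper: counts how often the distribution's result differs
// from a truncating cast of the raw output of an identically seeded generator.
// Note: if the distribution rejects a draw and calls the generator again,
// the two generators fall out of step from that point on.
inline std::size_t count_mismatches(unsigned seed, std::size_t draws)
{
    std::mt19937 via_distribution(seed);
    std::mt19937 via_cast(seed);  // same seed, so the same raw sequence
    std::uniform_int_distribution<short> dis(std::numeric_limits<short>::min(),
                                             std::numeric_limits<short>::max());
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < draws; ++i) {
        const short from_distribution = dis(via_distribution);
        const short from_cast = static_cast<short>(via_cast());
        if (from_distribution != from_cast)
            ++mismatches;
    }
    return mismatches;
}
```

A non-zero result from count_mismatches(42u, 1000) indicates that the distribution is doing more than a static cast on that platform.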

The distribution does not simply truncate the generator's output; it maps the generator's whole output range onto the requested interval. As an example, we can look at the implementation of std::uniform_int_distribution in libstdc++ (starting at line 824), which can be roughly simplified as

#include <limits>
#include <type_traits>

template <typename Type>
class uniform_distribution
{
    Type a_ = 0, b_ = std::numeric_limits<Type>::max();
public:
    uniform_distribution(Type a, Type b) : a_{a}, b_{b} {}
    template<typename URBG>
    Type operator() (URBG &gen)
    {
        using urbg_type = std::make_unsigned_t<typename URBG::result_type>;
        using u_type    = std::make_unsigned_t<Type>;
        using max_type  = std::conditional_t<(sizeof(urbg_type) > sizeof(u_type))
                                            , urbg_type, u_type>;

        urbg_type urbg_min = gen.min();
        urbg_type urbg_max = gen.max();
        urbg_type urbg_range = urbg_max - urbg_min;

        // Width of the requested interval, computed in unsigned arithmetic
        max_type urange = max_type(b_) - max_type(a_);
        max_type udenom = urbg_range <= urange ? 1 : urbg_range / (urange + 1);

        // The intermediate result stays wide and unsigned, otherwise the
        // rejection test below could never fail for a narrow Type
        max_type ret;
        // Note that the calculation may require more than one call to the generator
        do
            ret = (urbg_type(gen()) - urbg_min ) / udenom;
            // which is 'ret = gen / 65535' with OP's parameters
            // not a simple cast or bit shift
        while (ret > urange);
        return static_cast<Type>(ret + max_type(a_));
    }
};

This could be tested HERE.
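The arithmetic of the simplified operator() can also be worked through for the question's parameters: a 32-bit generator and the full range of short. The sketch below hard-codes those constants. The helper name scale_to_short is hypothetical, and the code assumes a 16-bit short.

```cpp
#include <cstdint>

// Constants for a 32-bit generator and the full range of a 16-bit short.
constexpr std::uint32_t urbg_range = 0xFFFFFFFFu;                // gen.max() - gen.min()
constexpr std::uint32_t urange     = 65535u;                     // b_ - a_
constexpr std::uint32_t udenom     = urbg_range / (urange + 1u); // 65535

// Hypothetical helper: maps one raw 32-bit draw to a short.
// Returns false when the draw lands in the rejected tail and must be redrawn.
inline bool scale_to_short(std::uint32_t raw, short& out)
{
    const std::uint32_t ret = raw / udenom;  // a division, not a cast or shift
    if (ret > urange)
        return false;                        // ret is 65536 or 65537: reject
    out = static_cast<short>(static_cast<std::int32_t>(ret) - 32768);
    return true;
}
```

With these constants every accepted result has exactly 65535 raw values mapping to it, and only 65536 of the 2^32 possible raw values are rejected, which is why the loop almost always finishes on the first draw.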

answered Oct 12 '22 10:10

Bob__
