Generating pseudo-random 16-bit integers

9

I need to generate 16-bit pseudo-random integers and I am wondering what the best choice is.

The obvious way that comes in my mind is something as follows:

std::random_device rd;
auto seed_data = std::array<int, std::mt19937::state_size> {};
std::generate(std::begin(seed_data), std::end(seed_data), std::ref(rd));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));
std::mt19937 generator(seq);
std::uniform_int_distribution<short> dis(std::numeric_limits<short>::min(), 
                                         std::numeric_limits<short>::max());

short n = dis(generator);

The problem I see here is that std::mt19937 produces 32-bit unsigned integers since it's defined as this:

using mt19937 = mersenne_twister_engine<unsigned int, 
                                        32, 624, 397, 
                                        31, 0x9908b0df,
                                        11, 0xffffffff, 
                                        7, 0x9d2c5680, 
                                        15, 0xefc60000, 
                                        18, 1812433253>;

That means static casting is done and only the least significant part of these 32-bit integers is used by the distribution. So I am wondering how good are these series of pseudo-random shorts and I don't have the mathematical expertise to answer that.

I expect that a better solution would be to use your own defined mersenne_twister_engine engine for 16-bit integers. However, I haven't found any mentioned set for the template arguments (requirements can be found here for instance). Are there any?

UPDATE: I updated the code sample with proper initialization for the distribution.

c++
c++11
random
mersenne-twister
asked on Stack Overflow Jan 9, 2019 by Marius Bancila • edited Jan 10, 2019 by Marius Bancila

2 Answers

8

Your way is indeed the correct way.

The mathematical arguments are complex (I'll try to dig out a paper), but taking the least significant bits of the Mersenne Twister, as implemented by the C++ standard library, is the correct thing to do.

If you're in any doubt as to the quality of the sequence, then run it through the diehard tests.

answered on Stack Overflow Jan 9, 2019 by Bathsheba • edited Jan 9, 2019 by Bathsheba
2

There may be a misconception, considering this quote from OP's question (emphasis mine):

The problem I see here is that std::mt19937 produces 32-bit unsigned integers […]. That means static casting is done and only the least significant part of these 32-bit integers is used by the distribution.

That's not how it works.

The following are quotes from https://en.cppreference.com/w/cpp/numeric/random

The random number library provides classes that generate random and pseudo-random numbers. These classes include:

  • Uniform random bit generators (URBGs), […];
  • Random number distributions (e.g. uniform, normal, or poisson distributions) which convert the output of URBGs into various statistical distributions

URBGs and distributions are designed to be used together to produce random values.

So a uniform random bit generator, like mt19937 or random_device

is a function object returning unsigned integer values such that each value in the range of possible results has (ideally) equal probability of being returned.

While a random number distribution, like uniform_int_distribution

post-processes the output of a URBG in such a way that resulting output is distributed according to a defined statistical probability density function.

The way it's done uses all the bits from the source to produce an output. As an example, we can look at the implementation of std::uniform_distribution in libstdc++ (starting at line 824), which can be roughly simplified as

template <typename Type>
class uniform_distribution
{
    Type a_ = 0, b_ = std::numeric_limits<Type>::max();
public:
    uniform_distribution(Type a, Type b) : a_{a}, b_{b} {}
    template<typename URBG>
    Type operator() (URBG &gen)
    {
        using urbg_type = std::make_unsigned_t<typename URBG::result_type>;
        using u_type    = std::make_unsigned_t<Type>;
        using max_type  = std::conditional_t<(sizeof(urbg_type) > sizeof(u_type))
                                            , urbg_type, u_type>;

        urbg_type urbg_min = gen.min();
        urbg_type urbg_max = gen.max();
        urbg_type urbg_range = urbg_max - urbg_min;

        max_type urange = b_ - a_;
        max_type udenom = urbg_range <= urange ? 1 : urbg_range / (urange + 1);

        Type ret;
        // Note that the calculation may require more than one call to the generator
        do
            ret = (urbg_type(gen()) - urbg_min ) / udenom;
            // which is 'ret = gen / 65535' with OP's parameters
            // not a simple cast or bit shift
        while (ret > b_ - a_);
        return ret + a_;
    }
};

This could be tested HERE.

answered on Stack Overflow Jan 11, 2019 by Bob__ • edited Jan 11, 2019 by Bob__

User contributions licensed under CC BY-SA 3.0