Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Representing a uniform distribution over the range of an arbitrary enum type

I use the C++ random number utility library in quite a few places. It might not be perfectly comfortable (e.g. no base class for an arbitrary distribution), but - I've learned to live with it.

Now I happen to need to uniformly sample values from an enumerated type. I know, there's a question on that on SO already:

generating random enums

however, that one:

  1. Assumes all enum values are contiguous, i.e. it won't work for

    enum Color { Red = 1, Green = 2, Blue = 4 }
    

    where we want each of these three values to be sampled with probability 1/3.

  2. Does not provide the functionality of std::uniform_distribution<>, i.e. it doesn't work with a random engine you pass it and so on.

Obviously I can't use std::uniform_int_distribution<Color>, if only for reason 1 above. What should I do instead?

Notes:

  • The code must be generic, i.e. the enum type would be a template parameter.
  • Since it is likely I would need some instrumentation over just the rough enum, you may assume I have it; just state your assumption explicitly.
  • Specifically, and if it helps, suppose I use Better Enums, making me fully decked out with all the bells and whistles.
  • If there's somehow an idiomatic way of doing this not involving any such instrumentation, that would make for a great answer, but I doubt it.
  • C++11/14-only solutions are acceptable.
  • Multiple enum identifiers with the same value do not get double the frequency, they're just aliases of each other. If you have a simple solution assuming these do not exist, that would also be relevant, though suboptimal.
like image 878
einpoklum Avatar asked Aug 15 '16 13:08

einpoklum


3 Answers

With use of Better Enums, this problem may be resolved this way:

template<typename T>
typename T get_uniform_value(std::default_random_engine& eng)
{
    std::uniform_int_distribution<int> dist(0, T::_size() - 1);
    return T::_values()[dist(eng)];
}

Usage example:

BETTER_ENUM(Channel, int, Red, Green = 2, Blue) // Enum to generate random values of
...
std::default_random_engine rng(std::random_device{}());
Channel r = get_uniform_value<Channel>(rng); // Uniformly distributed between 0, 2 and 3
like image 124
alexeykuzmin0 Avatar answered Nov 14 '22 15:11

alexeykuzmin0


Here's three implementations of a distribution, in order of ascending complexity:

First, if we can rely on values being distinct or are OK with repeat values being overweighted we can just index the _values() container:

template<class Enum>
struct SimpleEnumDistribution
{
    std::uniform_int_distribution<typename Enum::_integral> dist{0, Enum::_size() - 1};
    template<class Generator> Enum operator()(Generator& g) { return Enum::_values()[dist(g)]; }
};

Otherwise, we can use rejection sampling, precalculating the min and max of the range of enum values:

template<class Enum>
struct UniformEnumDistribution
{
    std::uniform_int_distribution<typename Enum::_integral> dist{
        *std::min_element(Enum::_values().begin(), Enum::_values().end()),
        *std::max_element(Enum::_values().begin(), Enum::_values().end())};
    template<class Generator> Enum operator()(Generator& g)
    {
        for (;;)
            if (auto value = Enum::_from_integral_nothrow(dist(g)))
                return *value;
    }
};

If this would be inefficient (perhaps the enum values are sparse) we can compute a lookup table on initialization:

template<class Enum>
struct FastUniformEnumDistribution
{
    std::uniform_int_distribution<std::size_t> dist;
    std::array<typename Enum::_integral, Enum::_size()> values;
    FastUniformEnumDistribution()
    {
        std::copy(Enum::_values().begin(), Enum::_values().end(), values.data());
        std::sort(values.begin(), values.end());
        dist.param(std::uniform_int_distribution<std::size_t>::param_type{0u, static_cast<std::size_t>(
            std::distance(values.begin(), std::unique(values.begin(), values.end())) - 1)});
    }
    template<class Generator> Enum operator()(Generator& g)
    {
        return Enum::_from_integral_unchecked(values[dist(g)]);
    }
};

Example.

like image 2
ecatmur Avatar answered Nov 14 '22 17:11

ecatmur


I would say the more idiomatic would be to create an array and choosing index from the array:

 template <typename Rnd>
 Color RandomColor(Rnd& rnd)
 {
     const std::array<Color, 3u> colors {Color::Red, Color::Green, Color::Blue};

     std::uniform_int_distribution<int> dist(0, colors.size() - 1);
     return colors[dist(rnd)];
 }

Better Enums seems to allow to not create the array manually with Color::_values:

 template <typename BetterEnum, typename Rnd>
 BetterEnum RandomBetterEnum(Rnd& rnd)
 {
     std::uniform_int_distribution<int> dist(0, BetterEnum::_size() - 1);
     return BetterEnum::_values()[dist(rnd)];
 }
like image 1
Jarod42 Avatar answered Nov 14 '22 15:11

Jarod42