
Weighted random numbers

Tags: c++, random, boost

I'm trying to implement weighted random numbers. I'm currently just banging my head against the wall and cannot figure this out.

In my project (Hold'em hand ranges, subjective all-in equity analysis), I'm using Boost's random functions. Say I want to pick a random number between 1 and 3 (so either 1, 2 or 3). Boost's Mersenne Twister generator works like a charm for this. However, I want the pick to be weighted, for example like this:

1 (weight: 90)
2 (weight: 56)
3 (weight:  4)

Does Boost have some sort of functionality for this?

nhaa123 asked Nov 19 '09 07:11



2 Answers

There is a straightforward algorithm for picking an item at random, where items have individual weights:

1) calculate the sum of all the weights

2) pick a random number that is 0 or greater and is less than the sum of the weights

3) go through the items one at a time, subtracting their weight from your random number, until you get the item where the random number is less than that item's weight

Pseudo-code illustrating this:

    // random(n) stands for any function returning a uniform integer in [0, n)
    int sum_of_weight = 0;
    for (int i = 0; i < num_choices; i++) {
        sum_of_weight += choice_weight[i];
    }
    int rnd = random(sum_of_weight);
    for (int i = 0; i < num_choices; i++) {
        if (rnd < choice_weight[i])
            return i;
        rnd -= choice_weight[i];
    }
    assert(!"should never get here");

This should be straightforward to adapt to your boost containers and such.
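As a rough illustration of that adaptation, here is a minimal sketch of the same three steps using the C++11 `<random>` header rather than Boost. The function name `pick_weighted` is made up for this example, and the code assumes non-negative integer weights with a positive total.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: returns an index in [0, weights.size()), chosen with
// probability proportional to weights[i].
// Assumes non-negative weights whose sum is positive.
template <class Generator>
std::size_t pick_weighted(const std::vector<int>& weights, Generator& gen)
{
    // 1) Sum of all the weights.
    const int sum_of_weight = std::accumulate(weights.begin(), weights.end(), 0);
    assert(sum_of_weight > 0);

    // 2) Uniform integer in [0, sum_of_weight - 1] (both bounds inclusive).
    std::uniform_int_distribution<int> dist(0, sum_of_weight - 1);
    int rnd = dist(gen);

    // 3) Subtract weights until rnd falls inside the current item's share.
    for (std::size_t i = 0; i < weights.size(); ++i) {
        if (rnd < weights[i])
            return i;
        rnd -= weights[i];
    }
    assert(!"should never get here");
    return weights.size() - 1;  // only reachable if asserts are compiled out
}
```

With a `std::mt19937 gen` and the weights `{90, 56, 4}`, `pick_weighted(weights, gen) + 1` would yield 1, 2 or 3 in roughly the requested proportions.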


If your weights rarely change but you often pick at random, and your container either stores pointers to the objects or holds more than a few dozen items, then there is an optimisation (you have to profile to know whether it helps or hinders):

By storing the cumulative weight sum in each item you can use a binary search to pick the item corresponding to the pick weight.
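One way this could look is sketched below. The class name `WeightedPicker` is hypothetical, and the cumulative sums are kept in a separate vector rather than inside each item, which is a simplification for the example. It again assumes non-negative integer weights with a positive total.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper class: precomputes the cumulative weights once, then
// answers each pick with a binary search instead of a linear scan.
class WeightedPicker {
public:
    explicit WeightedPicker(const std::vector<int>& weights)
        : cumulative_(weights.size())
    {
        // cumulative_[i] = weights[0] + ... + weights[i]
        std::partial_sum(weights.begin(), weights.end(), cumulative_.begin());
        assert(!cumulative_.empty() && cumulative_.back() > 0);
    }

    template <class Generator>
    std::size_t operator()(Generator& gen) const
    {
        // Uniform integer in [0, total - 1].
        std::uniform_int_distribution<int> dist(0, cumulative_.back() - 1);
        const int rnd = dist(gen);
        // The first cumulative sum strictly greater than rnd marks the pick.
        const auto it =
            std::upper_bound(cumulative_.begin(), cumulative_.end(), rnd);
        return static_cast<std::size_t>(it - cumulative_.begin());
    }

private:
    std::vector<int> cumulative_;
};
```

Construction is linear in the number of items and each pick is logarithmic, so this only pays off when picks greatly outnumber weight changes.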


If you do not know the number of items in the list, then there's a very neat algorithm called reservoir sampling that can be adapted to be weighted.
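A minimal sketch of one such adaptation, for picking a single item in one pass: keep a running total of the weights seen so far, and replace the current choice with probability (current weight / running total). The function name `reservoir_pick` is an assumption for this example, and the sketch skips non-positive weights.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Hypothetical helper: picks one index from a sequence of weights in a single
// pass, without knowing the length or the total weight in advance.
// Assumes at least one weight is positive.
template <class InputIt, class Generator>
std::size_t reservoir_pick(InputIt first, InputIt last, Generator& gen)
{
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    double total = 0.0;      // sum of the weights seen so far
    std::size_t chosen = 0;  // index of the current pick
    std::size_t index = 0;
    for (; first != last; ++first, ++index) {
        const double w = *first;
        if (w <= 0.0)
            continue;  // items without weight can never be picked
        total += w;
        // Replace the current pick with probability w / total.
        if (unit(gen) * total < w)
            chosen = index;
    }
    assert(total > 0.0);
    return chosen;
}
```

Item i survives to the end with probability w_i / T_i times the product of (1 - w_j / T_j) over all later items j, where T_k is the running total after item k. That product telescopes to w_i divided by the final total, which is the desired weighting.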

Will answered Dec 26 '22 18:12


Updated answer to an old question. You can easily do this in C++11 with just the std::lib:

    #include <iostream>
    #include <random>
    #include <iterator>
    #include <ctime>
    #include <type_traits>
    #include <cassert>

    int main()
    {
        // Set up distribution
        double interval[] = {1,   2,   3,   4};
        double weights[] =  {  .90, .56, .04};
        std::piecewise_constant_distribution<> dist(std::begin(interval),
                                                    std::end(interval),
                                                    std::begin(weights));
        // Choose generator
        std::mt19937 gen(std::time(0));  // seed as wanted
        // Demonstrate with N randomly generated numbers
        const unsigned N = 1000000;
        // Collect number of times each random number is generated
        double avg[std::extent<decltype(weights)>::value] = {0};
        for (unsigned i = 0; i < N; ++i)
        {
            // Generate random number using gen, distributed according to dist
            unsigned r = static_cast<unsigned>(dist(gen));
            // Sanity check
            assert(interval[0] <= r && r <= *(std::end(interval)-2));
            // Save r for statistical test of distribution
            avg[r - 1]++;
        }
        // Compute averages for distribution
        for (double* i = std::begin(avg); i < std::end(avg); ++i)
            *i /= N;
        // Display distribution
        for (unsigned i = 1; i <= std::extent<decltype(avg)>::value; ++i)
            std::cout << "avg[" << i << "] = " << avg[i-1] << '\n';
    }

Output on my system:

    avg[1] = 0.600115
    avg[2] = 0.373341
    avg[3] = 0.026544

Note that most of the code above is devoted to displaying and analyzing the output; the actual generation is just a few lines of code. The output demonstrates that the requested "probabilities" have been obtained, once you divide the requested weights by 1.5, since that is what they add up to (0.90 + 0.56 + 0.04 = 1.5). That gives 0.90 / 1.5 = 0.6, 0.56 / 1.5 ≈ 0.373 and 0.04 / 1.5 ≈ 0.027.
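When the goal is to pick one of a few discrete values, as in the question, `std::discrete_distribution` (also part of C++11 `<random>`) may be a more direct fit, since it yields a zero-based index directly and normalises the weights itself. The sketch below is an alternative to the code above, not a restatement of it, and the helper name `count_draws` is made up for this example.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper: draws n values from {1, 2, 3} with weights 90 : 56 : 4
// and returns how often each value came up.
template <class Generator>
std::vector<unsigned> count_draws(unsigned n, Generator& gen)
{
    // The weights need not sum to 1; the distribution normalises them.
    std::discrete_distribution<int> dist{90.0, 56.0, 4.0};
    std::vector<unsigned> counts(3, 0);
    for (unsigned i = 0; i < n; ++i) {
        const int value = dist(gen) + 1;  // shift the index 0..2 to 1..3
        assert(value >= 1 && value <= 3);
        ++counts[value - 1];
    }
    return counts;
}
```

Dividing each count by `n` should give frequencies close to 0.6, 0.373 and 0.027, matching the normalised weights.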

Howard Hinnant answered Dec 26 '22 20:12