I need to distribute a large integer budget randomly among a small array with n elements, so that all elements in the array have the same distribution and sum up to budget, and each element in the array gets at least min.
I have an algorithm that runs in O(budget):
private int[] distribute(int budget, int n, int min) {
    int[] subBudgets = new int[n];
    for (int i = 0; i < n; i++) {
        subBudgets[i] = min;
    }
    budget -= n * min;
    while (budget > 0) {
        subBudgets[random.nextInt(n)]++;
        budget--;
    }
    return subBudgets;
}
However, when budget increases, it can be very expensive. Is there any algorithm that runs in O(n) or even better?
First generate n random numbers x[i], sum them up, and divide budget by the sum to get k. Then assign k*x[i] to each array element. It is simple and O(n).
If you want at least min in each element, you can modify the above algorithm by filling all elements with min (or by using k*x[i] + min) and subtracting n*min from budget before starting.
If you need to work with integers, you can use a real-valued k and round k*x[i]. You then have to track the accumulated rounding error and add it to or subtract it from the calculated value once it reaches a whole unit. You also have to assign whatever remains to the final element you fill, so that the total reaches the whole budget.
P.S.: Note that this algorithm can be used with ease in pure functional languages. That is why I like this whole family of algorithms that generate a random number for each member and then do some processing afterward. Example implementation in Erlang:
-module(budget).
-export([distribute/2, distribute/3]).

distribute(Budget, N) ->
    distribute(Budget, N, 0).

distribute(Budget, N, Min) when
        is_integer(Budget), is_integer(N), N > 0,
        is_integer(Min), Min >= 0, Budget >= N*Min ->
    Xs = [random:uniform() || _ <- lists:seq(1,N) ],
    Rest = Budget - N*Min,
    K = Rest / lists:sum(Xs),
    F = fun(X, {Bgt, Err, Acc}) ->
            Y = X*K + Err,
            Z = round(Y),
            {Bgt - Z, Y - Z, [Z + Min | Acc]}
        end,
    {Bgt, _, T} = lists:foldl(F, {Rest, 0.0, []}, tl(Xs)),
    [Bgt + Min | T].
The same algorithm in Java:
private int[] distribute(int budget, int n, int min) {
    int[] subBudgets = new int[n];
    double[] rands = new double[n];
    double k, err = 0, sum = 0;
    budget -= n * min;
    // Draw one random weight per element and total them.
    for (int i = 0; i < n; i++) {
        rands[i] = random.nextDouble();
        sum += rands[i];
    }
    k = (double) budget / sum;
    // Fill elements 1..n-1, carrying the rounding error forward.
    for (int i = 1; i < n; i++) {
        double y = k * rands[i] + err;
        int z = (int) Math.floor(y + 0.5);
        subBudgets[i] = min + z;
        budget -= z;
        err = y - z;
    }
    // Element 0 receives whatever is left, so the total is exact.
    subBudgets[0] = min + budget;
    return subBudgets;
}
The way that you are currently distributing the dollars left over after min has been given to each subbudget involves performing a fixed number budget of random "trials", where on each trial you randomly select one of n categories, and you want to know how many times each category is selected. This is modeled by a multinomial distribution with the following parameters:
- Number of trials (called n on the WP page): budget
- Number of categories (called k on the WP page): n
- Probability of category i in each trial, for 1 <= i <= n: 1/n
The way you are currently doing it is a good way if the number of trials is around the same size as the number of categories, or less. But if the budget is large, there are other, more efficient ways of sampling from this distribution. The easiest way I know of is to notice that a multinomial distribution with k categories can be repeatedly decomposed into binomial distributions by grouping categories together: instead of directly asking how many selections there are for each of the k categories, we express this as a sequence of questions. First: "How to split the budget between the first category and the other k-1?" Next: "How to split the remainder between the second category and the other k-2?", etc.
So the top-level binomial has category (subbudget) 1 vs. everything else. Decide the number of dollars that go to subbudget 1 by taking one sample from a binomial distribution with budget trials and success probability p = 1/n (efficient ways to draw such a sample are well documented); this will produce some number 0 <= x[1] <= budget. To find the number of dollars that go to subbudget 2, take one sample from a binomial distribution on the remaining money, i.e. with budget - x[1] trials and p = 1/(n-1). After getting subbudget 2's amount x[2], subbudget 3's will be found using budget - x[1] - x[2] trials and p = 1/(n-2), and so on. The last subbudget simply receives whatever is left.
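The chain of binomial draws described above can be sketched in Java roughly as follows. This is only an illustration under assumptions: the class name MultinomialSplit, the BinomialSampler interface and the naive() helper are hypothetical names introduced here, not part of any library. The naive sampler counts successes one trial at a time, so it is still O(budget); it is a stand-in that would have to be replaced by a fast binomial sampler for the method to pay off.

```java
import java.util.Random;

class MultinomialSplit {

    /** Hypothetical interface: draws one value from Binomial(trials, p). */
    interface BinomialSampler {
        int sample(int trials, double p);
    }

    /**
     * Naive stand-in sampler (assumption for illustration only).
     * It runs in O(trials), so swap in a fast sampler for large budgets.
     */
    static BinomialSampler naive(Random rng) {
        return (trials, p) -> {
            int successes = 0;
            for (int t = 0; t < trials; t++) {
                if (rng.nextDouble() < p) {
                    successes++;
                }
            }
            return successes;
        };
    }

    /** Splits budget into n parts, each at least min, using n-1 binomial draws. */
    static int[] distribute(int budget, int n, int min, BinomialSampler sampler) {
        if (n <= 0 || min < 0 || (long) n * min > budget) {
            throw new IllegalArgumentException("need n > 0, min >= 0, budget >= n*min");
        }
        int[] subBudgets = new int[n];
        int remaining = budget - n * min;
        for (int i = 0; i < n - 1; i++) {
            // Category i versus the (n - i) categories still unassigned, itself included.
            int x = sampler.sample(remaining, 1.0 / (n - i));
            subBudgets[i] = min + x;
            remaining -= x;
        }
        // The last category takes everything that is left (p would be 1).
        subBudgets[n - 1] = min + remaining;
        return subBudgets;
    }
}
```

A call such as MultinomialSplit.distribute(1000, 4, 10, MultinomialSplit.naive(new Random())) returns four values of at least 10 that sum to 1000. Passing the sampler as a parameter keeps the decomposition itself independent of how the binomial draw is implemented.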