Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there an efficient way to generate N random integers in a range that have a given sum or average?

Is there an efficient way to generate a random combination of N integers such that—

  • each integer is in the interval [min, max],
  • the integers have a sum of sum,
  • the integers can appear in any order (e.g., random order), and
  • the combination is chosen uniformly at random from among all combinations that meet the other requirements?

Is there a similar algorithm for random combinations in which the integers must appear in sorted order by their values (rather than in any order)?

(Choosing an appropriate combination with a mean of mean is a special case, if sum = N * mean. This problem is equivalent to generating a uniform random partition of sum into N parts that are each in the interval [min, max] and appear in any order or in sorted order by their values, as the case may be.)

I am aware that this problem can be solved in the following way for combinations that appear in random order (EDIT [Apr. 27]: Algorithm modified.):

  1. If N * max < sum or N * min > sum, there is no solution.

  2. If N * max == sum, there is only one solution, in which all N numbers are equal to max. If N * min == sum, there is only one solution, in which all N numbers are equal to min.

  3. Use the algorithm given in Smith and Tromble ("Sampling from the Unit Simplex", 2004) to generate N random non-negative integers with the sum sum - N * min.

  4. Add min to each number generated this way.

  5. If any number is greater than max, go to step 3.

However, this algorithm is slow if max is much less than sum. For example, according to my tests (with an implementation of the special case above involving mean), the algorithm rejects, on average—

  • about 1.6 samples if N = 7, min = 3, max = 10, sum = 42, but
  • about 30.6 samples if N = 20, min = 3, max = 10, sum = 120.

Is there a way to modify this algorithm to be efficient for large N while still meeting the requirements above?

EDIT:

As an alternative suggested in the comments, an efficient way of producing a valid random combination (that satisfies all but the last requirement) is:

  1. Calculate X, the number of valid combinations possible given sum, min, and max.
  2. Choose Y, a uniform random integer in [0, X).
  3. Convert ("unrank") Y to a valid combination.

However, is there a formula for calculating the number of valid combinations (or permutations), and is there a way to convert an integer to a valid combination? [EDIT (Apr. 28): Same for permutations rather than combinations].

EDIT (Apr. 27):

After reading Devroye's Non-Uniform Random Variate Generation (1986), I can confirm that this is a problem of generating a random partition. Also, Exercise 2 (especially part E) on page 661 is relevant to this question.

EDIT (Apr. 28):

As it turned out the algorithm I gave is uniform where the integers involved are given in random order, as opposed to sorted order by their values. Since both problems are of general interest, I have modified this question to seek a canonical answer for both problems.

The following Ruby code can be used to verify potential solutions for uniformity (where algorithm(...) is the candidate algorithm):

combos={}
permus={}
mn=0
mx=6
sum=12
for x in mn..mx
  for y in mn..mx
    for z in mn..mx
      if x+y+z==sum
        permus[[x,y,z]]=0
      end
      if x+y+z==sum and x<=y and y<=z
        combos[[x,y,z]]=0
      end
    end
  end
end

3000.times {|x|
 f=algorithm(3,sum,mn,mx)
 combos[f.sort]+=1
 permus[f]+=1
}
p combos
p permus

EDIT (Apr. 29): Re-added Ruby code of current implementation.

The following code example is given in Ruby, but my question is independent of programming language:

def posintwithsum(n, total)
    raise if n <= 0 or total <=0
    ls = [0]
    ret = []
    while ls.length < n
      c = 1+rand(total-1)
      found = false
      for j in 1...ls.length
        if ls[j] == c
          found = true
          break
        end
      end
      if found == false;ls.push(c);end
    end
    ls.sort!
    ls.push(total)
    for i in 1...ls.length
       ret.push(ls[i] - ls[i - 1])
    end
    return ret
end

def integersWithSum(n, total)
 raise if n <= 0 or total <=0
 ret = posintwithsum(n, total + n)
 for i in 0...ret.length
    ret[i] = ret[i] - 1
 end
 return ret
end

# Generate 100 valid samples
mn=3
mx=10
sum=42
n=7
100.times {
 while true
    pp=integersWithSum(n,sum-n*mn).map{|x| x+mn }
    if !pp.find{|x| x>mx }
      p pp; break # Output the sample and break
    end
 end
}

like image 977
Peter O. Avatar asked Apr 23 '20 17:04

Peter O.


People also ask

How do you generate a random number from within a range?

Method 1: Using Math. random() function is used to return a floating-point pseudo-random number between range [0,1) , 0 (inclusive) and 1 (exclusive). This random number can then be scaled according to the desired range.

Can you write a random integers function to print integers within a range?

Above formula will generates a random integer in a range between min (inclusive) and max (inclusive). //Random(). nextInt(int bound) = Random integer from 0 (inclusive) to bound (exclusive) //1. nextInt(range) = nextInt(max - min) new Random().

How can I generate a random number within a range but exclude some?

After generating the random number you've to put the "holes" back in the range. This can be achieved by incrementing the generated number as long as there are excluded numbers lower than or equal to the generated one. The lower exclude numbers are "holes" in the range before the generated number.

How to generate random numbers from a range in Excel?

Next, from Excel Ribbon, go to Home > Paste. Now click on the Paste Values icon (See the screenshot). As a result, we got the numbers as values below. Now, these values won’t change on recalculation. Read More: Excel Formula to Generate Random Number (5 examples) 2. Apply RANDBETWEEN Function as Random Number Generator in a Range

Is randint () the answer to'give me a random integer'?

Since randint () is the canonical answer for "give me a random integer" in Python, I decided to dig deeper to understand what's going on. This is a brief post that dives into the implementation of the random module, and discusses some alternative methods for generating pseudo-random integers. First, a basic benchmark (Python 3.6): Whoa!

How to get random number list between 3 and 10?

For instance, if you want to get a random number (whole number) list between 3 and 10 then follow the below steps. Go to the corresponding excel sheet, right-click on the sheet name and click on the View Code option to bring up the VBA window. Type the below code in the Module.

Why does NumPy take so long to generate random integers?

Finally, we can turn away from the random module altogether, and use Numpy: Surprisingly, this is slow; that's because Numpy isn't great for working with single datums - it likes to amortize costs over large arrays created / manipulated in C. To see this in action, let's see how long it takes to generate 100 random integers:


4 Answers

Here's my solution in Java. It is fully functional and contains two generators: PermutationPartitionGenerator for unsorted partitions and CombinationPartitionGenerator for sorted partitions. Your generator also implemented in the class SmithTromblePartitionGenerator for comparison. The class SequentialEnumerator enumerates all possible partitions (unsorted or sorted, depending on the parameter) in sequential order. I have added thorough tests (including your test cases) for all of these generators. The implementation is self-explainable for the most part. If you have any questions, I will answer them in couple of days.

import java.util.Random;
import java.util.function.Supplier;

public abstract class PartitionGenerator implements Supplier<int[]>{
    public static final Random rand = new Random();
    protected final int numberCount;
    protected final int min;
    protected final int range;
    protected final int sum; // shifted sum
    protected final boolean sorted;

    protected PartitionGenerator(int numberCount, int min, int max, int sum, boolean sorted) {
        if (numberCount <= 0)
            throw new IllegalArgumentException("Number count should be positive");
        this.numberCount = numberCount;
        this.min = min;
        range = max - min;
        if (range < 0)
            throw new IllegalArgumentException("min > max");
        sum -= numberCount * min;
        if (sum < 0)
            throw new IllegalArgumentException("Sum is too small");
        if (numberCount * range < sum)
            throw new IllegalArgumentException("Sum is too large");
        this.sum = sum;
        this.sorted = sorted;
    }

    // Whether this generator returns sorted arrays (i.e. combinations)
    public final boolean isSorted() {
        return sorted;
    }

    public interface GeneratorFactory {
        PartitionGenerator create(int numberCount, int min, int max, int sum);
    }
}

import java.math.BigInteger;

// Permutations with repetition (i.e. unsorted vectors) with given sum
public class PermutationPartitionGenerator extends PartitionGenerator {
    private final double[][] distributionTable;

    public PermutationPartitionGenerator(int numberCount, int min, int max, int sum) {
        super(numberCount, min, max, sum, false);
        distributionTable = calculateSolutionCountTable();
    }

    private double[][] calculateSolutionCountTable() {
        double[][] table = new double[numberCount + 1][sum + 1];
        BigInteger[] a = new BigInteger[sum + 1];
        BigInteger[] b = new BigInteger[sum + 1];
        for (int i = 1; i <= sum; i++)
            a[i] = BigInteger.ZERO;
        a[0] = BigInteger.ONE;
        table[0][0] = 1.0;
        for (int n = 1; n <= numberCount; n++) {
            double[] t = table[n];
            for (int s = 0; s <= sum; s++) {
                BigInteger z = BigInteger.ZERO;
                for (int i = Math.max(0, s - range); i <= s; i++)
                    z = z.add(a[i]);
                b[s] = z;
                t[s] = z.doubleValue();
            }
            // swap a and b
            BigInteger[] c = b;
            b = a;
            a = c;
        }
        return table;
    }

    @Override
    public int[] get() {
        int[] p = new int[numberCount];
        int s = sum; // current sum
        for (int i = numberCount - 1; i >= 0; i--) {
            double t = rand.nextDouble() * distributionTable[i + 1][s];
            double[] tableRow = distributionTable[i];
            int oldSum = s;
            // lowerBound is introduced only for safety, it shouldn't be crossed 
            int lowerBound = s - range;
            if (lowerBound < 0)
                lowerBound = 0;
            s++;
            do
                t -= tableRow[--s];
            // s can be equal to lowerBound here with t > 0 only due to imprecise subtraction
            while (t > 0 && s > lowerBound);
            p[i] = min + (oldSum - s);
        }
        assert s == 0;
        return p;
    }

    public static final GeneratorFactory factory = (numberCount, min, max,sum) ->
        new PermutationPartitionGenerator(numberCount, min, max, sum);
}

import java.math.BigInteger;

// Combinations with repetition (i.e. sorted vectors) with given sum 
public class CombinationPartitionGenerator extends PartitionGenerator {
    private final double[][][] distributionTable;

    public CombinationPartitionGenerator(int numberCount, int min, int max, int sum) {
        super(numberCount, min, max, sum, true);
        distributionTable = calculateSolutionCountTable();
    }

    private double[][][] calculateSolutionCountTable() {
        double[][][] table = new double[numberCount + 1][range + 1][sum + 1];
        BigInteger[][] a = new BigInteger[range + 1][sum + 1];
        BigInteger[][] b = new BigInteger[range + 1][sum + 1];
        double[][] t = table[0];
        for (int m = 0; m <= range; m++) {
            a[m][0] = BigInteger.ONE;
            t[m][0] = 1.0;
            for (int s = 1; s <= sum; s++) {
                a[m][s] = BigInteger.ZERO;
                t[m][s] = 0.0;
            }
        }
        for (int n = 1; n <= numberCount; n++) {
            t = table[n];
            for (int m = 0; m <= range; m++)
                for (int s = 0; s <= sum; s++) {
                    BigInteger z;
                    if (m == 0)
                        z = a[0][s];
                    else {
                        z = b[m - 1][s];
                        if (m <= s)
                            z = z.add(a[m][s - m]);
                    }
                    b[m][s] = z;
                    t[m][s] = z.doubleValue();
                }
            // swap a and b
            BigInteger[][] c = b;
            b = a;
            a = c;
        }
        return table;
    }

    @Override
    public int[] get() {
        int[] p = new int[numberCount];
        int m = range; // current max
        int s = sum; // current sum
        for (int i = numberCount - 1; i >= 0; i--) {
            double t = rand.nextDouble() * distributionTable[i + 1][m][s];
            double[][] tableCut = distributionTable[i];
            if (s < m)
                m = s;
            s -= m;
            while (true) {
                t -= tableCut[m][s];
                // m can be 0 here with t > 0 only due to imprecise subtraction
                if (t <= 0 || m == 0)
                    break;
                m--;
                s++;
            }
            p[i] = min + m;
        }
        assert s == 0;
        return p;
    }

    public static final GeneratorFactory factory = (numberCount, min, max, sum) ->
        new CombinationPartitionGenerator(numberCount, min, max, sum);
}

import java.util.*;

public class SmithTromblePartitionGenerator extends PartitionGenerator {
    public SmithTromblePartitionGenerator(int numberCount, int min, int max, int sum) {
        super(numberCount, min, max, sum, false);
    }

    @Override
    public int[] get() {
        List<Integer> ls = new ArrayList<>(numberCount + 1);
        int[] ret = new int[numberCount];
        int increasedSum = sum + numberCount;
        while (true) {
            ls.add(0);
            while (ls.size() < numberCount) {
                int c = 1 + rand.nextInt(increasedSum - 1);
                if (!ls.contains(c))
                    ls.add(c);
            }
            Collections.sort(ls);
            ls.add(increasedSum);
            boolean good = true;
            for (int i = 0; i < numberCount; i++) {
                int x = ls.get(i + 1) - ls.get(i) - 1;
                if (x > range) {
                    good = false;
                    break;
                }
                ret[i] = x;
            }
            if (good) {
                for (int i = 0; i < numberCount; i++)
                    ret[i] += min;
                return ret;
            }
            ls.clear();
        }
    }

    public static final GeneratorFactory factory = (numberCount, min, max, sum) ->
        new SmithTromblePartitionGenerator(numberCount, min, max, sum);
}

import java.util.Arrays;

// Enumerates all partitions with given parameters
public class SequentialEnumerator extends PartitionGenerator {
    private final int max;
    private final int[] p;
    private boolean finished;

    public SequentialEnumerator(int numberCount, int min, int max, int sum, boolean sorted) {
        super(numberCount, min, max, sum, sorted);
        this.max = max;
        p = new int[numberCount];
        startOver();
    }

    private void startOver() {
        finished = false;
        int unshiftedSum = sum + numberCount * min;
        fillMinimal(0, Math.max(min, unshiftedSum - (numberCount - 1) * max), unshiftedSum);
    }

    private void fillMinimal(int beginIndex, int minValue, int fillSum) {
        int fillRange = max - minValue;
        if (fillRange == 0)
            Arrays.fill(p, beginIndex, numberCount, max);
        else {
            int fillCount = numberCount - beginIndex;
            fillSum -= fillCount * minValue;
            int maxCount = fillSum / fillRange;
            int maxStartIndex = numberCount - maxCount;
            Arrays.fill(p, maxStartIndex, numberCount, max);
            fillSum -= maxCount * fillRange;
            Arrays.fill(p, beginIndex, maxStartIndex, minValue);
            if (fillSum != 0)
                p[maxStartIndex - 1] = minValue + fillSum;
        }
    }

    @Override
    public int[] get() { // returns null when there is no more partition, then starts over
        if (finished) {
            startOver();
            return null;
        }
        int[] pCopy = p.clone();
        if (numberCount > 1) {
            int i = numberCount;
            int s = p[--i];
            while (i > 0) {
                int x = p[--i];
                if (x == max) {
                    s += x;
                    continue;
                }
                x++;
                s--;
                int minRest = sorted ? x : min;
                if (s < minRest * (numberCount - i - 1)) {
                    s += x;
                    continue;
                }
                p[i++]++;
                fillMinimal(i, minRest, s);
                return pCopy;
            }
        }
        finished = true;
        return pCopy;
    }

    public static final GeneratorFactory permutationFactory = (numberCount, min, max, sum) ->
        new SequentialEnumerator(numberCount, min, max, sum, false);
    public static final GeneratorFactory combinationFactory = (numberCount, min, max, sum) ->
        new SequentialEnumerator(numberCount, min, max, sum, true);
}

import java.util.*;
import java.util.function.BiConsumer;
import PartitionGenerator.GeneratorFactory;

public class Test {
    private final int numberCount;
    private final int min;
    private final int max;
    private final int sum;
    private final int repeatCount;
    private final BiConsumer<PartitionGenerator, Test> procedure;

    public Test(int numberCount, int min, int max, int sum, int repeatCount,
            BiConsumer<PartitionGenerator, Test> procedure) {
        this.numberCount = numberCount;
        this.min = min;
        this.max = max;
        this.sum = sum;
        this.repeatCount = repeatCount;
        this.procedure = procedure;
    }

    @Override
    public String toString() {
        return String.format("=== %d numbers from [%d, %d] with sum %d, %d iterations ===",
                numberCount, min, max, sum, repeatCount);
    }

    private static class GeneratedVector {
        final int[] v;

        GeneratedVector(int[] vect) {
            v = vect;
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(v);
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj)
                return true;
            return Arrays.equals(v, ((GeneratedVector)obj).v);
        }

        @Override
        public String toString() {
            return Arrays.toString(v);
        }
    }

    private static final Comparator<Map.Entry<GeneratedVector, Integer>> lexicographical = (e1, e2) -> {
        int[] v1 = e1.getKey().v;
        int[] v2 = e2.getKey().v;
        int len = v1.length;
        int d = len - v2.length;
        if (d != 0)
            return d;
        for (int i = 0; i < len; i++) {
            d = v1[i] - v2[i];
            if (d != 0)
                return d;
        }
        return 0;
    };

    private static final Comparator<Map.Entry<GeneratedVector, Integer>> byCount =
            Comparator.<Map.Entry<GeneratedVector, Integer>>comparingInt(Map.Entry::getValue)
            .thenComparing(lexicographical);

    public static int SHOW_MISSING_LIMIT = 10;

    private static void checkMissingPartitions(Map<GeneratedVector, Integer> map, PartitionGenerator reference) {
        int missingCount = 0;
        while (true) {
            int[] v = reference.get();
            if (v == null)
                break;
            GeneratedVector gv = new GeneratedVector(v);
            if (!map.containsKey(gv)) {
                if (missingCount == 0)
                    System.out.println(" Missing:");
                if (++missingCount > SHOW_MISSING_LIMIT) {
                    System.out.println("  . . .");
                    break;
                }
                System.out.println(gv);
            }
        }
    }

    public static final BiConsumer<PartitionGenerator, Test> distributionTest(boolean sortByCount) {
        return (PartitionGenerator gen, Test test) -> {
            System.out.print("\n" + getName(gen) + "\n\n");
            Map<GeneratedVector, Integer> combos = new HashMap<>();
            // There's no point of checking permus for sorted generators
            // because they are the same as combos for them
            Map<GeneratedVector, Integer> permus = gen.isSorted() ? null : new HashMap<>();
            for (int i = 0; i < test.repeatCount; i++) {
                int[] v = gen.get();
                if (v == null && gen instanceof SequentialEnumerator)
                    break;
                if (permus != null) {
                    permus.merge(new GeneratedVector(v), 1, Integer::sum);
                    v = v.clone();
                    Arrays.sort(v);
                }
                combos.merge(new GeneratedVector(v), 1, Integer::sum);
            }
            Set<Map.Entry<GeneratedVector, Integer>> sortedEntries = new TreeSet<>(
                    sortByCount ? byCount : lexicographical);
            System.out.println("Combos" + (gen.isSorted() ? ":" : " (don't have to be uniform):"));
            sortedEntries.addAll(combos.entrySet());
            for (Map.Entry<GeneratedVector, Integer> e : sortedEntries)
                System.out.println(e);
            checkMissingPartitions(combos, test.getGenerator(SequentialEnumerator.combinationFactory));
            if (permus != null) {
                System.out.println("\nPermus:");
                sortedEntries.clear();
                sortedEntries.addAll(permus.entrySet());
                for (Map.Entry<GeneratedVector, Integer> e : sortedEntries)
                    System.out.println(e);
                checkMissingPartitions(permus, test.getGenerator(SequentialEnumerator.permutationFactory));
            }
        };
    }

    public static final BiConsumer<PartitionGenerator, Test> correctnessTest =
        (PartitionGenerator gen, Test test) -> {
        String genName = getName(gen);
        for (int i = 0; i < test.repeatCount; i++) {
            int[] v = gen.get();
            if (v == null && gen instanceof SequentialEnumerator)
                v = gen.get();
            if (v.length != test.numberCount)
                throw new RuntimeException(genName + ": array of wrong length");
            int s = 0;
            if (gen.isSorted()) {
                if (v[0] < test.min || v[v.length - 1] > test.max)
                    throw new RuntimeException(genName + ": generated number is out of range");
                int prev = test.min;
                for (int x : v) {
                    if (x < prev)
                        throw new RuntimeException(genName + ": unsorted array");
                    s += x;
                    prev = x;
                }
            } else
                for (int x : v) {
                    if (x < test.min || x > test.max)
                        throw new RuntimeException(genName + ": generated number is out of range");
                    s += x;
                }
            if (s != test.sum)
                throw new RuntimeException(genName + ": wrong sum");
        }
        System.out.format("%30s :   correctness test passed%n", genName);
    };

    public static final BiConsumer<PartitionGenerator, Test> performanceTest =
        (PartitionGenerator gen, Test test) -> {
        long time = System.nanoTime();
        for (int i = 0; i < test.repeatCount; i++)
            gen.get();
        time = System.nanoTime() - time;
        System.out.format("%30s : %8.3f s %10.0f ns/test%n", getName(gen), time * 1e-9, time * 1.0 / test.repeatCount);
    };

    public PartitionGenerator getGenerator(GeneratorFactory factory) {
        return factory.create(numberCount, min, max, sum);
    }

    public static String getName(PartitionGenerator gen) {
        String name = gen.getClass().getSimpleName();
        if (gen instanceof SequentialEnumerator)
            return (gen.isSorted() ? "Sorted " : "Unsorted ") + name;
        else
            return name;
    }

    public static GeneratorFactory[] factories = { SmithTromblePartitionGenerator.factory,
            PermutationPartitionGenerator.factory, CombinationPartitionGenerator.factory,
            SequentialEnumerator.permutationFactory, SequentialEnumerator.combinationFactory };

    public static void main(String[] args) {
        Test[] tests = {
                            new Test(3, 0, 3, 5, 3_000, distributionTest(false)),
                            new Test(3, 0, 6, 12, 3_000, distributionTest(true)),
                            new Test(50, -10, 20, 70, 2_000, correctnessTest),
                            new Test(7, 3, 10, 42, 1_000_000, performanceTest),
                            new Test(20, 3, 10, 120, 100_000, performanceTest)
                       };
        for (Test t : tests) {
            System.out.println(t);
            for (GeneratorFactory factory : factories) {
                PartitionGenerator candidate = t.getGenerator(factory);
                t.procedure.accept(candidate, t);
            }
            System.out.println();
        }
    }
}

You can try this on Ideone.

like image 176
John McClane Avatar answered Oct 21 '22 20:10

John McClane


Here is the algorithm from John McClane's PermutationPartitionGenerator, in another answer on this page. It has two phases, namely a setup phase and a sampling phase, and generates n random variates in [min, max] with the sum sum, where the numbers are listed in random order.

Setup phase: First, a solution table is built using the following formulas (t(y, x) where y is in [0, n] and x is in [0, sum - n * min]):

  • t(0, j) = 1 if j == 0; 0 otherwise
  • t(i, j) = t(i-1, j) + t(i-1, j-1) + ... + t(i-1, j-(max-min))

Here, t(y, x) stores the relative probability that the sum of y numbers (in the appropriate range) will equal x. This probability is relative to all t(y, x) with the same y.

Sampling phase: Here we generate a sample of n numbers. Set s to sum - n * min, then for each position i, starting with n - 1 and working backwards to 0:

  • Set v to a uniform random integer in [0, t(i+1, s)).
  • Set r to min.
  • Subtract t(i, s) from v.
  • While v remains 0 or greater, subtract t(i, s-1) from v, add 1 to r, and subtract 1 from s.
  • The number at position i in the sample is set to r.

EDIT:

It appears that with trivial changes to the algorithm above, it's possible to have each random variate use a separate range rather than use the same range for all of them:

Each random variate at positions i ∈ [0, n) has a minimum value min(i) and a maximum value max(i).

Let adjsum = sum - ∑min(i).

Setup phase: First, a solution table is built using the following formulas (t(y, x) where y is in [0, n] and x is in [0, adjsum]):

  • t(0, j) = 1 if j == 0; 0 otherwise
  • t(i, j) = t(i-1, j) + t(i-1, j-1) + ... + t(i-1, j-(max(i-1)-min(i-1)))

The sampling phase is then exactly the same as before, except we set s to adjsum (rather than sum - n * min) and set r to min(i) (rather than min).


EDIT:

For John McClane's CombinationPartitionGenerator, the setup and sampling phases are as follows.

Setup phase: First, a solution table is built using the following formulas (t(z, y, x) where z is in [0, n], y is in [0, max - min], and x is in [0, sum - n * min]):

  • t(0, j, k) = 1 if k == 0; 0 otherwise
  • t(i, 0, k) = t(i - 1, 0, k)
  • t(i, j, k) = t(i, j-1, k) + t(i - 1, j, k - j)

Sampling phase: Here we generate a sample of n numbers. Set s to sum - n * min and mrange to max - min, then for each position i, starting with n - 1 and working backwards to 0:

  • Set v to a uniform random integer in [0, t(i+1, mrange, s)).
  • Set mrange to min(mrange, s)
  • Subtract mrange from s.
  • Set r to min + mrange.
  • Subtract t(i, mrange, s) from v.
  • While v remains 0 or greater, add 1 to s, subtract 1 from r and 1 from mrange, then subtract t(i, mrange, s) from v.
  • The number at position i in the sample is set to r.
like image 32
Peter O. Avatar answered Oct 21 '22 18:10

Peter O.


I have not tested this, so it is not really an answer, just something to try which is too long to fit into a comment. Start with an array which meets the first two criteria and play with it so it still meets the first two, but is a lot more random.

If the mean is an integer, then your initial array can be [4, 4, 4, ... 4] or maybe [3, 4, 5, 3, 4, 5, ... 5, 8, 0] or something simple like that. For a mean of 4.5, try [4, 5, 4, 5, ... 4, 5].

Next pick a pair of numbers, num1 and num2, in the array. Probably the first number should be taken in order, as with the Fisher-Yates shuffle, the second number should be picked at random. Taking the first number in order ensures that every number is picked at least once.

Now calculate max-num1 and num2-min. Those are the distances from the two numbers to the max and min boundaries. Set limit to the smaller of the two distances. That is the maximum change allowed which will not put one or other of the numbers outside the allowed limits. If limit is zero then skip this pair.

Pick a random integer in the range [1, limit]: call it change. I omit 0 from the pickable range as it has no effect. Testing may show that you get better randomness by including it; I'm not sure.

Now set num1 <- num1 + change and num2 <- num2 - change. That will not affect the mean value and all elements of the array are still within the required boundaries.

You will need to run through the whole array at least once. Testing should show if you need to run through it more than once to get something sufficiently random.

ETA: include pseudocode

// Set up the array.
resultAry <- new array size N
for (i <- 0 to N-1)
  // More complex initial setup schemes are possible here.
  resultAry[i] <- mean
rof

// Munge the array entries.
for (ix1 <- 0 to N-1)  // ix1 steps through the array in order.

  // Pick second entry different from first.
  repeat
    ix2 <- random(0, N-1)
  until (ix2 != ix1)

  // Calculate size of allowed change.
  hiLimit <- max - resultAry[ix1]
  loLimit <- resultAry[ix2] - min
  limit <- minimum(hiLimit, loLimit)
  if (limit == 0)
    // No change possible so skip.
    continue loop with next ix1
  fi

  // Change the two entries keeping same mean.
  change <- random(1, limit)  // Or (0, limit) possibly.
  resultAry[ix1] <- resultAry[ix1] + change
  resultAry[ix2] <- resultAry[ix2] - change

rof

// Check array has been sufficiently munged.
if (resultAry not random enough)
  munge the array again
fi
like image 32
rossum Avatar answered Oct 21 '22 19:10

rossum


As the OP points out, the ability to efficiently unrank is very powerful. If we are able to do so, generating a uniform distribution of partitions can be done in three steps (restating what the OP has laid out in the question):

  1. Calculate the total number, M, of partitions of length N of the number sum such that the parts are in the range [min, max].
  2. Generate a uniform distribution of integers from [1, M].
  3. Unrank each integer from step 2 into its respective partition.

Below, we only focus on generating the nth partition as there is a copious amount of information on generating a uniform distribution of integer in a given range. Here is a simple C++ unranking algorithm which should be easy to translate to other languages (N.B. I have not figured out how to unrank the composition case yet (i.e. order matters)).

std::vector<int> unRank(int n, int m, int myMax, int nth) {

    std::vector<int> z(m, 0);
    int count = 0;
    int j = 0;

    for (int i = 0; i < z.size(); ++i) {
        int temp = pCount(n - 1, m - 1, myMax);

        for (int r = n - m, k = myMax - 1;
             (count + temp) < nth && r > 0 && k; r -= m, --k) {

            count += temp;
            n = r;
            myMax = k;
            ++j;
            temp = pCount(n - 1, m - 1, myMax);
        }

        --m;
        --n;
        z[i] = j;
    }

    return z;
}

The workhorse pCount function is given by:

int pCount(int n, int m, int myMax) {

    if (myMax * m < n) return 0;
    if (myMax * m == n) return 1;

    if (m < 2) return m;
    if (n < m) return 0;
    if (n <= m + 1) return 1;

    int niter = n / m;
    int count = 0;

    for (; niter--; n -= m, --myMax) {
        count += pCount(n - 1, m - 1, myMax);
    }

    return count;
}

This function is based off of the excellent answer to Is there an efficient algorithm for integer partitioning with restricted number of parts? by user @m69_snarky_and_unwelcoming. The one given above is a slight modification of the simple algorithm (the one without memoization). This can easily be modified to incorporate memoization for greater efficiency. We will leave this off for now and focus on the unranking portion.

Explanation of unRank

We first note there is a one-to-one mapping from the partitions of length N of the number sum such that the parts are in the range [min, max] to the restricted partitions of length N of the number sum - N * (min - 1) with parts in [1, max - (min - 1)].

As a small example, consider the partitions of 50 of length 4 such that the min = 10 and the max = 15. This will have the same structure as the restricted partitions of 50 - 4 * (10 - 1) = 14 of length 4 with the maximum part equal to 15 - (10 - 1) = 6.

10   10   15   15   --->>    1    1    6    6
10   11   14   15   --->>    1    2    5    6
10   12   13   15   --->>    1    3    4    6
10   12   14   14   --->>    1    3    5    5
10   13   13   14   --->>    1    4    4    5
11   11   13   15   --->>    2    2    4    6
11   11   14   14   --->>    2    2    5    5
11   12   12   15   --->>    2    3    3    6
11   12   13   14   --->>    2    3    4    5
11   13   13   13   --->>    2    4    4    4
12   12   12   14   --->>    3    3    3    5
12   12   13   13   --->>    3    3    4    4

With this in mind, in order to easily count, we could add a step 1a to translate the problem to the "unit" case if you will.

Now, we simply have a counting problem. As @m69 brilliantly displays, counting partitions can be easily achieved by breaking the problem into smaller problems. The function @m69 provides gets us 90% of the way, we just have to figure out what to do with the added restriction that there is a cap. This is where we get:

int pCount(int n, int m, int myMax) {

    if (myMax * m < n) return 0;
    if (myMax * m == n) return 1;

We also have to keep in mind that myMax will decrease as we move along. This makes sense if we look at the 6th partition above:

2   2   4   6

In order to count the number of partitions from here on out, we must keep applying the translation to the "unit" case. This looks like:

1   1   3   5

Where as the step before, we had a max of 6, now we only consider a max of 5.

With this in mind, unranking the partition is no different than unranking a standard permutation or combination. We must be able to count the number of partitions in a given section. For example, to count the number of partitions that start with 10 above, all we do is remove the 10 in the first column:

10   10   15   15
10   11   14   15
10   12   13   15
10   12   14   14
10   13   13   14

10   15   15
11   14   15
12   13   15
12   14   14
13   13   14

Translate to the unit case:

1   6   6
2   5   6
3   4   6
3   5   5
4   4   5

and call pCount:

pCount(13, 3, 6) = 5

Given a random integer to unrank, we continue calculating the number of partitions in smaller and smaller sections (as we did above) until we have filled our index vector.

Examples

Given min = 3, max = 10, n = 7, and sum = 42, here is an ideone demo that generates 20 random partitions. The output is below:

42: 3 3 6 7 7 8 8 
123: 4 4 6 6 6 7 9 
2: 3 3 3 4 9 10 10 
125: 4 4 6 6 7 7 8 
104: 4 4 4 6 6 8 10 
74: 3 4 6 7 7 7 8 
47: 3 4 4 5 6 10 10 
146: 5 5 5 5 6 7 9 
70: 3 4 6 6 6 7 10 
134: 4 5 5 6 6 7 9 
136: 4 5 5 6 7 7 8 
81: 3 5 5 5 8 8 8 
122: 4 4 6 6 6 6 10 
112: 4 4 5 5 6 8 10 
147: 5 5 5 5 6 8 8 
142: 4 6 6 6 6 7 7 
37: 3 3 6 6 6 9 9 
67: 3 4 5 6 8 8 8 
45: 3 4 4 4 8 9 10 
44: 3 4 4 4 7 10 10

The lexicographical index is on the left and the unranked partition in on the right.

like image 24
Joseph Wood Avatar answered Oct 21 '22 19:10

Joseph Wood