Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sample uniformly from multisets

Tags:

python

math

numpy

Given the set of integers {1,...,n}, I would like to sample uniformly from the binom{n+k-1}{k} distinct multi-subsets of size k. Is there an efficient way of doing this?

For example, the set {1,2,3} has 6 multi-subsets of size 2. These are {1,2}, {2,3}, {1,3}, {1,1}, {2,2}, {3,3}.

like image 908
marshall Avatar asked Oct 20 '22 18:10

marshall


1 Answers

Yes. Since you know there are (n+k-1) choose k such multi-subsets, you are probably aware of the stars and bars combinatorial problem whose solution provides that formula. The solution to that problem suggests a sampling procedure to produce multi-subsets: randomly choose a way to place k stars and n-1 bars, then determine how the bars partition the stars into groups:

import random
import collections

stars = set(random.sample(xrange(n+k-1), k))
multiset = collections.Counter()

# Don't hide the bin builtin.
bin_ = 1
for i in xrange(n+k-1):
    if i in stars:
        multiset[bin_] += 1
    else:
        bin_ += 1

This will produce a collections.Counter counting the number of times each number was chosen. I've initialized bin_ = 1 to produce a multi-subset of {1...n}; bin_ = 0 would produce a multi-subset of {0...n-1}.

(Previously, I posted an answer suggesting the use of a multinomial distribution. That is not the right distribution; it gives too little weight to results with repeated elements. Sorry for the error. Since the ways to place k stars and n-1 bars are in direct correspondence with the multi-subsets of {1...n}, this solution should produce a uniform distribution.)

like image 112
user2357112 supports Monica Avatar answered Oct 23 '22 10:10

user2357112 supports Monica