Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

selection of numbers from a set with equal probability

Got this question from algorithms design manual by Steven Skiena.

It is required to select k (value given) numbers to form a subset S' from a given set S having n numbers, such that selection probability for each number is equal (k/n). n is unknown (i was thinking of taking S as a link-list for this). also, we can have only pass through the set S.

like image 319
S..K Avatar asked Jan 19 '23 01:01

S..K


2 Answers

When n is unknown you'd rather need an on-line algorithm for so-called Reservoir sampling.

The good explanation & proof sketches are provided here http://propersubset.com/2010/04/choosing-random-elements.html

I mean this algorithm implemented in Python (taken from the link above)

import random
def random_subset( iterator, K ):
    result = []
    N = 0

    for item in iterator:
        N += 1
        if len( result ) < K:
            result.append( item )
        else:
            s = int(random.random() * N)
            if s < K:
                result[ s ] = item

    return result
like image 148
Alec Avatar answered Jan 28 '23 22:01

Alec


Something like this

for elem in S
  if random() < (k - S'.size)/S.size // This is float division
    S'.add(elem)

The first element is chosen with probability k/n, the second one with (n-k)/n * k/(n-1) + k/n * (k-1)/(n-1) which reduces to k/n, etc.

like image 44
Paweł Obrok Avatar answered Jan 28 '23 21:01

Paweł Obrok