Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to select one of n objects at random without knowing n at first?

Tags:

c

algorithm

For concreteness, how would you read text line, and select and print one random line, when you don't know the number of lines in advance?

Yes this is a problem from the programming pearl which I get confused.

The solution choose the 1st element, then select the second with probability 1/2, the third with 1/3, and so forth.

An algorithm:

i = 0
while more input lines
  with probability 1.0/++i
    choice = this input line
 print choice

Suppose the final choice is the 3rd element, the probability is 1 x 1/2 x 1/3 x 3/4 x ... x n-2/n-1 x n-1/n == 1/2n ? But 1/n should be correct.

like image 408
deepsky Avatar asked Sep 01 '11 07:09

deepsky


2 Answers

Your algorithm is correct, but the analysis is not. You can prove it by induction. Loosely: It works for N = 1 of course. If it works up to N-1, then what happens at N? The chance that the Nth element is chosen and overwrites the last choice is 1/N -- good. The chance that it isn't is (N-1)/N. In which case the choice from the previous step is used. But at that point all elements had an 1/(N-1) chance of being chosen. Now it's 1/N. Good.

like image 90
Sean Owen Avatar answered Nov 16 '22 17:11

Sean Owen


Your calculation is wrong :

Suppose the final choice is the 3rd element, the probability is 1 x 1/2 x 1/3 x 3/4 x ... x n-2/n-1 x n-1/n

The real probability is :

(1 x 1/2 + 1 x 1/2) x 1/3 x 3/4 x ... x n-2/n-1 x n-1/n == 1/n

since either you chose 2 or you don't chose 2 (chosing 2 has a proba of 1/2 and not chosing 2 a proba of 1/2)

like image 34
Ricky Bobby Avatar answered Nov 16 '22 16:11

Ricky Bobby