Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Random Shuffling in Java (or any language) Probabilities [duplicate]

So, I'm watching Robert Sedgewick's videos on Coursera and I am currently at the Shuffling one. He's showing a "poorly written" shuffling code for online poker (it has a few other bugs, which I've removed because they are not related to my question) This is how the algorithm works:

for(int i = 0; i < N; i++)
int r = new Random().nextInt(53);
swap(cardArray, i, r);

It iterates all the cards once. At each iteration a random number is generated and the i-th card is swapped with the r-th card. Simple, right?

While I understand the algorithm, I didn't understand his probability calculation. He said that because Random uses a 32-bit seed (or 64, it doesn't seem to matter), this is constrained to only 2^32 different permutations.

He also said that Knuth's algorithm is better (same for loop, but choose a number between 1 and i) because it gives you N! permutations.

I can agree with Knuth's algorithm calculations. But I think that on the first one (which is supposed to be the faulty one) there should be N^N different permutations.

Is Sedgewick wrong or am I missing a fact?

like image 869
Mackiavelli Avatar asked Mar 20 '15 18:03

Mackiavelli


1 Answers

Sedgewick's way of explaining it seems very strange and obtuse to me.

Imagine you had a deck of only 3 cards and applied the algorithm shown.

After the first card was swapped there would be 3 possible outcomes. After the second, 9. And after the 3rd swap, 27. Thus, we know that using the swapping algorithm we will have 27 different possible outcomes, some of which will be duplicate outcomes to the others.

Now, we know for a fact that there are 3 * 2 * 1 = 6 possible arrangements of a 3-card deck. However, 27 is NOT divisible by 6. Therefore, we know for a fact that some arrangements will be more common than others, even without computing what they are. Therefore, the swapping algorithm will not result in an equal probability among the 6 possibilities, i.e., it will be biased towards certain arrangements.

The same exact logic extends to the case of 52 cards.

We can investigate which arrangements are preferred by looking at the distribution of outcomes in the three-card case, which are:

1 2 3   5 occurrences
1 3 2   5 occurrences
2 1 3   4 occurrences
2 3 1   4 occurrences
3 1 2   4 occurrences
3 2 1   5 occurrences

Total 27

Examining these, we notice that combinations which require 0 or 1 swaps have more occurrences than combinations that require 2 swaps. In general, the fewer the number of swaps required for the combination, the more likely it is.

like image 134
Tyler Durden Avatar answered Nov 09 '22 17:11

Tyler Durden