What distribution do you get from this broken random shuffle?

Tags: language-agnostic, algorithm, random, math, shuffle

The famous Fisher-Yates shuffle algorithm can be used to randomly permute an array A of length N:

For k = 1 to N
    Pick a random integer j from k to N
    Swap A[k] and A[j]
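A minimal Python sketch of the pseudocode above, assuming 0-based indexing and the standard `random` module. The function name `fisher_yates_shuffle` and the optional `rng` parameter are illustrative and not part of the original question.

```python
import random

def fisher_yates_shuffle(a, rng=random):
    """Shuffle list `a` in place using the correct Fisher-Yates loop."""
    n = len(a)
    for k in range(n):
        # Pick j uniformly from the not-yet-fixed suffix k..n-1 (inclusive).
        j = rng.randrange(k, n)
        a[k], a[j] = a[j], a[k]
    return a
```

Calling `fisher_yates_shuffle(list(range(1, 11)))` returns a permutation of 1..10. Passing a seeded `random.Random` instance as `rng` makes a run reproducible.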

A common mistake that I've been told over and over again not to make is this:

For k = 1 to N
    Pick a random integer j from 1 to N
    Swap A[k] and A[j]

That is, instead of picking a random integer from k to N, you pick a random integer from 1 to N.
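To make the non-uniformity concrete, here is a small Python sketch that enumerates every sequence of random choices the broken shuffle can make and counts the resulting permutations. It assumes 0-based indexing; the name `broken_shuffle_counts` is hypothetical. Since each of the N^N choice sequences is equally likely, the counts divided by N^N are exact probabilities.

```python
from collections import Counter
from itertools import product

def broken_shuffle_counts(n):
    """Count how often each permutation arises over all n**n equally
    likely sequences of choices made by the broken shuffle."""
    counts = Counter()
    for choices in product(range(n), repeat=n):
        a = list(range(1, n + 1))
        for k, j in enumerate(choices):
            # The mistake: j ranges over the whole array, not just k..n-1.
            a[k], a[j] = a[j], a[k]
        counts[tuple(a)] += 1
    return counts

for perm, count in sorted(broken_shuffle_counts(3).items()):
    print(perm, count)
```

For N = 3 the six permutations come up 4, 5, 5, 5, 4 and 4 times out of 27 (in the sorted order printed above). The result cannot be uniform, because 27 is not divisible by 6.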

What happens if you make this mistake? I know that the resulting permutation isn't uniformly distributed, but I don't know what guarantees there are on what the resulting distribution will be. In particular, does anyone have an expression for the probability distributions over the final positions of the elements?

369

asked Feb 27 '11 03:02

templatetypedef

2 Answers

An Empirical Approach.

Let's implement the erroneous algorithm in Mathematica:

p = 10; (* Range *)
s = {};
For[l = 1, l <= 30000, l++, (* Iterations *)
   a = Range[p];
   For[k = 1, k <= p, k++,
      i = RandomInteger[{1, p}];
      temp = a[[k]];
      a[[k]] = a[[i]];
      a[[i]] = temp
   ];
   AppendTo[s, a];
]

Now get the number of times each integer is in each position:

r = SortBy[#, #[[1]] &] & /@ Tally /@ Transpose[s]
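For readers without Mathematica, the same experiment can be sketched in Python. This is a rough equivalent rather than a translation: it accumulates the tallies directly instead of storing every shuffled array, and the function name `simulate_broken_shuffle` and its `seed` parameter are assumptions made for illustration.

```python
import random

def simulate_broken_shuffle(p=10, trials=30000, seed=None):
    """Return freq, where freq[pos][v - 1] counts how often value v
    ended at 0-based position pos across `trials` broken shuffles."""
    rng = random.Random(seed)
    freq = [[0] * p for _ in range(p)]
    for _ in range(trials):
        a = list(range(1, p + 1))
        for k in range(p):
            i = rng.randrange(p)  # broken: drawn from the whole range
            a[k], a[i] = a[i], a[k]
        for pos, v in enumerate(a):
            freq[pos][v - 1] += 1
    return freq
```

Each row of the returned matrix corresponds to one position, so `simulate_broken_shuffle()[0]` holds the tallies for position 1 and the last row holds those for position 10.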

Let's take three positions in the resulting arrays and plot the frequency distribution for each integer in that position:

For position 1, the frequency distribution is:

[Figure: frequency of each integer in position 1]

For position 5 (middle):

[Figure: frequency of each integer in position 5]

And for position 10 (last):

[Figure: frequency of each integer in position 10]

Here is the distribution for all positions plotted together:

[Figure: frequency distributions for all positions]

Here are better statistics over 8 positions:

[Figure: frequency distributions over 8 positions]

Some observations:

1. For all positions, the probability of "1" is the same (1/n).
2. The probability matrix is symmetrical with respect to the big anti-diagonal.
3. So, the probability for any number in the last position is also uniform (1/n).

You can see these properties in the plots: all the lines start from the same point (first property), and the line for the last position is horizontal (third property).
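These three observations can be checked exactly for small n by enumerating every choice sequence instead of sampling. The sketch below is in Python with 0-based indices; `position_matrix` is a hypothetical helper, and the check is a brute-force confirmation for one small n, not a proof.

```python
from fractions import Fraction
from itertools import product

def position_matrix(n):
    """Exact matrix m where m[pos][v] is the probability that value v + 1
    ends at 0-based position pos under the broken shuffle."""
    counts = [[0] * n for _ in range(n)]
    for choices in product(range(n), repeat=n):
        a = list(range(n))
        for k, j in enumerate(choices):
            a[k], a[j] = a[j], a[k]
        for pos, v in enumerate(a):
            counts[pos][v] += 1
    total = n ** n
    return [[Fraction(c, total) for c in row] for row in counts]

n = 4
m = position_matrix(n)
uniform = Fraction(1, n)
# First observation: value 1 is equally likely at every position.
print(all(m[pos][0] == uniform for pos in range(n)))
# Second observation: symmetry about the anti-diagonal.
print(all(m[i][j] == m[n - 1 - j][n - 1 - i]
          for i in range(n) for j in range(n)))
# Third observation: every value is equally likely in the last position.
print(all(m[n - 1][v] == uniform for v in range(n)))
```

All three checks print `True` for n = 4. The enumeration grows as n^n, so this approach is only practical for small arrays.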

The second property can be seen from the following matrix representation example, where the rows are the positions, the columns are the occupant number, and the color represents the experimental probability:

[Figure: probability matrix, rows are positions and columns are occupant numbers]

For a 100x100 matrix:

[Figure: probability matrix for the 100x100 case]

Edit

Just for fun, I calculated the exact formula for the second diagonal element (the first is 1/n). The rest can be done, but it's a lot of work.

h[n_] := (n-1)/n^2 + (n-1)^(n-2) n^(-n)

Values verified for n = 3 to 6: {8/27, 57/256, 564/3125, 7105/46656}.
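The closed form can be cross-checked by brute force. The Python sketch below assumes that the "second diagonal element" means the probability that element 2 ends in position 2 (1-based); `second_diagonal` is a hypothetical helper that enumerates all n^n choice sequences with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def h(n):
    """Closed form quoted above for the second diagonal element."""
    return Fraction(n - 1, n ** 2) + Fraction((n - 1) ** (n - 2), n ** n)

def second_diagonal(n):
    """Exact probability that element 2 ends at position 2 (1-based),
    found by enumerating every choice sequence of the broken shuffle."""
    hits = 0
    for choices in product(range(n), repeat=n):
        a = list(range(1, n + 1))
        for k, j in enumerate(choices):
            a[k], a[j] = a[j], a[k]
        hits += a[1] == 2
    return Fraction(hits, n ** n)

for n in range(3, 7):
    print(n, h(n), second_diagonal(n) == h(n))
```

For n = 3 to 6 the enumeration agrees with the formula and reproduces the four values listed above.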

Edit

Working a little further on the general explicit calculation in @wnoise's answer, we can extract a bit more information.

Replacing 1/n with p[n], so that the calculations are held unevaluated, we get, for example, the first part of the matrix for n=7:

[Figure: first part of the symbolic probability matrix for n=7]

Comparing this with the results for other values of n lets us identify some known integer sequences in the matrix:

{{  1/n,    1/n      , ...},
 {... .., A007318, ....},
 {... .., ... ..., ..},
 ... ....,
 {A129687, ... ... ... ... ... ... ..},
 {A131084, A028326 ... ... ... ... ..},
 {A028326, A131084 , A129687 ... ....}}

You may find those sequences (in some cases with different signs) in the wonderful http://oeis.org/

Solving the general problem is more difficult, but I hope this is a start.

104

answered Oct 15 '22 17:10

Dr. belisarius

The "common mistake" you mention is shuffling by random transpositions. This problem was studied in full detail by Diaconis and Shahshahani in Generating a random permutation with random transpositions (1981). They do a complete analysis of stopping times and convergence to uniformity. If you cannot get a link to the paper, then please send me an e-mail and I can forward you a copy. It's actually a fun read (as are most of Persi Diaconis's papers).

If the array has repeated entries, then the problem is slightly different. As a shameless plug, this more general problem is addressed by myself, Diaconis and Soundararajan in Appendix B of A Rule of Thumb for Riffle Shuffling (2011).

answered Oct 15 '22 15:10

PengOne

