The classic Fisher Yates looks something like this:
void shuffle1(std::vector<int>& vec)
{
int n = vec.size();
for (int i = n - 1; i > 0; --i)
{
std::swap(vec[i], vec[rand() % (i + 1)]);
}
}
Yesterday, I implemented the iteration "backwards" by mistake:
void shuffle2(std::vector<int>& vec)
{
int n = vec.size();
for (int i = 1; i < n; ++i)
{
std::swap(vec[i], vec[rand() % (i + 1)]);
}
}
Is this version in any way worse (or better) than the first? Does it skew the resulting probabilities?
Yes it's even distribution assuming rand()
is. We will prove this by showing that each input can generate each permutation with equal probability.
N=2 can be easily proven. We will draw it as a tree where the the children represent each string you can get by inserting the character after comma into the leftmost string.
0,1 //input where 0,1 represent indices
01 10 //output. Represents permutations of 01. It is clear that each one has equal probability
For N, we will have every permutations for N-1, and randomly swapping the last character for N
(N-1 0th permutation),N ..... (N-1 Ith permutation),N ________________________
/ \ / \ \
0th permutation of N 1st permutation.... (I*N)th permutation ((I*N)+1)th permutation .... (I*N)+(I-1)th permutation
This shitty induction should lead you to it having even distribution.
Example:
N=2:
0,1
01 10 // these are the permutations. Each one has equal probability
N=3:
0,1|2 // the | is used to separate characters that we will insert later
01,2 10,2 // 01, 10 are permutations from N-1, 2 is the new value
210 021 012 201 120 102 // these are the permutations, still equal probability
N=4: (curved to aid reading)
0,1|23
01,2|3 10,2|3
012,3 021,3 210,3 102,3 120,3 201,3
0123 0132 0321 3230 2013 2031 2310 3012
0213 0231 0312 3210 1203 1230 1302 3201
2103 2130 2301 3102 1023 1032 1320 3021
etc
Looks OK to me (assuming rand() % N is unbiased, which it isn't). It seems it should be possible to demonstrate that every permutation of input is produced by exactly 1 sequence of random choices, where each random choice is balanced.
Compare this with a faulty implementation, such as
for (int i = 0; i < v.size(); ++i) {
swap(v[i], v[rand() % v.size()]);
}
Here you can see that there are nn equally likely ways to produce n! permutations, and since nn is not evenly divisible by n! where n > 2, some of those permutations have to be produced more often than others.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With