I am having trouble figuring out a decent way of randomly shuffling the elements in an <code>std::vector</code> and, after some operations, restoring the original order. I know that this should be a rather trivial algorithm, but I guess I'm too tired... Since I am constrained to use a custom random number generator class, I guess I can't use <code>std::random_shuffle</code>, which doesn't help anyway, because I also need to preserve the original order. So, my approach was to create an <code>std::map</code> which serves as a mapping between the original positions and the random ones, like this: <pre class="prettyprint"><code>std::map<unsigned int, unsigned int> getRandomPermutation (const unsigned int &numberOfElements) { std::map<unsigned int, unsigned int> permutation; //populate the map for (unsigned int i = 0; i < numberOfElements; i++) { permutation[i] = i; } //randomize it for (unsigned int i = 0; i < numberOfElements; i++) { //generate a random number in the interval [0, numberOfElements) unsigned long randomValue = GetRandomInteger(numberOfElements - 1U); //broken swap implementation //permutation[i] = randomValue; //permutation[randomValue] = i; //use this instead: std::swap(permutation[i], permutation[randomValue]); } return permutation; } </code></pre> I am not sure that the above algorithm is a proper implementation for a random permutation, so any improvements are welcome. Now, here is how I've managed to make use of this permutation map: <pre class="prettyprint"><code>std::vector<BigInteger> doStuff (const std::vector<BigInteger> &input) { /// Permute the values in a random order std::map<unsigned int, unsigned int> permutation = getRandomPermutation(static_cast<unsigned int>(input.size())); std::vector<BigInteger> temp; //permute values for (unsigned int i = 0; i < static_cast<unsigned int>(input.size()); ++i) { temp.push_back(input[permutation[i]]); } //do all sorts of stuff with temp /// Reverse the permutation std::vector<BigInteger> output; for (unsigned int i = 0; i < static_cast<unsigned int>(input.size()); ++i) { output.push_back(temp[permutation[i]]); } return output; } </code></pre> Something tells me that I should be able to use only one <code>std::vector<BigInteger></code> for this algorithm, but, right now, I just can't figure out the optimal solution. Honestly, I don't really care about the data in <code>input</code>, so I could even make it non-const, overwrite it, and skip creating a copy of it, but the question is how to implement the algorithm? If I do something like this, I end up shooting myself in the foot, right? :) <pre class="prettyprint"><code>for (unsigned int i = 0; i < static_cast<unsigned int>(input.size()); ++i) { BigInteger aux = input[i]; input[i] = input[permutation[i]]; input[permutation[i]] = aux; } </code></pre> <hr> EDIT: Following Steve's remark about using "Fisher-Yates" shuffle, I changed my <code>getRandomPermutation</code> function accordingly: <pre class="prettyprint"><code>std::map<unsigned int, unsigned int> getRandomPermutation (const unsigned int &numberOfElements) { std::map<unsigned int, unsigned int> permutation; //populate the map for (unsigned int i = 0; i < numberOfElements; i++) { permutation[i] = i; } //randomize it for (unsigned int i = numberOfElements - 1; i > 0; --i) { //generate a random number in the interval [0, numberOfElements) unsigned long randomValue = GetRandomInteger(i); std::swap(permutation[i], permutation[randomValue]); } return permutation; } </code></pre>

If you're "randomising" a vector of n elements, you can create another <code>std::vector<size_t> index(n)</code>, set <code>index[x] = x</code> for <code>0 <= x < n</code>, then shuffle <code>index</code>. Then your lookups take the form: <code>original_vector[index[i]]</code>. The order of the original vector's never changed so no need to restore ordering. <blockquote> ...constrained to use a custom random number generator class, I guess I can't use <code>std::random_shuffle</code>... </blockquote> Have you noticed this overload? <pre class="prettyprint"><code>template <class RandomAccessIterator, class RandomNumberGenerator> void random_shuffle ( RandomAccessIterator first, RandomAccessIterator last, RandomNumberGenerator& rand ); </code></pre> For details of how to wrap your random number generator with a compatible object, see http://www.sgi.com/tech/stl/RandomNumberGenerator.html

Random Permutations

Tags:

c++

algorithm

random

mapping

permutation

I am having trouble figuring out a decent way of randomly shuffling the elements in an std::vector and, after some operations, restoring the original order. I know that this should be a rather trivial algorithm, but I guess I'm too tired...

Since I am constrained to use a custom random number generator class, I guess I can't use std::random_shuffle, which doesn't help anyway, because I also need to preserve the original order. So, my approach was to create an std::map which serves as a mapping between the original positions and the random ones, like this:

std::map<unsigned int, unsigned int> getRandomPermutation (const unsigned int &numberOfElements)
{
    std::map<unsigned int, unsigned int> permutation;

    //populate the map
    for (unsigned int i = 0; i < numberOfElements; i++)
    {
        permutation[i] = i;
    }

    //randomize it
    for (unsigned int i = 0; i < numberOfElements; i++)
    {
        //generate a random number in the interval [0, numberOfElements)
        unsigned long randomValue = GetRandomInteger(numberOfElements - 1U);

        //broken swap implementation
        //permutation[i] = randomValue;
        //permutation[randomValue] = i;

        //use this instead:
        std::swap(permutation[i], permutation[randomValue]);
    }

    return permutation;
}

I am not sure that the above algorithm is a proper implementation for a random permutation, so any improvements are welcome.

Now, here is how I've managed to make use of this permutation map:

std::vector<BigInteger> doStuff (const std::vector<BigInteger> &input)
{
    /// Permute the values in a random order
    std::map<unsigned int, unsigned int> permutation = getRandomPermutation(static_cast<unsigned int>(input.size()));

    std::vector<BigInteger> temp;

    //permute values
    for (unsigned int i = 0; i < static_cast<unsigned int>(input.size()); ++i)
    {
        temp.push_back(input[permutation[i]]);
    }

    //do all sorts of stuff with temp

    /// Reverse the permutation
    std::vector<BigInteger> output;
    for (unsigned int i = 0; i < static_cast<unsigned int>(input.size()); ++i)
    {
        output.push_back(temp[permutation[i]]);
    }

    return output;
}

Something tells me that I should be able to use only one std::vector<BigInteger> for this algorithm, but, right now, I just can't figure out the optimal solution. Honestly, I don't really care about the data in input, so I could even make it non-const, overwrite it, and skip creating a copy of it, but the question is how to implement the algorithm?

If I do something like this, I end up shooting myself in the foot, right? :)

for (unsigned int i = 0; i < static_cast<unsigned int>(input.size()); ++i)
{
    BigInteger aux = input[i];
    input[i] = input[permutation[i]];
    input[permutation[i]] = aux;
}

EDIT: Following Steve's remark about using "Fisher-Yates" shuffle, I changed my getRandomPermutation function accordingly:

std::map<unsigned int, unsigned int> getRandomPermutation (const unsigned int &numberOfElements)
{
    std::map<unsigned int, unsigned int> permutation;

    //populate the map
    for (unsigned int i = 0; i < numberOfElements; i++)
    {
        permutation[i] = i;
    }

    //randomize it
    for (unsigned int i = numberOfElements - 1; i > 0; --i)
    {
        //generate a random number in the interval [0, numberOfElements)
        unsigned long randomValue = GetRandomInteger(i);

        std::swap(permutation[i], permutation[randomValue]);
    }

    return permutation;
}

238

asked Jun 22 '12 01:06

Mihai Todor

3 Answers

If you're "randomising" a vector of n elements, you can create another std::vector<size_t> index(n), set index[x] = x for 0 <= x < n, then shuffle index. Then your lookups take the form: original_vector[index[i]]. The order of the original vector's never changed so no need to restore ordering.

...constrained to use a custom random number generator class, I guess I can't use std::random_shuffle...

Have you noticed this overload?

template <class RandomAccessIterator, class RandomNumberGenerator>
void random_shuffle ( RandomAccessIterator first, RandomAccessIterator last,
                    RandomNumberGenerator& rand );

For details of how to wrap your random number generator with a compatible object, see http://www.sgi.com/tech/stl/RandomNumberGenerator.html

answered Nov 15 '22 05:11

Tony Delroy

If you're looking for specific errors in your code:

permutation[i] = randomValue;
permutation[randomValue] = i;

is wrong. Observe that once you're finished, each value does not necessarily appear exactly once among the values of the map. So it's not a permutation, let alone a uniformly-distributed random one.

The proper means to generate a random permutation is what Tony says, use std::random_shuffle on a vector that initially represents the identity permutation. Or if you want to know how a shuffle is properly performed, look up "Fisher-Yates". In general, any approach that makes N random selections uniformly from 0 .. N-1 is doomed to failure, because that means it has N^N possible ways it can run. But there are N! possible permutations of N items, and N^N is generally not divisible by N!. Hence it's impossible for each permutation to be the result of an equal number of random selections, i.e. the distribution is not uniform.

the question is how to implement the algorithm?

So, you have your permutation, and you want to re-order the elements of input in-place, according to that permutation.

The key thing to know is that every permutation is a composition of "cycles". That is to say, if you repeatedly follow the permutation from a given starting point, you come back to where you started (and this path is the cycle to which that starting point belongs). There may be more than one such cycle in a given permutation, and if permutation[i] == i for some i, then the cycle of i has length 1.

The cycles are all disjoint, that is to say each element appears in exactly one cycle. Because cycles don't "interfere" with each other, we can apply a permutation by applying each cycle, and we can do the cycles in any order. So, for each index i we need to:

check whether we've already done i. If so, move on to the next index.
set current = i
swap index[current] with index[permutation[current]]. So index[current] is set to its correct value (the next element in the cycle), and its old value is "pushed" forward along the cycle.
mark current as "done"
if permutuation[current] is i, we've finished the cycle. So the first value of the cycle ends up in the spot formerly occupied by the last element of the cycle, which is right. Move on to the next index.
set current = permutation[current] and go back to the swap step.

Depending on the types involved, you can optimize around the swaps - it may be better to copy/move to a temporary variable and the start of each cycle, then do a copy/move instead of a swap at each step of the cycle, and finally copy/move the temporary to the end of the cycle.

Reversing the process is the same, but using the "inverse" of the permutation. The inverse inv of a permutation perm, is the permutation such that inv[perm[i]] == i for each i. You can either compute the inverse and use the exact code above, or you can use code similar to the above, except move the elements in the opposite direction along each cycle.

An alternative to all that, since you implemented Fisher-Yates yourself -- as you're running Fisher-Yates, for each swap you perform record the two indices swapped in a vector<pair<size_t,size_t>>. Then you don't have to worry about cycles. You can apply the permutation to the vector by applying the same sequence of swaps. You can reverse the permutation by applying the reversed sequence of swaps.

answered Nov 15 '22 05:11

5 revs

Note that, depending on your application, if it is important that you have a truly uniformly distributed permutation, you cannot use any algorithm that calls a typical pseudo-random number generator more that once.

The reason is that most pseudo-random number generators, such as the one in clib, are Linear congruential. Those have a weakness where they'll generate numbers that cluster in planes - so your permutations will not be perfectly uniformly distributed. Using a higher-quality generator should get around that.

See http://en.wikipedia.org/wiki/Linear_congruential_generator

Alternatively, you could just generate a single random number in the range 0..(n!-1) and pass it to the unrank function for permutations. For small enough n, you can store those and get a constant time algorithm, but if n is too large for that, the best unrank function is O(n). Applying the resulting permutation is going to be O(n) anyway.

answered Nov 15 '22 04:11

DavidJ

Related questions
                            
                                Size of the classes in case of virtual inheritance
                            
                                Different Unallocated Memory Behaviour Between Visual Studio Versions
                            
                                std::ostringstream operator overload search order?
                            
                                error C2676 when compiling
                            
                                How to parse entries followed by semicolon or newline (boost::spirit)?
                            
                                Arrays in J programming Language
                            
                                libGL heap usage
                            
                                year duration algorithm in c++
                            
                                Clang 3.1 and user defined literals
                            
                                How to compute CDF probability of normal distribution in C++?
                            
                                Policy Based Design with Variadic Templates
                            
                                Calling a C++ member function pointer: this-pointer gets corrupted
                            
                                Does OpenMP copy private objects?
                            
                                To identify the background process & bring to foreground in C++ WIN32
                            
                                manually printing a N-byte integer
                            
                                Do I need to disconnect an xcb_connection_t that I got from XGetXCBConnection?
                            
                                Why MS doesn't have in its win32api a function that returns the font file name, given the font handle?
                            
                                Optical flow using opencv
                            
                                Is the "this" pointer RTTI-enabled?
                            
                                Circular references between two classes

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With