Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient method for randomly selecting all elements of a std::vector exactly once WITHOUT reshuffling

I am looking for an efficient method for selecting access to each element of a std::vector<T> in a random order, without reshuffling or copying them i.e no use of std::random_shuffle and ensure that each element is selected only once.

I don't want to copy or reshuffle as a) each instance of T is likely to be a very large object and b) for other operations I will be doing on the elements of the vector, it is easier for them to remain in the same order.

Furthermore, I don't really want to go down the street of continuously picking and rejecting duplicates. It is likely I will have lots of these large objects stored in the vector and efficiency is key as I will be looking to call this random selection method many times a second.

like image 302
oracle3001 Avatar asked Dec 19 '11 19:12

oracle3001


3 Answers

Create a vector the same size as the existing one that uses pointers to the elements in the vector. Randomly shuffle the pointer vector instead and read from there - it's low cost.

like image 174
John Humphreys Avatar answered Oct 05 '22 15:10

John Humphreys


You did not tell us whether you want to iterate over the whole array randomly, or if you only need some elements at random.

I assume the first case. You'll need extra storage for bookkeeping, and you'll need linear time for the shuffling anyway. So create a permutation, and keep its memory alive so that you can reshuffle it as you wish. With C++11:

#include <algorithm>
#include <random>
#include <numeric>

struct permutation
{
    permutation(size_t n) 
        : perm(n), g(std::random_device())
    {
        std::iota(perm.begin(), perm.end(), size_t(0));
    }

    void shuffle() { std::shuffle(perm.begin(), perm.end(), g); }

    size_t operator[](size_t n) const { return perm[n]; }

private:
    std::vector<size_t> perm;
    std::mt19937 g;
};

Usage:

std::vector<huge_t> v;
...

permutation sigma(v.size());
sigma.shuffle();

const huge_t& x = v[sigma[0]];

...
sigma.shuffle(); // No extra allocation
const huge_t& y = v[sigma[0]];

You can adapt the code to use C++03 std::random_shuffle, but please note that there are very few guarantees on the random number generator.

like image 34
Alexandre C. Avatar answered Oct 05 '22 16:10

Alexandre C.


I think the easiest (and one of the more efficient ones) solution would be to either create a std::vector<size_t> holding indices into your vector<T>, or a std::vector<T*> holding the pointers into your vector. Then you can shuffle that one using std::random_shuffle, iterate over it and pick the corresponding elements from your original vector. That way you don't change the order of your original vector and shuffleing pointers or size_t is pretty cheap

like image 26
Grizzly Avatar answered Oct 05 '22 16:10

Grizzly