
How to get random and unique values from a vector? [duplicate]

Tags:

c++

Possible Duplicate:
Unique random numbers in O(1)?
Unique random numbers in an integer array in the C programming language

I have a std::vector of unique elements of some undetermined size. I want to fetch 20 unique, random elements from this vector. By 'unique' I mean that I do not want to fetch the same index more than once. Currently I do this by calling std::random_shuffle, but that requires shuffling the entire vector (which may contain over 1000 elements). I don't mind mutating the vector (though I would prefer not to, since leaving it untouched means I won't need thread locks). Most important is that this is efficient: I shouldn't be shuffling more than I need to.

Note that I've looked into passing in a partial range to std::random_shuffle but it will only ever shuffle that subset of elements, which would mean that the elements outside of that range never get used!

Help is appreciated. Thank you!

Note: I'm using Visual Studio 2005, so I do not have access to C++11 features and libraries.

void.pointer asked Dec 07 '12 23:12


3 Answers

You can use the Fisher–Yates shuffle and stop after as many steps as you need elements: http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle

The Fisher–Yates shuffle (named after Ronald Fisher and Frank Yates), also known as the Knuth shuffle (after Donald Knuth), is an algorithm for generating a random permutation of a finite set—in plain terms, for randomly shuffling the set. A variant of the Fisher–Yates shuffle, known as Sattolo's algorithm, may be used to generate random cycles of length n instead. Properly implemented, the Fisher–Yates shuffle is unbiased, so that every permutation is equally likely. The modern version of the algorithm is also rather efficient, requiring only time proportional to the number of items being shuffled and no additional storage space. The basic process of Fisher–Yates shuffling is similar to randomly picking numbered tickets out of a hat, or cards from a deck, one after another until there are no more left. What the specific algorithm provides is a way of doing this numerically in an efficient and rigorous manner that, properly done, guarantees an unbiased result.

I think this pseudocode should work (there is a chance of an off-by-one mistake or something so double check it!):

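// T is the vector's element type; rand_between(lo, hi) is assumed to return a uniform integer in [lo, hi].
std::list<T> chosen; // optional: the chosen elements also end up at the back of the vector
for (std::size_t i = 0; i < num && i < vec.size(); ++i) {
  std::size_t last = vec.size() - i - 1;      // last slot of the not-yet-chosen part
  std::size_t index = rand_between(0, last);  // draw only from the not-yet-chosen part
  chosen.push_back(vec[index]);
  std::swap(vec[index], vec[last]);           // park the pick at the back so it cannot be drawn again
}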
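
For reference, here is a minimal compilable sketch of the loop above, written to build with a pre-C++11 compiler. The names pick_from_back and rand_between are hypothetical, and rand_between is built on rand() % range, which carries a slight modulo bias; treat it as a placeholder rather than a recommended generator.

```cpp
#include <algorithm>  // std::swap (C++03), std::sort
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper: integer in [lo, hi], built on rand().
// rand() % range is slightly biased; good enough for a sketch.
inline std::size_t rand_between(std::size_t lo, std::size_t hi)
{
    assert(lo <= hi);
    return lo + static_cast<std::size_t>(std::rand()) % (hi - lo + 1);
}

// Moves `num` randomly chosen elements to the back of `vec` and returns
// copies of them. Only `num` swaps happen; the rest stays unshuffled.
template <typename T>
std::vector<T> pick_from_back(std::vector<T>& vec, std::size_t num)
{
    assert(num <= vec.size());
    std::vector<T> chosen;
    chosen.reserve(num);
    for (std::size_t i = 0; i < num; ++i) {
        const std::size_t last = vec.size() - i - 1;      // end of the unchosen part
        const std::size_t index = rand_between(0, last);  // draw from the unchosen part
        chosen.push_back(vec[index]);
        std::swap(vec[index], vec[last]);                 // exclude the pick from later draws
    }
    return chosen;
}
```

Calling pick_from_back(vec, 20) performs 20 swaps regardless of how large vec is. The vector is mutated, so concurrent readers would still need a lock.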

Pubby answered Oct 10 '22 04:10


You want a random sample of size m from an n-vector:

Let rand(a) return a uniformly distributed integer in 0..a-1, and assume m <= n:

for (int i = 0; i < m; i++)
    swap(X[i],X[i+rand(n-i)]);

X[0..m-1] is now a random sample.
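
Since the question prefers not to mutate the vector, one possible variation is to run the same loop over indices and record only the swapped positions in a std::map, leaving X itself untouched. This is a sketch of that idea, not part of the answer above; sample_indices, value_at and rand_upto are hypothetical names, and rand_upto stands in for the answer's rand(a) using the slightly biased rand() % a.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

typedef std::map<std::size_t, std::size_t> IndexMap;

// Hypothetical stand-in for rand(a): an integer in [0, a).
inline std::size_t rand_upto(std::size_t a)
{
    assert(a > 0);
    return static_cast<std::size_t>(std::rand()) % a;
}

// Index currently "stored" at position pos; positions never swapped hold themselves.
inline std::size_t value_at(const IndexMap& moved, std::size_t pos)
{
    IndexMap::const_iterator it = moved.find(pos);
    return it == moved.end() ? pos : it->second;
}

// Returns m distinct indices from [0, n) by simulating the first m swaps
// of the loop above on a virtual array 0, 1, ..., n-1.
inline std::vector<std::size_t> sample_indices(std::size_t n, std::size_t m)
{
    assert(m <= n);
    IndexMap moved;  // only positions touched by a swap are stored
    std::vector<std::size_t> result;
    result.reserve(m);
    for (std::size_t i = 0; i < m; ++i) {
        const std::size_t j = i + rand_upto(n - i);
        const std::size_t value_i = value_at(moved, i);
        const std::size_t value_j = value_at(moved, j);
        result.push_back(value_j);  // the swap would place value_j at position i
        moved[j] = value_i;         // and value_i at position j
    }
    return result;
}
```

The sampled elements are then X[idx[0]], X[idx[1]], and so on, where idx is the returned vector. The map holds at most m entries, so the cost is O(m log m) time and O(m) extra space, independent of n.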

Andrew Tomazos answered Oct 10 '22 02:10


Use a loop to put random index numbers into a std::set and stop when the size() reaches 20.

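// MyVector is assumed to be the type of my_vector, e.g. std::vector<Foo>.
std::set<std::size_t> indexes;
std::vector<MyVector::value_type> choices;
const std::size_t max_index = my_vector.size();
// Never ask for more elements than the vector holds, or the loop would not end.
const std::size_t wanted = std::min<std::size_t>(20, max_index);
while (indexes.size() < wanted)
{
    std::size_t random_index = rand() % max_index;
    // insert() reports whether the index was new, so no separate find() is needed.
    if (indexes.insert(random_index).second)
    {
        choices.push_back(my_vector[random_index]);
    }
}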

The random number generation is the first thing that popped into my head, feel free to use something better.
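
As one example of "something better" that still works without C++11, rand() % n can be replaced by rejection sampling, which removes the modulo bias. This is a sketch under the assumption that rand() itself is uniform over 0..RAND_MAX; rand_below is a hypothetical name.

```cpp
#include <cassert>
#include <cstdlib>

// Returns an integer in [0, n) without modulo bias, assuming rand() is
// uniform on [0, RAND_MAX]. Requires 0 < n <= RAND_MAX.
inline int rand_below(int n)
{
    assert(n > 0 && n <= RAND_MAX);
    // Largest multiple of n not exceeding RAND_MAX. Accepting only values
    // below it gives every remainder the same number of chances.
    const int limit = RAND_MAX - (RAND_MAX % n);
    int r;
    do {
        r = std::rand();
    } while (r >= limit);
    return r % n;
}
```

In the loop above this would replace rand() % max_index with rand_below(static_cast<int>(max_index)). Note that RAND_MAX is only 32767 in Visual C++, so a helper built on a single rand() call cannot reach indices beyond that; larger vectors would need several calls combined or a different generator.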

Mark Ransom answered Oct 10 '22 03:10