Possible Duplicate:
Unique random numbers in O(1)?
Unique random numbers in an integer array in the C programming language
I have a std::vector of unique elements of some undetermined size. I want to fetch 20 unique and random elements from this vector. By 'unique' I mean that I do not want to fetch the same index more than once. Currently the way I do this is to call std::random_shuffle. But this requires me to shuffle the entire vector (which may contain over 1000 elements). I don't mind mutating the vector (though I'd prefer not to, since then I wouldn't need thread locks), but most important is that this be efficient: I shouldn't be shuffling more than I need to.
Note that I've looked into passing a partial range to std::random_shuffle, but it will only ever shuffle that subset of elements, which would mean that the elements outside that range never get used!
Help is appreciated. Thank you!
Note: I'm using Visual Studio 2005, so I do not have access to C++11 features and libraries.
You can use the Fisher–Yates shuffle: http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
The Fisher–Yates shuffle (named after Ronald Fisher and Frank Yates), also known as the Knuth shuffle (after Donald Knuth), is an algorithm for generating a random permutation of a finite set—in plain terms, for randomly shuffling the set. A variant of the Fisher–Yates shuffle, known as Sattolo's algorithm, may be used to generate random cycles of length n instead. Properly implemented, the Fisher–Yates shuffle is unbiased, so that every permutation is equally likely. The modern version of the algorithm is also rather efficient, requiring only time proportional to the number of items being shuffled and no additional storage space. The basic process of Fisher–Yates shuffling is similar to randomly picking numbered tickets out of a hat, or cards from a deck, one after another until there are no more left. What the specific algorithm provides is a way of doing this numerically in an efficient and rigorous manner that, properly done, guarantees an unbiased result.
I think this pseudocode should work (there is a chance of an off-by-one mistake or something, so double-check it!):
// rand_between(a, b) is assumed to return a uniform integer in [a, b], both ends inclusive.
// T is the element type of vec; num must not exceed vec.size().
std::list<T> chosen; // you don't have to use this since the chosen ones will be in the back of the vector
for (int i = 0; i < num; ++i) {
    int index = rand_between(0, vec.size() - i - 1);
    chosen.push_back(vec[index]);
    swap(vec[index], vec[vec.size() - i - 1]);
}
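For reference, the pseudocode above could be turned into compilable C++03 roughly as follows. This is only a sketch under a few assumptions: rand_between and pick_from_back are hypothetical names (neither is a standard function), the helper is built on std::rand(), and the range passed to it must fit within RAND_MAX (32767 on Visual Studio, so vectors larger than 32768 elements would need a different generator).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <algorithm>
#include <vector>

// Hypothetical helper (not a standard function): integer in [lo, hi], both inclusive.
// Values of rand() past the last full multiple of the span are rejected,
// which avoids the bias of a plain rand() % span.
inline std::size_t rand_between(std::size_t lo, std::size_t hi)
{
    assert(lo <= hi);
    const unsigned long span  = static_cast<unsigned long>(hi - lo) + 1UL;
    const unsigned long range = static_cast<unsigned long>(RAND_MAX) + 1UL;
    assert(span <= range);                       // assumption: span fits in rand()'s range
    const unsigned long limit = range - range % span;
    unsigned long r;
    do {
        r = static_cast<unsigned long>(std::rand());
    } while (r >= limit);
    return lo + static_cast<std::size_t>(r % span);
}

// Picks num distinct elements of vec. As a side effect the picked
// elements end up in the last num slots of vec (vec is mutated).
template <typename T>
std::vector<T> pick_from_back(std::vector<T>& vec, std::size_t num)
{
    assert(num <= vec.size());
    std::vector<T> chosen;
    chosen.reserve(num);
    for (std::size_t i = 0; i < num; ++i) {
        const std::size_t last  = vec.size() - i - 1;   // last slot not yet picked
        const std::size_t index = rand_between(0, last);
        chosen.push_back(vec[index]);
        std::swap(vec[index], vec[last]);               // retire the picked element
    }
    return chosen;
}
```

Calling pick_from_back(vec, 20) performs only 20 swaps regardless of how large vec is. Seed the generator once with std::srand() before the first call.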
You want a random sample of size m from an n-vector X:
Let rand(a) return a uniform integer in 0..a-1.
for (int i = 0; i < m; i++)
    swap(X[i], X[i + rand(n - i)]);
X[0..m-1] is now a random sample.
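Since the question would prefer not to mutate the vector, one possible variation is to run the same loop over a vector of indexes rather than over the data itself. The sketch below assumes that approach. rand_below and sample_indexes are hypothetical names, the helper relies on std::rand() and so requires n to fit within RAND_MAX + 1, and building the index vector costs O(n) time and memory, which may or may not be acceptable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <algorithm>
#include <vector>

// Hypothetical helper playing the role of rand(a) above: integer in [0, a-1].
// Rejection sampling avoids the bias of a plain rand() % a.
inline std::size_t rand_below(std::size_t a)
{
    const unsigned long range = static_cast<unsigned long>(RAND_MAX) + 1UL;
    assert(a > 0 && a <= range);                 // assumption: a fits in rand()'s range
    const unsigned long limit = range - range % a;
    unsigned long r;
    do {
        r = static_cast<unsigned long>(std::rand());
    } while (r >= limit);
    return static_cast<std::size_t>(r % a);
}

// Returns m distinct indexes into a container of size n.
// The container itself is never touched.
inline std::vector<std::size_t> sample_indexes(std::size_t n, std::size_t m)
{
    assert(m <= n);
    std::vector<std::size_t> idx(n);
    for (std::size_t i = 0; i < n; ++i)
        idx[i] = i;
    for (std::size_t i = 0; i < m; ++i) {
        // Swap a random not-yet-chosen index into slot i.
        std::swap(idx[i], idx[i + rand_below(n - i)]);
    }
    idx.resize(m);                               // keep only the sampled prefix
    return idx;
}
```

The result of sample_indexes(X.size(), 20) can then be used to read X[k] for each returned k, so X can stay const and shared between threads.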
Use a loop to put random index numbers into a std::set and stop when the size() reaches 20.
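// Needs <set>, <vector>, <algorithm> and <cstdlib>.
// MyVector stands for the type of my_vector (e.g. a typedef); substitute your own.
std::set<size_t> indexes;
std::vector<MyVector::value_type> choices;
const size_t max_index = my_vector.size();
const size_t wanted = std::min<size_t>(20, max_index); // never ask for more than exists
while (indexes.size() < wanted)
{
    size_t random_index = rand() % max_index;
    // insert() reports whether the index was new, so no separate find() is needed
    if (indexes.insert(random_index).second)
    {
        choices.push_back(my_vector[random_index]);
    }
}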
The random number generation (rand() % max_index) is just the first thing that popped into my head; feel free to use something better.