Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to randomly shuffle values in a map?

I have a std::map with both key and value as integers. Now I want to randomly shuffle the map, so keys point to a different value at random. I tried random_shuffle but it doesn't compile. Note that I am not trying to shuffle the keys, which makes no sense for a map. I'm trying to randomise the values.

I could push the values into a vector, shuffle that and then copy back. Is there a better way?

like image 961
Neil Kirk Avatar asked Apr 17 '13 19:04

Neil Kirk


People also ask

What is the use of Shuffle function in Python?

shuffle () is an inbuilt method of the random module. It is used to shuffle a sequence (list). Shuffling means changing the position of the elements of the sequence. Here, the shuffling operation is inplace. function : optional and by default is random (). It should return a value between 0 and 1.

What is the range of the value it returns when Shuffle ()?

It should return a value between 0 and 1. The shuffle () method cannot be used to shuffle immutable datatypes like strings.

How to shuffle an array in JavaScript?

In contrary to languages like Ruby and PHP, JavaScript does not have an built-in method of shuffling the array. However, there is a method that allows shuffling the sequence. Let’s discuss a particular case. Shuffling an array of values is considered one of the oldest problems in computer science.

What is the Fisher-Yates shuffle algorithm?

Let’s discuss a particular case. Shuffling an array of values is considered one of the oldest problems in computer science. Shuffling is possible with the Fisher-Yates shuffle algorithm for generating a random permutation of a finite sequence. That is to say, the algorithm shuffles the sequence.


2 Answers

You can push all the keys in a vector, shuffle the vector and use it to swap the values in the map.

Here is an example:

#include <iostream>
#include <string>
#include <vector>
#include <map>
#include <algorithm>
#include <random>
#include <ctime>

using namespace std;
int myrandom (int i) { return std::rand()%i;}
int main ()
{
    srand(time(0));
    map<int,string> m;
    vector<int> v;
    for(int i=0; i<10; i++)
        m.insert(pair<int,string>(i,("v"+to_string(i))));

    for(auto i: m)
    {
        cout << i.first << ":" << i.second << endl;
        v.push_back(i.first);
    }
    random_shuffle(v.begin(), v.end(),myrandom);
    vector<int>::iterator it=v.begin();
    cout << endl;
    for(auto& i:m)
    {
        string ts=i.second;
        i.second=m[*it];
        m[*it]=ts;
        it++;
    }
    for(auto i: m)
    {
        cout << i.first << ":" << i.second << endl;
    }
    return 0;
}
like image 93
Johnny Mnemonic Avatar answered Oct 24 '22 03:10

Johnny Mnemonic


The complexity of your proposal is O(N), (both the copies and the shuffle have linear complexity) which seems optimal (looking at less elements would introduce non-randomness into your shuffle).

If you want to repeatedly shuffle your data, you could maintain a map of type <Key, size_t> (i.e. the proverbial level of indirection) that indexes into a std::vector<Value> and then just shuffle that vector repeatedly. That saves you all the copying in exchange for O(N) space overhead. If the Value type itself is expensive, you have an extra vector<size_t> of indices into the real data on which you do the shuffling.

For convenience sake, you could encapsulate the map and vector inside one class that exposes a shuffle() member function. Such a wrapper would also need to expose the basic lookup / insertion / erase functionality of the underyling map.

EDIT: As pointed out by @tmyklebu in the comments, maintaining (raw or smart) pointers to secondary data can be subject to iterator invalidation (e.g. when inserting new elements at the end that causes the vector's capacity to be resized). Using indices instead of pointers solves the "insertion at the end" problem. But when writing the wrapper class you need to make sure that insertions of new key-value pairs never cause "insertions in the middle" for your secondary data because that would also invalidate the indices. A more robust library solution would be to use Boost.MultiIndex, which is specifically designed to allow multiple types of view over a data structure.

like image 37
TemplateRex Avatar answered Oct 24 '22 03:10

TemplateRex