
Is there a better/faster way of randomly shuffling a matrix in MATLAB?

In MATLAB, I am using the shake.m function (http://www.mathworks.com/matlabcentral/fileexchange/10067-shake) to randomly shuffle each column. For example:

a = [1 2 3; 4 5 6; 7 8 9]
a =

     1     2     3
     4     5     6
     7     8     9

b = shake(a)
b =

     7     8     6
     1     5     9
     4     2     3

This function does exactly what I want; however, my columns are very long (>10,000,000), so it takes a long time to run. Does anyone know of a faster way of achieving this? I have tried shaking each column vector separately, but this isn't faster. Thanks!

user2861089 asked Aug 29 '14 06:08



4 Answers

You can use randperm like this, but I don't know if it will be any faster than shake:

[m,n] = size(a);
for c = 1:n
    a(randperm(m),c) = a(:,c);
end

Or you can try switching the randperm to the other side of the assignment to see which is faster (both give an equally random shuffle of each column):

[m,n] = size(a);
for c = 1:n
    a(:,c) = a(randperm(m),c);
end
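
As a quick sanity check (a sketch, not part of the original answer), the loop can be run on the small matrix from the question to confirm that each column keeps exactly its own elements. The names b and same_contents are assumptions introduced here so that a stays unmodified for comparison.

```matlab
a = [1 2 3; 4 5 6; 7 8 9];
[m, n] = size(a);
b = a;                           % work on a copy so a can be compared afterwards
for c = 1:n
    b(:, c) = a(randperm(m), c); % independent permutation for each column
end
% Sorting each column undoes the shuffle, so the sorted columns must match
same_contents = isequal(sort(b, 1), sort(a, 1));
```

If same_contents is true, the shuffle only reordered values within their own columns.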

Otherwise, how many rows do you have? If you have far fewer rows than columns, we can assume that each permutation will be repeated across columns, so what about something like this:

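[m,n] = size(a);
cols = randperm(n);
k = 5;  %// This is a parameter you'll need to tweak...
set_size = max(1, floor(n/k));  %// at least one column per set
for first = 1:set_size:n
    last = min(first+set_size-1, n);  %// do not run past the last column
    set_cols = cols(first:last);
    a(:,set_cols) = a(randperm(m), set_cols);  %// one permutation shared by the whole set
end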

This would massively reduce the number of calls to randperm. Breaking the columns up into k equal-sized sets might not be optimal, though; you might want to add some randomness to that as well. Note that all columns in a set receive the same row permutation, so they are not shuffled independently of each other. The basic idea is that there are only factorial(m) different orderings, and if m is much smaller than n (e.g. m=5, n=100000 like your data), these orderings will be repeated naturally. So instead of letting that occur by itself, manage the process and reduce the calls to randperm, which would be producing the same result anyway.
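
The counting argument can be made concrete with the example sizes mentioned above. This is only an illustrative calculation; n_orderings and avg_repeats are names introduced here, not part of the original answer.

```matlab
m = 5;                         % rows, as in the example above
n = 100000;                    % columns, as in the example above
n_orderings = factorial(m);    % number of distinct row permutations: 120
avg_repeats = n / n_orderings; % average number of columns per ordering, about 833
```

With only 120 possible orderings spread over 100000 columns, each ordering is expected to occur many hundreds of times.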

Dan answered Oct 10 '22 06:10


Here's a simple vectorized approach. Note that it creates an auxiliary matrix (ind) the same size as a, so whether it is usable depends on how much memory you have.

[~, ind] = sort(rand(size(a))); %// create a random sorting for each column
b = a(bsxfun(@plus, ind, 0:size(a,1):numel(a)-1)); %// convert to linear index
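
In MATLAB R2016b and later, implicit expansion makes bsxfun unnecessary for this step. The following is a sketch of the same indexing under that assumption, run on the small matrix from the question; col_offset is a name introduced only for illustration.

```matlab
a = [1 2 3; 4 5 6; 7 8 9];
[m, n] = size(a);
[~, ind] = sort(rand(m, n));   % random row order within each column
col_offset = (0:n-1) * m;      % linear-index offset of each column: 0, m, 2m, ...
b = a(ind + col_offset);       % implicit expansion adds the offset to every row
```

Each entry of ind is a row number within its column, and adding the column offset turns it into a linear index into a.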
Luis Mendo answered Oct 10 '22 06:10


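Obtain shuffled row indices using randperm:

idx = randperm(size(a,1));

Use a fresh set of such indices to shuffle each column:

[m,n] = size(a);
b = a;  %// preallocate with the same size and class as a
for i = 1:n
    b(:,i) = a(randperm(m),i);  %// new permutation for every column
end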

Look at this answer: Matlab: How to random shuffle columns of matrix

lakshmen answered Oct 10 '22 05:10


Here's a no-loop approach that processes all indices at once. I believe it is as random as one can get, given the requirement of shuffling within each column only.

Code

%// Get sizes
[m,n] = size(a);

%// Create an array of randomly placed sequential indices from 1 to numel(a)
rand_idx = randperm(m*n);

%// segregate those indices into rows and cols for the size of input data, a
col = ceil(rand_idx/m);
row = rem(rand_idx,m);
row(row==0)=m;

%// Sort both these row and col indices based on col, such that we have col
%// as 1,1,1,1 ...2,2,2,....3,3,3,3 and so on, which would represent per col
%// indices for the input data. Use these indices to linearly index into a
[scol,ind1] = sort(col);
a(1:m*n) = a((scol-1)*m + row(ind1));

The final output is written back into a itself.
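
The ceil/rem arithmetic above converts linear indices to row and column subscripts by hand. As a sketch of an equivalent formulation (not the original author's code), the built-in ind2sub and sub2ind can do the same conversion; here the result goes into a new variable b, an assumption made so that the input stays available for checking.

```matlab
a = [1 2 3; 4 5 6; 7 8 9];
[m, n] = size(a);
rand_idx = randperm(m*n);               % random order of all linear indices
[row, col] = ind2sub([m n], rand_idx);  % replaces the ceil/rem arithmetic
[scol, ind1] = sort(col);               % group the indices by column
b = reshape(a(sub2ind([m n], row(ind1), scol)), m, n);
```

After sorting by column, the first m indices all point into column 1, the next m into column 2, and so on, so reshaping to m-by-n puts each shuffled column back in place.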

Divakar answered Oct 10 '22 06:10