In MATLAB, I am using the shake.m function (http://www.mathworks.com/matlabcentral/fileexchange/10067-shake) to randomly shuffle each column. For example:
a = [1 2 3; 4 5 6; 7 8 9]
a =
1 2 3
4 5 6
7 8 9
b = shake(a)
b =
7 8 6
1 5 9
4 2 3
This function does exactly what I want; however, my columns are very long (>10,000,000), so this takes a long time to run. Does anyone know of a faster way of achieving this? I have tried shaking each column vector separately, but this isn't faster. Thanks!
randperm(n) returns a row vector containing a random permutation of the integers from 1 to n, without repetition. randperm(n,k) returns a row vector containing k unique integers selected at random from 1 to n.
Use the rand, randn, and randi functions to create sequences of pseudorandom numbers, and the randperm function to create a vector of randomly permuted integers. Use the rng function to control the repeatability of your results.
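As a small illustration of the repeatability point, seeding the generator with rng before calling randperm makes the permutation reproducible. This is only a sketch; the seed value 42 and the sizes are arbitrary choices, not anything from the question.

```matlab
rng(42);               % seed chosen arbitrarily for this sketch
p1 = randperm(5);      % random permutation of 1..5
rng(42);               % reset the generator to the same seed
p2 = randperm(5);      % reproduces p1
k  = randperm(10, 3);  % 3 unique integers drawn from 1..10
disp(isequal(p1, p2))  % displays 1 (true)
```

Without the second rng call, p2 would in general be a different permutation.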
Shuffle - Random permutation of array elements. This function is equivalent to X(RANDPERM(LENGTH(X))), but 50% to 85% faster.
You can use randperm like this, but I don't know if it will be any faster than shake:
[m,n] = size(a);
for c = 1:n
a(randperm(m),c) = a(:,c);
end
Or you can try switching the randperm around to see which is faster (statistically, it should produce the same result):
[m,n] = size(a);
for c = 1:n
a(:,c) = a(randperm(m),c);
end
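To compare the two variants on your own machine, a rough timing and sanity-check sketch might look like the following. The test matrix and its size are made up for illustration; the check relies on the fact that sorting each column should recover the same values if only the within-column order changed.

```matlab
a = rand(1e5, 3);                  % example data; sizes are arbitrary
[m, n] = size(a);
b = a;                             % keep the original for comparison

tic
for c = 1:n
    b(:, c) = a(randperm(m), c);   % shuffle column c independently
end
t = toc;

% Each column of b should hold the same values as the matching column of a.
ok = isequal(sort(a, 1), sort(b, 1));
fprintf('loop took %.3f s, columns preserved: %d\n', t, ok);
```

Swapping the loop body for the other variant and rerunning gives a like-for-like comparison.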
Otherwise how many rows do you have? If you have far fewer rows than columns, it's possible that we can assume each permutation will be repeated, so what about something like this:
[m,n] = size(a);
cols = randperm(n);
k = 5; %// This is a parameter you'll need to tweak...
set_size = max(1, floor(n/k));
for start = 1:set_size:n
    %// Clamp the end so the last set stays in range when k does not divide n
    set_cols = cols(start:min(start+set_size-1, n));
    a(:,set_cols) = a(randperm(m), set_cols);
end
which would massively reduce the number of calls to randperm. Breaking it up into k equal-sized sets might not be optimal, though; you might want to add some randomness to that as well. The basic idea is that there are only factorial(m) different orderings, and if m is much smaller than n (e.g. m=5, n=100000 like your data), these orderings will be repeated naturally. So instead of letting that occur by itself, manage the process and reduce the calls to randperm, which would be producing the same result anyway.
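To make the counting argument concrete: with m = 5 there are only factorial(5) = 120 distinct row orderings, so with n = 100000 columns many columns necessarily end up with the same ordering. The sketch below also shows the trade-off of the grouped approach: columns shuffled in the same call share one permutation, so they are not shuffled independently of each other. The data are made up for illustration.

```matlab
m = 5;
disp(factorial(m))                 % 120 possible orderings of 5 rows

a = repmat((1:m)', 1, 4);          % four identical columns, for illustration
p = randperm(m);                   % one permutation for the whole group
b = a(p, :);                       % every column receives the same ordering

% All columns of b are still identical, i.e. they moved together.
disp(isequal(b(:, 1), b(:, 2), b(:, 3), b(:, 4)))   % displays 1 (true)
```

Whether that shared ordering is acceptable depends on what the shuffled data are used for.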
Here's a simple vectorized approach. Note that it creates an auxiliary matrix (ind) the same size as a, so depending on your memory it may be usable or not.
[~, ind] = sort(rand(size(a))); %// create a random sorting for each column
b = a(bsxfun(@plus, ind, 0:size(a,1):numel(a)-1)); %// convert to linear index
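To see what the linear-index conversion does, here is a small worked sketch that uses a fixed ind matrix in place of the random one. The second form relies on implicit expansion, which I am assuming is available (R2016b or later); on older releases the bsxfun form is the one to use.

```matlab
a   = [1 2 3; 4 5 6; 7 8 9];
ind = [3 3 2; 1 2 3; 2 1 1];          % fixed per-column row orders, for illustration
[m, n] = size(a);

offsets = 0:m:numel(a)-1;             % [0 3 6]: linear offset of each column
lin = bsxfun(@plus, ind, offsets);    % per-column row index -> linear index
b   = a(lin);                         % b = [7 8 6; 1 5 9; 4 2 3]

b2 = a(ind + (0:n-1)*m);              % same result via implicit expansion (R2016b+)
disp(b)
```

Each column of ind picks rows within its own column only, because the offset moves the row index to the start of that column in linear indexing.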
Obtain shuffled indices using randperm:
idx = randperm(size(a,1));
Use a fresh set of such indices to shuffle each column:
[m,n] = size(a);
b = a; %// preallocate the output
for i = 1:n
b(:,i) = a(randperm(m), i);
end
Look at this answer: Matlab: How to random shuffle columns of matrix
Here's a no-loop approach as it processes all indices at once and I believe this is as random as one could get given the requirements of shuffling among each column only.
Code
%// Get sizes
[m,n] = size(a);
%// Create an array of randomly placed sequential indices from 1 to numel(a)
rand_idx = randperm(m*n);
%// segregate those indices into rows and cols for the size of input data, a
col = ceil(rand_idx/m);
row = rem(rand_idx,m);
row(row==0)=m;
%// Sort both these row and col indices based on col, such that we have col
%// as 1,1,1,1 ...2,2,2,....3,3,3,3 and so on, which would represent per col
%// indices for the input data. Use these indices to linearly index into a
[scol,ind1] = sort(col);
a(1:m*n) = a((scol-1)*m + row(ind1));
Final output is obtained in a itself.
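To follow the index bookkeeping, here is a small worked sketch with a fixed stand-in for randperm(m*n). The values of rand_idx and a are made up for illustration, and the intermediate results in the comments assume that sort keeps equal elements in their original order.

```matlab
m = 3; n = 2;
a = [10 40; 20 50; 30 60];
rand_idx = [5 1 6 3 4 2];        % fixed stand-in for randperm(m*n)

col = ceil(rand_idx/m);          % [2 1 2 1 2 1]
row = rem(rand_idx, m);          % [2 1 0 0 1 2]
row(row == 0) = m;               % [2 1 3 3 1 2]

[scol, ind1] = sort(col);        % scol = [1 1 1 2 2 2], ind1 = [2 4 6 1 3 5]
src = (scol-1)*m + row(ind1);    % [1 3 2 5 6 4]
a(1:m*n) = a(src);               % a = [10 50; 30 60; 20 40]
```

Column 1 receives its own rows in the order 1, 3, 2 and column 2 in the order 2, 3, 1, so values never cross between columns.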