Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Reproducible results when creating random matrices across parallel calls in MATLAB

I want to create a number of random matrices, but they are really big to fit in memory, so I'd like to find a way to reproduce them across computers, so that when I need to send them to another machine, I'd just need to send the code. Here is how I want to do it:

num_of_iters = 10;
K = 200;
for iter = 1:num_of_iters
    parfor j = 1:K
        R = make_random_R(iter,j,.....);
        % Do something
    end
end

What I'm worried about is the parfor loop, I need to be able to reproduce the random matrices no matter what the order of indices in the parfor is. So I decided to use a MATLAB stream for this:

  • Save the global stream
  • Create a new stream, set the seed and appropriate substream (which depends on iter and j)
  • Do the math
  • Put back the global stream

Here is my code (the variables n,p,R_type control how the random matrices are made, but they are not important, and K is the same variable as the one from above, I need it in the line substream_id = (iter - 1) * K + j;) :

function [R] = make_random_R(iter,j,n,K,p,R_type)
% Data as code
% R_type: 'posneg' or 'normdist'
% 1 <= iter <= 100
% 1 <= j <= K
% K: Number of classifiers
% n: Number of observations

assert(strcmp(R_type,'posneg') || strcmp(R_type,'normdist'),'R_type must be posneg or normdist');
assert(iter >= 1,'Error: iter >= 1 not satisfied');
assert((1 <= j) && (j <= K),'Error: 1 <= j <= K not satisfied');
assert(K > 0,'Error: K > 0 not satisfied');

globalStream = RandStream.getGlobalStream;
globalState =  globalStream.State;

stream=RandStream('mlfg6331_64','Seed',1);
substream_id = (iter - 1) * K + j;
stream.Substream = substream_id;
RandStream.setGlobalStream(stream);

switch R_type
    case 'posneg'        
        q0=ceil(2*log(n)/0.25^2)+1;
        if (q0 < p)
            q = q0;
        else
            q = ceil(p/2);
        end
        R = randi([0 1],p,q);
        R(R == 0) = -1;
    case 'normdist'        
        q = 2*ceil(log2(p));
        R = normrnd(0,1,[p,q]);    
end
RandStream.setGlobalStream(globalStream);
globalStream.State = globalState;
end

Tried some code and here it is:

>> iter = 2;
>> j = 3;
>> n=100;
>> K=10;
>> p=6;
>> R_type = 'normdist';
>> for j=1:K
j
make_ran
>> parfor j=1:K
j
make_random_R(iter,j,n,K,p,R_type)
end
Starting parallel pool (parpool) using the 'local' profile ... connected to 4 workers.

ans =

     7


ans =

   -0.3660    0.8816    1.1754   -0.4987   -1.8612   -0.3683
    0.9504   -0.3067   -0.5156   -0.2383   -1.1661    0.3622
    2.0743   -0.4195    0.5021    0.3954    0.2415   -0.4552
   -0.0474   -0.1645   -0.1725   -0.4938   -0.2559    0.2188
    1.0735    0.3660    0.1043    0.4403   -0.3166    1.1241
   -1.0421   -1.4528   -0.4976   -0.7166   -1.1328   -2.0260


ans =

     2


ans =

   -1.6629    0.0213   -1.8138   -0.4375    0.3575   -0.0353
    0.6653   -1.2662   -0.3977   -0.6540   -1.2131    0.4858
    0.3421    1.1266   -0.6066   -1.2095    1.5496   -0.9341
    0.2145    0.7192   -2.2087    0.7597   -0.0110   -1.1282
   -0.3511   -0.7305   -0.1143    0.0242    0.2431   -0.8612
    0.5875    1.2665   -2.1943   -0.4879    0.0120   -1.1539


ans =

     1


ans =

   -0.5300    2.4077   -0.3478    1.8695   -1.1327   -1.0734
   -0.2540   -1.1265    0.3152    0.4265    1.2777    0.0959
    0.5005   -0.7557    0.6194    1.5873    0.0961   -1.9216
    0.7275    0.5420   -0.6237   -0.2228    0.8915    0.4644
    0.8131   -0.1492    0.9232    0.8410   -0.0637    2.1163
   -1.1995    0.2338   -1.3726    0.1604   -0.1855    1.3826


ans =

     8


ans =

   -0.5146    2.2106    2.7200   -1.2136    1.0004    1.3089
    0.7225    0.2746   -0.8798    0.2978   -0.8490    1.6744
    1.1998   -0.0363    1.9105   -0.7747   -0.8707   -0.6823
    0.6801    1.3194   -0.0685    0.5944    1.5078   -1.6821
    0.0876    1.2150   -0.0747    0.0324   -1.1552    0.0966
   -0.0624   -0.3874   -0.5356    0.6353    1.4090   -1.1014


ans =

     6


ans =

    0.5866   -1.0222   -0.2168    0.8582    1.4360    0.0699
    2.0677   -0.4740   -0.8763    1.7827    0.1930   -1.2167
   -0.3941   -0.5441    0.3719   -0.0609    0.7138   -1.0920
    0.3622   -0.0459   -0.0221    0.2030   -0.7695   -0.8963
   -0.1986   -0.2560    0.6666    0.4831   -1.2028   -0.9423
    0.1656    1.2006   -1.1131    0.7704   -0.6906   -1.3143


ans =

     5


ans =

   -0.5782   -0.3634    1.5381   -1.3173   -0.9493    0.8480
    1.5921   -0.4069    0.7795   -0.3390   -0.1071    0.4201
   -0.0184    0.2865   -0.1139   -0.1171    0.2288    0.5511
    0.1787    0.7583    0.3994    1.0457    0.3291   -0.9150
    0.3641   -0.6420   -0.2096    0.7761    0.4022   -0.7478
    0.1165    0.7142    0.7029   -1.1195    0.0905    0.6810


ans =

     4


ans =

    0.1246   -0.3173    0.8068    0.6485   -0.8572    0.2275
    0.3674   -0.0507   -0.9196    0.6161   -0.5821   -0.4291
   -1.0142   -1.1614   -2.5438    1.5915    2.0356    0.4535
   -0.2111   -0.3974    0.0376    0.3825   -1.9702    1.5318
   -0.3890    0.9210   -0.0635    0.3248    1.8666   -0.0160
    1.3908   -0.7204   -0.6772   -0.0713    0.0569    0.5929


ans =

     3


ans =

   -0.1602    0.6891    0.4725    0.0277   -2.0510   -2.2440
   -0.7497    1.8225   -0.4433    0.4090    0.9021   -1.6683
    0.0659    0.3909    0.2043    0.9065    1.4630    0.3091
   -0.3886    0.6715   -0.9742   -0.5468    0.2890    0.5625
   -0.4558    0.4770   -0.1888   -0.6504    0.3281    1.3767
    0.3983    0.5834    0.9360    0.8604   -0.9776    0.6755


ans =

    10


ans =

   -0.4843   -0.4512    0.7544    0.7585   -0.4417   -0.0208
    1.8537   -1.6935   -2.7067   -0.5077    0.9616   -1.7904
   -1.6943   -1.0988    0.1208   -0.8100    1.8778    1.1654
    1.1759   -0.7087   -1.2673   -0.1381   -0.0710    0.5343
    0.2589   -0.5128   -0.3970    0.6737    0.8097    2.7024
   -0.8933    0.2810    0.8117   -0.5428   -0.8782    1.1746


ans =

     9


ans =

    0.0254   -0.7993    1.5164    1.2921   -1.1013    1.8556
   -0.6280    0.9374   -0.1962    0.1685   -0.5079    0.4333
   -0.3962   -0.9977    0.6971   -1.0310   -1.1997   -2.1391
    0.7179    1.0177   -0.8874   -0.6732    0.7295    1.4448
   -1.1793   -1.3210    1.5292    0.2280    1.9337    1.0901
   -0.0926    0.1798   -1.1740    0.3447    2.4578    0.4170

I wonder if the code is correct, and does it retain the state of the global stream after the function call? Please help me, thank you very much

like image 573
Dang Manh Truong Avatar asked Mar 04 '17 15:03

Dang Manh Truong


People also ask

How do I stop random numbers from changing in MATLAB?

One simple way to avoid repeating the same random numbers in a new MATLAB session is to choose a different seed for the random number generator. rng gives you an easy way to do that, by creating a seed based on the current time. Each time you use 'shuffle' , it reseeds the generator with a different seed.

What does rng do in MATLAB?

Description. rng( seed ) specifies the seed for the MATLAB® random number generator. For example, rng(1) initializes the Mersenne Twister generator using a seed of 1 . The rng function controls the global stream, which determines how the rand , randi , randn , and randperm functions produce a sequence of random numbers ...

How do I get the same random number in MATLAB?

Also, any script or function that calls the random number functions returns the same result whenever you restart. If you want to avoid repeating the same random number arrays when MATLAB restarts, then execute the command, rng('shuffle'); before calling rand , randn , randi , or randperm .

How do you generate a random square matrix in MATLAB?

rand (MATLAB Functions) The rand function generates arrays of random numbers whose elements are uniformly distributed in the interval ( 0 , 1 ). Y = rand(n) returns an n -by- n matrix of random entries.


1 Answers

MATLAB generate random number from a "KNOWN" but "complex" function, so you can generate same sequence of random! number .

to do so you should use "rng" function
set rng to default so it reset the random number generation .

rng('default');
rand(1,5)
ans=
0.8147  0.9058  0.1270  0.9134  0.6324

then if you want to reproduce this random number sequence you could write:

rng('default');
rand(1,5)
ans=
    0.8147  0.9058  0.1270  0.9134  0.6324

another way is to get/set seed for random number generation :

s=rng;
rand(1,5)
ans= 0.0975  0.2785  0.5469  0.9575  0.9649

and to reproduce the same seq.

rng(s);
rand(1,5)
ans=
     0.0975  0.2785  0.5469  0.9575  0.9649

noe that "rng" itself is a rich function and you could study it more in matlab documentation.

(I hope that I understand the question correctly)

to answer comment from @m7913d

"By default, the MATLAB® client and MATLAB workers use different random number generators, even if the workers are part of a local cluster on the same machine with the client. For the client, the default is the Mersenne Twister generator ('twister'), and for the workers the default is the Combined Multiple Recursive generator ('CombRecursive' or 'mrg32k3a'). If it is necessary to generate the same stream of numbers in the client and workers, you can set one to match the other."

clearly matlab workers using a different function (for generating random numbers) from matlab itself so you could set the default function in "rng" function too,

rng(5,'twister') % for using twister method to generate random number

for more info try this doc from matlab

new example to show that you could reproduce random number sequence in parfor :

>> parfor i=1:3
rng(3,'twister');
rand(1,5)
i
end

ans =

    0.5508    0.7081    0.2909    0.5108    0.8929


ans =

     3


ans =

    0.5508    0.7081    0.2909    0.5108    0.8929


ans =

     1


ans =

    0.5508    0.7081    0.2909    0.5108    0.8929


ans =

     2

essential part from comments:

about the "client workers" as i said before matlab uses a default function for all of your client worker so the results are completely reproducible, just the default function to create random numbers are different from matlab own default function. it is not a problem you could set the default function in the matlab and its workers with rng function (last example in the answer )


about using a global stream, it is no difference to use a global stream , you should set the seed value with rng function to get the same reproducible results, global stream just indicate that which function you are using to generate random numbers , matlab has about 6 function to do this job but if you have your own function you could set it on the global stream, and then again you should use rng function to get same sequence each time you generate random numbers

like image 200
Hadi Avatar answered Oct 06 '22 11:10

Hadi