How can I share memory between processes in MATLAB?

Is there any way to share memory between MATLAB processes on the same computer?

I am running several MATLAB processes on a multi-core computer (running Windows, if it matters). They all use the same gigantic input data. It would be nice to only have a single copy of it in memory.

Edit: Unfortunately each process needs access to the whole gigantic input data, so there is no way to divide the data and conquer the problem.

AnnaR, asked May 16 '09 11:05


2 Answers

If the processes only ever read the data and never modify it, then I believe you can place your input data into one large file and have each process open and read from that file. Each process will have its own file position indicator that it can move anywhere in the file to read the data it needs. I tested having two MATLAB processes read simultaneously from a file a million or so times each, and everything seemed to work fine. I only used basic file I/O commands (listed below). It appears you could also do this using MEMMAPFILE, as Mr Fooz mentioned in his answer (and SCFrench in a comment), assuming you have MATLAB version R2008a or newer.

Here are some of the file I/O commands that you will likely use for this:

  • FOPEN: Each process will call FOPEN, which returns a file identifier that the process uses in all subsequent calls. You can open a file in either binary or text mode:

    fid = fopen('data.dat','r');   % Binary mode
    fid = fopen('data.txt','rt');  % Text mode
    
  • FREAD: In binary mode, FREAD will read data from the file:

    A = fread(fid,20,'double');  % Reads 20 double-precision values
    
  • FSCANF: In text mode, FSCANF will read and format data from the file:

    A = fscanf(fid,'%d',4);  % Reads 4 integer values
    
  • FGETL/FGETS: In text mode, these will read whole lines from the file.

  • FTELL: This will tell you the current file position indicator in bytes from the beginning of the file:

    ftell(fid)
    ans =
         8    % The position indicator is 8 bytes from the file beginning
    
  • FSEEK: This will set the file position indicator to a desired position in the file:

    fseek(fid,0,-1);  % Moves the position indicator to the file beginning
    
  • FCLOSE: Each process will have to close its access to the file (it's easy to forget to do this):

    fclose(fid);
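
Putting the commands above together, here is a minimal sketch of how one process might pull a single column out of the shared file. It assumes a layout that is not stated in the question: the file holds one nRows-by-nCols matrix of raw doubles in column-major order (what FWRITE produces for a matrix) with no header. The function name read_column and its arguments are hypothetical.

```matlab
function col = read_column(fileName, nRows, j)
% READ_COLUMN  Read column j of a raw double matrix stored in a file.
%   Assumed layout: nRows-by-nCols doubles, column-major, no header.
    fid = fopen(fileName, 'r');              % binary mode, read-only
    if fid == -1
        error('read_column:open', 'Could not open %s', fileName);
    end
    closer = onCleanup(@() fclose(fid));     % closes the file even on error

    offset = (j - 1) * nRows * 8;            % 8 bytes per double
    if fseek(fid, offset, 'bof') ~= 0        % fseek returns 0 on success
        error('read_column:seek', 'Seek to byte %d failed', offset);
    end

    col = fread(fid, nRows, 'double');       % nRows-by-1 column vector
    if numel(col) ~= nRows
        error('read_column:short', 'File ended before column %d was read', j);
    end
end
```

Because every process opens the file itself, each one gets its own file identifier and position indicator, so a call such as read_column('data.dat', nRows, 3) in one process does not disturb reads in another.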
    

This solution will likely require that the input file has a well-structured format that is easy to traverse (e.g. just one large matrix). If it has lots of variable-length fields, then reading data from the correct position in the file could get very tricky.
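
The same "one large matrix" layout is what makes the MEMMAPFILE route mentioned earlier straightforward. The following is only a sketch under assumptions: the file name, the dimensions, and the field name X are made up for illustration, and a tiny demo file stands in for the real input. When several processes map the same file read-only, the operating system can typically back those mappings with the same physical pages, which is the closest thing here to a single copy in memory.

```matlab
% Demo file: a small matrix stands in for the large input.
% Assumed layout: raw column-major doubles, no header.
nRows = 4;  nCols = 3;                              % hypothetical dimensions
dataFile = fullfile(tempdir, 'memmap_demo.dat');    % hypothetical file name
fid = fopen(dataFile, 'w');
fwrite(fid, reshape(1:nRows*nCols, nRows, nCols), 'double');
fclose(fid);

% Each process would run this part: map the file read-only as one matrix.
m = memmapfile(dataFile, ...
               'Format', {'double', [nRows nCols], 'X'}, ...
               'Repeat', 1, ...
               'Writable', false);

col2 = m.Data.X(:, 2);   % copies only the requested column into MATLAB
clear m                  % releases the mapping
```

The dimensions passed in 'Format' must match what is actually in the file, since the file itself carries no size information.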


If the processes also have to modify the data, this gets more difficult. In general, you don't want a file or memory location to be written by multiple processes at the same time, or written by one process while another is reading from the same location, since unwanted behavior can result. In such a case, you would have to limit access to the file so that only one process at a time is operating on it. Other processes would have to wait until the first is done. A sample version of code that each process would have to run in such a case is:

processDone = false;
while ~processDone
  if file_is_free()   % A user-supplied function to check that other
                      %   processes are not accessing the file
    fid = fopen(fileName,'r+');  % Open the file for reading and writing
    perform_process(fid);        % The computation this process has to do
    fclose(fid);                 % Close the file
    processDone = true;
  else
    pause(0.1);       % Wait briefly instead of spinning at full speed
  end
end

Synchronization mechanisms like these ("locks") can sometimes have a high overhead that reduces the overall parallel efficiency of the code.
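
One caveat with a separate "is it free?" check is that two processes can both see the file as free before either opens it. As a hedged sketch of one way to close that gap, MATLAB can call Java's file-locking classes (java.io.RandomAccessFile and FileChannel.tryLock), which ask the operating system for an exclusive lock. This assumes MATLAB was started with the JVM enabled; the function name run_with_lock and the lock-file path are hypothetical, and on some platforms such locks are advisory, so they only help if every process follows the same protocol.

```matlab
function run_with_lock(lockPath, task)
% RUN_WITH_LOCK  Run TASK while holding an exclusive lock on LOCKPATH.
%   lockPath: absolute path of a lock file (created if it does not exist)
%   task:     function handle taking no arguments
    raf     = java.io.RandomAccessFile(lockPath, 'rw');
    closer  = onCleanup(@() raf.close());   % closing also drops the lock
    channel = raf.getChannel();

    lock = channel.tryLock();   % empty (Java null) if another process holds it
    while isempty(lock)
        pause(0.05);            % back off instead of spinning
        lock = channel.tryLock();
    end

    task();                     % the work that needs exclusive access
    lock.release();
end
```

A process would then wrap its update as run_with_lock(lockPath, @() perform_process_on(fileName)), where perform_process_on is a placeholder for the open/modify/close sequence shown above. Because the lock belongs to the process, the operating system releases it if that process exits unexpectedly.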

gnovice, answered Sep 23 '22 02:09


You may want to check out my Matlab file-exchange submission "sharedmatrix" #28572. It allows a Matlab matrix to exist in shared memory, provided you are using some flavor of Unix. One could then attach the shared matrix in the body of a parfor or spmd, i.e.,

shmkey=12345;
sharedmatrix('clone',shmkey,X);
clear X;
spmd(8)
    X=sharedmatrix('attach',shmkey);
    % do something with X
    sharedmatrix('detach',shmkey,X);
end
sharedmatrix('free',shmkey);

Since X exists in shared memory for the body of the spmd (or parfor), it has no load time and no communication time. From the perspective of Matlab, it is a newly created variable in the spmd (or parfor) body.

Cheers,

Josh

http://www.mathworks.com/matlabcentral/fileexchange/28572-sharedmatrix

jvdillon, answered Sep 23 '22 02:09