Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

matlab array create random consecutive ones

I would like to create an 4-dimensional array with a random number of consecutive ones in each row. The ones should always start in the first column and end in a random column. Example:

array(:,:,1,1) = [ 1 1 1 0 0 0;
                   1 1 0 0 0 0;
                   1 1 1 1 1 0;
                   ...         ]

One could do that with 3 for loops but it is inefficient:

array = zeros(n,n,n,n);
for i= 1:n
   for j = 1:n
      for k =1:n
         rows = ceil(n*rand());
         array(k,1:rows,j,i) = 1;
      end
   end
end

Can somebody find a better solution? Thanks!!

like image 317
user3444108 Avatar asked Mar 21 '23 03:03

user3444108


2 Answers

Simple and straight-up approach (might not be truly random though)

rows = 8; %%// Number of rows
cols = 7; %%// Number of columns
ch3 = 3; %%// Number of elements in the 3rd dimension
ch4 = 2; %%// Number of elements in the 4th dimension

array = sort(round(rand(rows,cols,ch3,ch4)),2,'descend')

Bsxfun approach (Much faster as shown in benchmark results below and truly random)

%%// Sizes
rows = 8; %%// Number of rows
cols = 7; %%// Number of columns
ch3 = 3; %%// Number of elements in the 3rd dimension
ch4 = 2; %%// Number of elements in the 4th dimension

%%// Get a 2D array with every row starting with a one
rows2 = rows*ch3*ch4;
a1 = reshape(1:rows2*cols,rows2,[]);
col = randi(cols,[1 rows2]);
b1 = bsxfun(@plus,(col-1)*rows2,1:rows2)';%//'
out = bsxfun(@le,a1,b1);

%%// Rearrange those to 4D
a2 = reshape(out',[rows*cols ch3 ch4]);%//'
a3 = reshape(a2,cols,rows,ch3,ch4);
array = permute(a3,[2 1 3 4]);

Benchmark results

We are comparing the above mentioned two approaches along with Rody's approach.

Datasize I:
rows = 80;  %// Number of rows
cols = 70;  %// Number of columns
ch3  = 30;  %// Number of elements in the 3rd dimension
ch4  = 2; %// Number of elements in the 4th dimension

Results:
Elapsed time with SORT approach is:   0.0083445sec
Elapsed time with BSXFUN approach is: 0.0021sec
Elapsed time with RODY approach is:   0.0063026sec

Datasize II:
rows = 80;  %// Number of rows
cols = 70;  %// Number of columns
ch3  = 30;  %// Number of elements in the 3rd dimension
ch4  = 20; %// Number of elements in the 4th dimension

Results:
Elapsed time with SORT approach is:   0.07875sec
Elapsed time with BSXFUN approach is: 0.012329sec
Elapsed time with RODY approach is:   0.055937sec


Datasize III:
rows = 800;  %// Number of rows
cols = 70;  %// Number of columns
ch3  = 30;  %// Number of elements in the 3rd dimension
ch4  = 20; %// Number of elements in the 4th dimension

Results:
Elapsed time with SORT approach is:   0.87257sec
Elapsed time with BSXFUN approach is: 0.17624sec
Elapsed time with RODY approach is:   0.57786sec


Datasize IV:
rows = 800;  %// Number of rows
cols = 140;  %// Number of columns
ch3  = 30;  %// Number of elements in the 3rd dimension
ch4  = 20; %// Number of elements in the 4th dimension

Results:
Elapsed time with SORT approach is:   1.8508sec
Elapsed time with BSXFUN approach is: 0.35349sec
Elapsed time with RODY approach is:   0.71918sec

In conclusion from these findings, bsxfun looks like the way to go, unless you want to process billions of elements, as memory-benchmark results produced by Rody's solution suggests.

like image 172
Divakar Avatar answered Mar 22 '23 19:03

Divakar


A "big-data" alternative:

rows = 80;  %// Number of rows
cols = 70;  %// Number of columns
ch3  = 30;  %// Number of elements in the 3rd dimension
ch4  = 200; %// Number of elements in the 4th dimension

%// number of 1's in each row. Select the appropriate class to make 
%// sure peak memory remains acceptably small
intClasses = {'uint8' 'uint16' 'uint32' 'uint64'}; 
maxSizes   = cellfun(@(x) double(intmax(x)), intClasses);
numOnes    = randi(cols, rows*ch3*ch4,1, ...
                   intClasses{find(cols <= maxSizes, 1,'first')});

clear intClasses maxSizes   

%// Loop through all rows and flip the appropriate number of bits
array = false(rows*ch3*ch4, cols);
for ii  = 1:numel(numOnes)
    array(ii,1:numOnes(ii)) = true; end

clear ii numOnes    

%// Reshape into desired dimensions
array = reshape(array.', cols,rows,ch3,ch4);
array = permute(array, [2 1 3 4]);

clear rows cols ch3 ch4

(You could also use a sparse, but MATLAB only supports sparse for data of type double...In all probability, the logical array consumes less memory. If you're really desperate: the numOnes array alone contains sufficient information to retrieve either a 1 or 0; there's no need to construct the actual array. So, you could write a small wrapper class that implements the subsref in this way, basically, creating your own specific sparse that way :)

Note that this is a lot slower than Divakar's solution when the array is small. However, then the array sizes increase, this becomes progressively faster than Divakar's solution, with a lot smaller peak memory footprint.

EDIT: Divakar's bsxfun approach is the fastest of them all, however, it has O(N²) memory complexity. I.e., it only works if you can fit

`(cols-1)·rows²·ch3²·ch4² + (cols+1)·rows·ch3·ch4

(plus a few) 8-byte double values in your RAM, as opposed to just cols·rows·ch3·ch4 1-byte booleans for my method ^_^

As a simple comparison: re-running the same test as Divakar above, but within profile -memory:

Divakar's method:

Divakar's method

My method:

Rody' method

As you can see, my method's peak memory equals the size of the output array (~130MB), while the overall speed the doesn't really suffer all that much, whereas Divakar's method has a memory footprint of 10× the output size (~1.4GB).

like image 25
Rody Oldenhuis Avatar answered Mar 22 '23 17:03

Rody Oldenhuis