
Matlab Vectorization - non-zero matrix row indices to cell

Tags: vectorization, matlab

I am working with Matlab.

I have a binary square matrix. Each row has one or more entries equal to 1. I want to go through each row of this matrix, find the indices of those 1s, and store them in an entry of a cell array.

I was wondering if there is a way to do this without looping over all the rows of this matrix, since for loops are really slow in Matlab.

For example, my matrix

M = 0 1 0
    1 0 1
    1 1 1

Then eventually, I want something like

A = [2]
    [1,3]
    [1,2,3]

So A is a cell.

Is there a way to achieve this without using a for loop, with the aim of calculating the result more quickly?


asked Feb 10 '20 09:02

ftxx

1 Answer

At the bottom of this answer is some benchmarking code, since you clarified that you're interested in performance rather than arbitrarily avoiding for loops.

In fact, I think for loops are probably the most performant option here. Since the "new" JIT engine was introduced in R2015b (source), for loops are not inherently slow; in fact, they are optimised internally.

You can see from the benchmark that the mat2cell option offered by ThomasIsCoding here is very slow...

[Image: Comparison 1 (https://i.stack.imgur.com/14duQ.png)]

If we get rid of that line to make the scale clearer, then my splitapply method is fairly slow, obchardon's accumarray option is a bit better, but the fastest (and comparable) options are either using arrayfun (as also suggested by Thomas) or a for loop. Note that arrayfun is basically a for loop in disguise for most use-cases, so this isn't a surprising tie!
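
To illustrate why the tie is unsurprising, here is a minimal sketch (using the 3x3 matrix from the question, with the hypothetical names A1 and A2) showing that the arrayfun call and an explicit loop do the same per-row work and return the same cell contents:

```matlab
% Example matrix from the question
M = [0 1 0; 1 0 1; 1 1 1];

% arrayfun version: one find call per row, collected into a 1-by-N cell
A1 = arrayfun(@(r) find(M(r,:)), 1:size(M,1), 'UniformOutput', false);

% Explicit loop version: the same find call per row, written out by hand
A2 = cell(1, size(M,1));
for r = 1:size(M,1)
    A2{r} = find(M(r,:));
end

% Both approaches should produce identical cells
disp(isequal(A1, A2))
```

Note that arrayfun returns a cell with the same shape as its input (here 1-by-N), so the loop output is preallocated as a row to match.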

[Image: Comparison 2 (https://i.stack.imgur.com/FrL35.png)]

~~I would recommend you use a for loop for increased code readability and the best performance.~~

Edit:

If we assume that looping is the fastest approach, we can make some optimisations around the find command.

Specifically

- Make M logical. As the plot below shows, this can be faster for relatively small M, but for large M the cost of the type conversion makes it slower.
- Use a logical M to index the array 1:size(M,2) instead of using find. This avoids the slowest part of the loop (the find command) and outweighs the type conversion overhead, making it the quickest option.
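
As a small sketch of the second point on a single row (the example row and the variable names are hypothetical), indexing 1:n with a logical mask gives the same indices as find. The conversion to logical matters: indexing with a double vector of 0s and 1s would be interpreted as subscripts and fail on the zeros.

```matlab
row  = [1 0 1 1 0];      % hypothetical example row, stored as double
mask = logical(row);     % required: k(row) would treat the 0s as subscripts and error
k    = 1:numel(mask);    % candidate column indices

viaFind  = find(row);    % find-based result
viaIndex = k(mask);      % logical-indexing result

disp(isequal(viaFind, viaIndex))
```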

Here is my recommendation for best performance:

function A = f_forlooplogicalindexing( M )
    M = logical(M);
    k = 1:size(M,2);
    N = size(M,1);
    A = cell(N,1);
    for r = 1:N
        A{r} = k(M(r,:));
    end
end
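
As a quick usage sketch, the same steps can be run inline on the example matrix from the question to confirm they reproduce the cell the asker wanted. The name L for the logical copy is an assumption made here for illustration.

```matlab
% Example matrix from the question
M = [0 1 0; 1 0 1; 1 1 1];

L = logical(M);            % logical mask, one row per output cell
k = 1:size(L,2);           % column indices to select from
A = cell(size(L,1), 1);    % preallocate the output cell
for r = 1:size(L,1)
    A{r} = k(L(r,:));      % logical indexing in place of find(M(r,:))
end

celldisp(A)                % show the contents of each cell
```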

I've added this to the benchmark below. Here is the comparison of loop-style approaches:

[Image: Comparison 3 (https://i.stack.imgur.com/zaZGd.png)]

Benchmarking code:

rng(904); % Gives OP example for randi([0,1],3)
p = 2:12; 
T = NaN( max(p), 7 ); % Rows are indexed by ii below, so row 1 stays NaN and is skipped when plotting
for ii = p
    N = 2^ii;
    M = randi([0,1],N);

    fprintf( 'N = 2^%.0f = %.0f\n', log2(N), N );

    f1 = @()f_arrayfun( M );
    f2 = @()f_mat2cell( M );
    f3 = @()f_accumarray( M );
    f4 = @()f_splitapply( M );
    f5 = @()f_forloop( M );
    f6 = @()f_forlooplogical( M );
    f7 = @()f_forlooplogicalindexing( M );

    T(ii, 1) = timeit( f1 ); 
    T(ii, 2) = timeit( f2 ); 
    T(ii, 3) = timeit( f3 ); 
    T(ii, 4) = timeit( f4 );  
    T(ii, 5) = timeit( f5 );
    T(ii, 6) = timeit( f6 );
    T(ii, 7) = timeit( f7 );
end

plot( (2.^p).', T(2:end,:) );
legend( {'arrayfun','mat2cell','accumarray','splitapply','for loop',...
         'for loop logical', 'for loop logical + indexing'} );
grid on;
xlabel( 'N, where M = random N*N matrix of 1 or 0' );
ylabel( 'Execution time (s)' );

disp( 'Done' );

function A = f_arrayfun( M )
    A = arrayfun(@(r) find(M(r,:)),1:size(M,1),'UniformOutput',false);
end
function A = f_mat2cell( M )
    [i,j] = find(M.');
    A = mat2cell(i,arrayfun(@(r) sum(j==r),min(j):max(j)));
end
function A = f_accumarray( M )
    [val,ind] = ind2sub(size(M),find(M.'));
    A = accumarray(ind,val,[],@(x) {x});
end
function A = f_splitapply( M )
    [r,c] = find(M);
    A = splitapply( @(x) {x}, c, r );
end
function A = f_forloop( M )
    N = size(M,1);
    A = cell(N,1);
    for r = 1:N
        A{r} = find(M(r,:));
    end
end
function A = f_forlooplogical( M )
    M = logical(M);
    N = size(M,1);
    A = cell(N,1);
    for r = 1:N
        A{r} = find(M(r,:));
    end
end
function A = f_forlooplogicalindexing( M )
    M = logical(M);
    k = 1:size(M,2);
    N = size(M,1);
    A = cell(N,1);
    for r = 1:N
        A{r} = k(M(r,:));
    end
end


answered Oct 18 '22 18:10

Wolfie
