I am working with Matlab.
I have a binary square matrix. For each row, there is one or more entries of 1. I want to go through each row of this matrix and return the index of those 1s and store them in the entry of a cell.
I was wondering if there is a way to do this without looping over all the rows of this matrix, as for loop is really slow in Matlab.
For example, my matrix
M = 0 1 0
1 0 1
1 1 1
Then eventually, I want something like
A = [2]
[1,3]
[1,2,3]
So A
is a cell.
Is there a way to achieve this goal without using for loop, with the aim of calculating the result more quickly?
A = sprand(10,10,0.07); Use nonzeros to find the values of the nonzero elements. Use nnz to count the number of nonzeros. Use find to get the indices and values of the nonzeros.
You can detect all zeros with data ==0 which will give you a boolean array and then perform np. any along each row on it. Alternatively, you can detect all non-zeros with data!= 0 and then do np.
k = find( X , n ) returns the first n indices corresponding to the nonzero elements in X . k = find( X , n , direction ) , where direction is 'last' , finds the last n indices corresponding to nonzero elements in X .
Given the matrix A=[1,2;0,0]; To remove the rows of 0 , you can: sum the absolute value of each rows (to avoid having a zero sum from a mix of negative and positive numbers), which gives you a column vector of the row sums. keep the index of each line where the sum is non-zero.
At the bottom of this answer is some benchmarking code, since you clarified that you're interested in performance rather than arbitrarily avoiding for
loops.
In fact, I think for
loops are probably the most performant option here. Since the "new" (2015b) JIT engine was introduced (source) for
loops are not inherently slow - in fact they are optimised internally.
You can see from the benchmark that the mat2cell
option offered by ThomasIsCoding here is very slow...
If we get rid of that line to make the scale clearer, then my splitapply
method is fairly slow, obchardon's accumarray option is a bit better, but the fastest (and comparable) options are either using arrayfun
(as also suggested by Thomas) or a for
loop. Note that arrayfun
is basically a for
loop in disguise for most use-cases, so this isn't a surprising tie!
I would recommend you use a for
loop for increased code readability and the best performance.
Edit:
If we assume that looping is the fastest approach, we can make some optimisations around the find
command.
Specifically
Make M
logical. As the below plot shows, this can be faster for relatively small M
, but slower with the trade-off of type conversion for large M
.
Use a logical M
to index an array 1:size(M,2)
instead of using find
. This avoids the slowest part of the loop (the find
command) and outweighs the type conversion overhead, making it the quickest option.
Here is my recommendation for best performance:
function A = f_forlooplogicalindexing( M )
M = logical(M);
k = 1:size(M,2);
N = size(M,1);
A = cell(N,1);
for r = 1:N
A{r} = k(M(r,:));
end
end
I've added this to the benchmark below, here is the comparison of loop-style approaches:
Benchmarking code:
rng(904); % Gives OP example for randi([0,1],3)
p = 2:12;
T = NaN( numel(p), 7 );
for ii = p
N = 2^ii;
M = randi([0,1],N);
fprintf( 'N = 2^%.0f = %.0f\n', log2(N), N );
f1 = @()f_arrayfun( M );
f2 = @()f_mat2cell( M );
f3 = @()f_accumarray( M );
f4 = @()f_splitapply( M );
f5 = @()f_forloop( M );
f6 = @()f_forlooplogical( M );
f7 = @()f_forlooplogicalindexing( M );
T(ii, 1) = timeit( f1 );
T(ii, 2) = timeit( f2 );
T(ii, 3) = timeit( f3 );
T(ii, 4) = timeit( f4 );
T(ii, 5) = timeit( f5 );
T(ii, 6) = timeit( f6 );
T(ii, 7) = timeit( f7 );
end
plot( (2.^p).', T(2:end,:) );
legend( {'arrayfun','mat2cell','accumarray','splitapply','for loop',...
'for loop logical', 'for loop logical + indexing'} );
grid on;
xlabel( 'N, where M = random N*N matrix of 1 or 0' );
ylabel( 'Execution time (s)' );
disp( 'Done' );
function A = f_arrayfun( M )
A = arrayfun(@(r) find(M(r,:)),1:size(M,1),'UniformOutput',false);
end
function A = f_mat2cell( M )
[i,j] = find(M.');
A = mat2cell(i,arrayfun(@(r) sum(j==r),min(j):max(j)));
end
function A = f_accumarray( M )
[val,ind] = ind2sub(size(M),find(M.'));
A = accumarray(ind,val,[],@(x) {x});
end
function A = f_splitapply( M )
[r,c] = find(M);
A = splitapply( @(x) {x}, c, r );
end
function A = f_forloop( M )
N = size(M,1);
A = cell(N,1);
for r = 1:N
A{r} = find(M(r,:));
end
end
function A = f_forlooplogical( M )
M = logical(M);
N = size(M,1);
A = cell(N,1);
for r = 1:N
A{r} = find(M(r,:));
end
end
function A = f_forlooplogicalindexing( M )
M = logical(M);
k = 1:size(M,2);
N = size(M,1);
A = cell(N,1);
for r = 1:N
A{r} = k(M(r,:));
end
end
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With