Fastest way of finding the only index of vector b where array A(i,j) == b

I have two big arrays, A and b:

A: more than 10,000 rows, 4 columns, non-unique integers
b: vector with more than 500,000 elements, unique integers

Because the values of b are unique, there is only one index of b where A(i,j) == b, and I need to find that index for every element of A.

What I started with is

[rows,columns] = size(A);
B = zeros(rows,columns);
for i = 1 : rows
    for j = 1 : columns
        B(i,j) = find(A(i,j)==b,1);
    end
end

This takes approximately 5.5 seconds to compute, which is way too long, since A and b can be significantly bigger. With that in mind, I tried to speed up the code by using logical indexing and reducing the number of for-loops:

[rows,columns] = size(A);
B = zeros(rows,columns);
for idx = 1 : numel(b)
    B(A==b(idx)) = idx;
end

Sadly, this takes even longer: 21 seconds.

I even tried using bsxfun:

for i = 1 : columns
    [I,J] = find(bsxfun(@eq,A(:,i),b));
    % ... stitch B together ...
end

but with bigger arrays the maximum array size is quickly exceeded (102.9 GB...).

Can you help me find a faster solution to this? Thanks in advance!

EDIT: I extended the call to find(A(i,j)==b,1), which speeds up the algorithm by a factor of 2! Thank you, but overall it is still too slow... ;)

asked May 31 '18 16:05

Johannes

1 Answer

The function ismember is the right tool for this:

[~,B] = ismember(A,b);
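
To make the second output of ismember concrete, here is a tiny example with made-up values (the numbers are only illustrative, not taken from the question). For each element of A, B holds the position of that value in b:

```matlab
% Small illustrative data: b has unique values, A repeats some of them.
b = [10 20 30 40];
A = [30 10; 20 30];

% Second output: for each A(i,j), the index k such that b(k) == A(i,j).
[~, B] = ismember(A, b);

disp(B)   % 30 is b(3), 10 is b(1), 20 is b(2)
```

Any element of A that does not occur in b would get a 0 in B, so a quick `all(B(:) > 0)` check can confirm that every value was found.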

Test code:

function so
  A = rand(1000,4);
  b = unique([A(:);rand(2000,1)]);

  B1 = op1(A,b);
  B2 = op2(A,b);
  isequal(B1,B2)

  tic;op1(A,b);op1(A,b);op1(A,b);op1(A,b);toc
  tic;op2(A,b);op2(A,b);op2(A,b);op2(A,b);toc
end

function B = op1(A,b)
  B = zeros(size(A));
  for i = 1:numel(A)
    B(i) = find(A(i)==b,1);
  end
end

function B = op2(A,b)
  [~,B] = ismember(A,b);
end

I ran this on Octave, which is not as fast with loops as MATLAB. It also doesn't have the timeit function, hence the crappy timing using tic/toc (sorry for that). In Octave, op2 is more than 100 times faster than op1. Timings will be different in MATLAB, but ismember should still be the fastest option. (Note that I also replaced your double loop with a single loop; this is equivalent but simpler and probably faster.)
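
In MATLAB, the tic/toc lines could be replaced by timeit. The following is only a sketch under some assumptions: the array sizes are arbitrary and smaller than in the question, the data is generated with randperm and randi to mimic unique integers in b and non-unique integers in A, and the element-wise search is written with arrayfun as a stand-in for the loop in op1 so that it fits in a function handle.

```matlab
% Synthetic data (sizes are arbitrary assumptions, not from the question).
b = randperm(1e6, 5e4);               % 50,000 unique integers
A = b(randi(numel(b), 1000, 4));      % 1000x4, values drawn from b with repeats

% Element-wise search, one find per element (stand-in for op1).
f_find = @() arrayfun(@(a) find(a == b, 1), A);

% ismember, timed with two outputs requested so the index output is computed.
f_ismember = @() ismember(A, b);

t_find     = timeit(f_find);
t_ismember = timeit(f_ismember, 2);

fprintf('find per element: %.4f s, ismember: %.4f s\n', t_find, t_ismember);
```

The absolute numbers will depend on the machine and MATLAB version, so they are best read as a relative comparison only.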

If you want to repeatedly do the search in b, it is worthwhile to sort b first, and implement your own binary search. This will avoid the checks and sorting that ismember does. See this other question.
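
A minimal sketch of that idea follows. The function name lookup_sorted and its argument names are hypothetical, and this is one plain way to write a binary search, not necessarily the approach from the linked question. It assumes b has been sorted once with `[bs, order] = sort(b)`, so that `bs = b(order)`.

```matlab
function B = lookup_sorted(A, bs, order)
  % A:     array of values to look up
  % bs:    sorted copy of b (values assumed unique)
  % order: permutation returned by sort, so that bs = b(order)
  % B:     index into the original b for each element of A (0 if not found)
  B = zeros(size(A));
  n = numel(bs);
  for k = 1:numel(A)
    lo = 1;
    hi = n;
    while lo <= hi
      mid = floor((lo + hi) / 2);
      if bs(mid) < A(k)
        lo = mid + 1;          % target is in the upper half
      elseif bs(mid) > A(k)
        hi = mid - 1;          % target is in the lower half
      else
        B(k) = order(mid);     % map back to the position in unsorted b
        break;
      end
    end
  end
end
```

Usage would be to sort once, `[bs, order] = sort(b);`, and then call `B = lookup_sorted(A, bs, order);` for each new A. Whether this actually beats ismember depends on the array sizes and on how well the loop is optimized in your environment, so it is worth timing both.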

answered Oct 25 '22 12:10

Cris Luengo


Tags: performance, find, matlab
