Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

vectorize lookup values in table of interval limits

Here is a question about whether we can use vectorization type of operation in matlab to avoid writing for loop.

I have a vector

Q = [0.1,0.3,0.6,1.0]

I generate a uniformly distributed random vector over [0,1)

X = [0.11,0.72,0.32,0.94]

I want to know whether each entry of X is between [0,0.1) or [0.1,0.3) or [0.3,0.6), or [0.6,1.0) and I want to return a vector which contains the index of the maximum element in Q that each entry of X is less than.

I could write a for loop

Y = zeros(length(X),1)
for i = 1:1:length(X)
    Y(i) = find(X(i)<Q, 1);
end

Expected result for this example:

Y = [2,4,3,4]

But I wonder if there is a way to avoid writing for loop? (I see many very good answers to my question. Thank you so much! Now if we go one step further, what if my Q is a matrix, such that I want check whether )

Y = zeros(length(X),1)
for i = 1:1:length(X)
    Y(i) = find(X(i)<Q(i), 1);
end
like image 255
ftxx Avatar asked Aug 26 '16 19:08

ftxx


3 Answers

Use the second output of max, which acts as a sort of "vectorized find":

[~, Y] = max(bsxfun(@lt, X(:).', Q(:)), [], 1);

How this works:

  1. For each element of X, test if it is less than each element of Q. This is done with bsxfun(@lt, X(:).', Q(:)). Note each column in the result corresponds to an element of X, and each row to an element of Q.
  2. Then, for each element of X, get the index of the first element of Q for which that comparison is true. This is done with [~, Y] = max(..., [], 1). Note that the second output of max returns the index of the first maximizer (along the specified dimension), so in this case it gives the index of the first true in each column.

For your example values,

Q = [0.1, 0.3, 0.6, 1.0];
X = [0.11, 0.72, 0.32, 0.94];
[~, Y] = max(bsxfun(@lt, X(:).', Q(:)), [], 1);

gives

Y =
     2     4     3     4
like image 155
Luis Mendo Avatar answered Nov 15 '22 04:11

Luis Mendo


Octave has a function lookup to do exactly that. It takes a lookup table of sorted values and an array, and returns an array with indices for values in the lookup table.

octave> Q = [0.1 0.3 0.6 1.0];
octave> x = [0.11 0.72 0.32 0.94];
octave> lookup (Q, X)
ans =

   1   3   2   3

The only issue is that your lookup table has an implicit zero which be fixed easily with:

octave> lookup ([0 Q], X) # alternatively, just add 1 at the results
ans =

   2   4   3   4
like image 38
carandraug Avatar answered Nov 15 '22 04:11

carandraug


Using bsxfun will help accomplish this. You'll need to read about it. I also added a Q = 0 at the beginning to handle the small X case

X = [0.11,0.72,0.32,0.94 0.01];
Q = [0.1,0.3,0.6,1.0];
Q_extra = [0 Q];

Diff = bsxfun(@minus,X(:)',Q_extra (:)); %vectorized subtraction
logical_matrix = diff(Diff < 0); %find the transition from neg to positive
[X_categories,~] = find(logical_matrix == true); % get indices

% output is 2 4 3 4 1

EDIT: How long does each method take?

I got curious about the difference between each solution:

Test Code Below:

Q = [0,0.1,0.3,0.6,1.0];

X = rand(1,1e3);

tic
Y = zeros(length(X),1);
for i = 1:1:length(X)
    Y(i) = find(X(i)<Q, 1);
end
toc
tic
result = arrayfun(@(x)find(x < Q, 1), X);
toc

tic
Q = [0 Q];
Diff = bsxfun(@minus,X(:)',Q(:)); %vectorized subtraction
logical_matrix = diff(Diff < 0); %find the transition from neg to positive
[X_categories,~] = find(logical_matrix == true); % get indices
toc

Run it for yourself, I found that when the size of X was 1e6, bsxfun was much faster, while for smaller arrays the differences were varying and negligible.

SAMPLE: when size X was 1e3

Elapsed time is 0.001582 seconds. % for loop
Elapsed time is 0.007324 seconds. % anonymous function
Elapsed time is 0.000785 seconds. % bsxfun
like image 22
Trogdor Avatar answered Nov 15 '22 05:11

Trogdor