
Cross pair-wise distance measurement across two modalities

I am trying to compute pairwise distances between vectors. Let me first explain the problem: I have two sets of vectors, X and Y. X has three vectors x1, x2 and x3. Y has three vectors y1, y2 and y3. Note that the vectors in X and Y have different lengths (n1 and n2 respectively in the code below, where m is the number of vectors in each set). Let the dataset be represented as this image:

enter image description here

I am trying to compute a similarity matrix such as this:

enter image description here

The colour-coded parts are as follows. The cells marked with 0 need not be computed; in my code I have intentionally set them to 100 (it can be any value). The grey cells have to be computed. The similarity score is the L2 norm of (x_i - x_k) plus the L2 norm of (y_j - y_l).

That is, the entries are

M((x_i,y_j), (x_k,y_l)) := norm(x_i-x_k,2) + norm(y_j-y_l,2)

I have written some basic code to do this:

clc;clear all;close all;
%% randomly generate data
m=3; n1=4; n2=6;
train_a_mean = rand(m,n1);
train_b_mean = rand(m,n2);
p = size(train_a_mean,1)*size(train_b_mean,1);
score_mean_ab = zeros(p,p);

%% This is to store the index variables 
%% This is required for futu
idx1 = score_mean_ab;
idx2 = idx1; idx3 = idx1; idx4 = idx1;

a=1; b=1;
for i=1:size(score_mean_ab,1)
    c = 1; d = 1;
    for j=1:size(score_mean_ab,2)
        if (a==c)
            score_mean_ab(i,j) = 100;
        else            
            %% computing distances between the different modalities and
            %% summing them up
            score_mean_ab(i,j) = norm(train_a_mean(a,:)-train_a_mean(c,:),2) ...
            + norm(train_b_mean(b,:)-train_b_mean(d,:),2);
        end
        %% saving the indices
        idx1(i,j)=a; idx2(i,j)=b; idx3(i,j)=c; idx4(i,j)=d;        
        %% updating the values of c and d
        if mod(d,size(train_a_mean,1))==0
            c = c + 1;
            d = 1;
        else
            d = d+1;
        end
    end
    %% updating the values of a and b
    if mod(b,size(train_a_mean,1))==0
        a = a + 1;
        b = 1;
    else        
        b = b+1;
    end
end
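As a side note on the bookkeeping above: the four index matrices idx1 to idx4 may not need to be stored at all. In the loop, row i corresponds to the pair (a, b) with b cycling fastest, so the pair can be recovered from i on demand. A minimal sketch, assuming the same ordering as the loop above (the variable names are illustrative):

```matlab
m = 3;                               % number of vectors per set, as in the sample run
p = m*m;                             % number of (x_a, y_b) pairs
% Row i pairs x_a with y_b, where i = (a-1)*m + b (b cycles fastest).
% ind2sub inverts that mapping; the same holds for column j and (c, d).
[b, a] = ind2sub([m m], (1:p).');
disp([a b]);                         % one (a, b) pair per row of the score matrix
```

The same call applied to a column index j gives (c, d), so idx1(i,j) = a(i), idx2(i,j) = b(i), idx3(i,j) = a(j) and idx4(i,j) = b(j).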

For a sample run with m = 3, I get these results:

score_mean_ab =

  100.0000  100.0000  100.0000    0.6700    1.6548    1.5725    0.8154    1.8002    1.7179
  100.0000  100.0000  100.0000    1.6548    0.6700    1.5000    1.8002    0.8154    1.6454
  100.0000  100.0000  100.0000    1.5725    1.5000    0.6700    1.7179    1.6454    0.8154
    0.6700    1.6548    1.5725  100.0000  100.0000  100.0000    1.3174    2.3022    2.2200
    1.6548    0.6700    1.5000  100.0000  100.0000  100.0000    2.3022    1.3174    2.1475
    1.5725    1.5000    0.6700  100.0000  100.0000  100.0000    2.2200    2.1475    1.3174
    0.8154    1.8002    1.7179    1.3174    2.3022    2.2200  100.0000  100.0000  100.0000
    1.8002    0.8154    1.6454    2.3022    1.3174    2.1475  100.0000  100.0000  100.0000
    1.7179    1.6454    0.8154    2.2200    2.1475    1.3174  100.0000  100.0000  100.0000

However, my code is very slow. A few sample runs gave these timings:

m=3; n1=3; n2=3;
Elapsed time is 0.000363 seconds.

m=10; n1=3; n2=3;
Elapsed time is 0.042015 seconds.

m=10; n1=1800; n2=1800;
Elapsed time is 0.230046 seconds.

m=20; n1=1800; n2=1800;
Elapsed time is 4.309134 seconds.

m=30; n1=1800; n2=1800;
Elapsed time is 23.058106 seconds.

My questions:

  1. Typically I will have m ~ 100, n1 ~ 2000 and n2 ~ 2000. My own code breaks down at this point. Is there an optimised way to do this?
  2. Can the built-in MATLAB function pdist2 be used for this purpose?

NOTE: The vectors are stored as row vectors, and n1 and n2 need not be equal.

asked Dec 11 '25 13:12 by roni



1 Answer

Here's a way to do it. This computes all entries, including the blocks that the question says can be skipped.

m = 3;             %// number of (row) vectors in X and in Y
n1 = 3;            %// length of vectors in X
n2 = 3;            %// length of vectors in Y
X = rand(m, n1);   %// random data: X
Y = rand(m, n2);   %// random data: Y

[ii, jj] = ndgrid(1:m); 
U = reshape(sqrt(sum((X(ii,:)-X(jj,:)).^2, 2)), m, m);
V = reshape(sqrt(sum((Y(ii,:)-Y(jj,:)).^2, 2)), m, m);
result = U(ceil(1/m:1/m:m), ceil(1/m:1/m:m)) + repmat(V, m, m);

Or you could use bsxfun instead of ndgrid:

U = sqrt(sum(bsxfun(@minus, permute(X, [1 3 2]), permute(X, [3 1 2])).^2, 3));
V = sqrt(sum(bsxfun(@minus, permute(Y, [1 3 2]), permute(Y, [3 1 2])).^2, 3));
result = U(ceil(1/m:1/m:m), ceil(1/m:1/m:m)) + repmat(V, m, m);
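
On the second question, pdist2 can produce the two m-by-m distance matrices directly, since its default metric is the Euclidean distance between rows. The sketch below is one possible variant, not part of the answer above: it assumes the Statistics and Machine Learning Toolbox is available for pdist2 (falling back to the bsxfun form otherwise), uses kron in place of the ceil(1/m:1/m:m) indexing to avoid relying on floating-point rounding, and sets the blocks that need not be computed to 100 as in the question. The variable name skip is illustrative.

```matlab
m = 3; n1 = 4; n2 = 6;        % sizes from the question's example
X = rand(m, n1);              % m row vectors of length n1
Y = rand(m, n2);              % m row vectors of length n2

if exist('pdist2', 'file')
    % Assumes the Statistics and Machine Learning Toolbox is installed
    U = pdist2(X, X);         % m-by-m Euclidean distances between rows of X
    V = pdist2(Y, Y);         % m-by-m Euclidean distances between rows of Y
else
    % Fallback: the bsxfun formulation, no toolbox needed
    U = sqrt(sum(bsxfun(@minus, permute(X, [1 3 2]), permute(X, [3 1 2])).^2, 3));
    V = sqrt(sum(bsxfun(@minus, permute(Y, [1 3 2]), permute(Y, [3 1 2])).^2, 3));
end

% Block (a,c) of the result is U(a,c) + V: expand U blockwise, tile V
result = kron(U, ones(m)) + repmat(V, m, m);

% Blocks with a == c need not be computed; mark them with 100
skip = logical(kron(eye(m), ones(m)));
result(skip) = 100;
```

For m around 100 the full matrix has 10^4 x 10^4 entries, which is roughly 800 MB in double precision. If that is too large, it may be preferable to keep only U and V and evaluate U(a,c) + V(b,d) for the entries actually needed.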
answered Dec 14 '25 11:12 by Luis Mendo



