Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MATLAB, how to compute the distances between large coordinate sets, without exceeding memory constraints?

(using MATLAB) I have a large coordinate matrix and a large sparse adjacency matrix for which coordinates are connected to each other. I had asked previously on SO, how to efficiently compute these distances in this SO question, but I have run into memory problems now which are a much more severe issue.

I used this MATLAB function to compute the distance matrix Dists = pdist2(fruchterman_graph(:,:),fruchterman_graph(:,:),'euclidean'); but it fails on large networks for both speed and finally memory.

This is the code which runs only on smallish graphs (not hundreds of thousands):

coordinate = kamada_graph;
    [n, d] = size(kamada_graph);
    assert(d == 2);
    resi = sparse(adj*  spdiags((1:n)',0,n,n));
    resj = sparse(spdiags((1:n)',0,n,n) * adj);
    res = sparse(n,n);
    f = find(adj);  
    res(f) = sqrt((coordinate(resi(f), 1) - coordinate(resj(f), 1)) .^ 2 +...
                          (coordinate(resi(f), 2) - coordinate(resj(f), 2)) .^ 2);

This on large graphs creates

??? Error using ==> find Matrix is too large to return linear indices.
Use [i,j] = find(S) for sparse matrix.
Error in ==> modularize_graphs at 49 [f] = find(adj);

I changed the line which is referred to to be:

[i,j] = find(ajd);    
res(i,j) = sqrt((coordinate(resi(i,j), 1) - coordinate(resj(i,j), 1)) .^ 2 +...
    (coordinate(resi(i,j), 2) - coordinate(resj(i,j), 2)) .^ 2);

and on the smallish network now (~500 vertices) the error:

??? Out of memory.
Type HELP MEMORY for your options.
Error in ==> modularize_graphs at 50
res(i,j) = sqrt((coordinate(resi(i,j), 1) - coordinate(resj(i,j), 1)) .^ 2 +...

Is there anyway to compute the distance matrix using the adjacency matrix and a coordinate matrix, (N,2) of x,y values, without falling into memory issues and possibly making it not too slow as well?

The desired output is a matrix of distances, the distances between all points that are connected according to the Adj adjacency matrix.

like image 447
Vass Avatar asked Apr 04 '13 14:04

Vass


People also ask

How do you find the distance between coordinates in Matlab?

distance=pdist(pair,'euclidean'); "distance" will give you the euclidean distance between the first and second coordinates.

How to compute pairwise distance?

The pairwise distance between observations i and j is in D((i-1)*(m-i/2)+j-i) for i≤j. You can convert D into a symmetric matrix by using the squareform function. Z = squareform(D) returns an m-by-m matrix where Z(i,j) corresponds to the pairwise distance between observations i and j.

What is pairwise distance matrix?

In mathematics, computer science and especially graph theory, a distance matrix is a square matrix (two-dimensional array) containing the distances, taken pairwise, between the elements of a set. Depending upon the application involved, the distance being used to define this matrix may or may not be a metric.


1 Answers

To calculate pointwise distances with using a minimum amount of memory, you can always iterate through your adjacency matrix on an element-wise basis:

%# assuming a square and symmetric adjacency matrix
nCoords = size(adjMat,1);

%# there are at most (n^2-n) distances 
%# if this runs out of memory already, there 
%# is a way to only store existing distances to save 
%# even more memory
distances = NaN((nCoords^2-nCoords)/2,1);

ct = 0;
for row = 1:nCoords
   for col = 1:row-1
     ct = ct+1;
     if adjacencyMatrix(row,col)
        distances(ct) = sum( (coordinate(row,:)-coordinate(col,:)).^2 );
     end
   end
end

distances = sqrt(distances);

With the sparse approach, you may want to try the following (I don't think you need resi and resj, unless I totally misunderstand your problem).

[row,col]=find(adjacencyMatrix);

distances = sqrt(sum( (coordinate(row,:) - coordinate(col,:)).^2 ,2));

sparseDistanceMatrix = sparse(row,col,distances);
like image 68
Jonas Avatar answered Oct 23 '22 17:10

Jonas