(using MATLAB
) I have a large coordinate matrix and a large sparse adjacency matrix for which coordinates are connected to each other. I had asked previously on SO, how to efficiently compute these distances in this SO question, but I have run into memory problems now which are a much more severe issue.
I used this MATLAB function to compute the distance matrix Dists = pdist2(fruchterman_graph(:,:),fruchterman_graph(:,:),'euclidean');
but it fails on large networks for both speed and finally memory.
This is the code which runs only on smallish graphs (not hundreds of thousands):
coordinate = kamada_graph;
[n, d] = size(kamada_graph);
assert(d == 2);
resi = sparse(adj* spdiags((1:n)',0,n,n));
resj = sparse(spdiags((1:n)',0,n,n) * adj);
res = sparse(n,n);
f = find(adj);
res(f) = sqrt((coordinate(resi(f), 1) - coordinate(resj(f), 1)) .^ 2 +...
(coordinate(resi(f), 2) - coordinate(resj(f), 2)) .^ 2);
This on large graphs creates
??? Error using ==> find Matrix is too large to return linear indices.
Use[i,j] = find(S)
for sparse matrix.
Error in ==> modularize_graphs at 49[f] = find(adj)
;
I changed the line which is referred to to be:
[i,j] = find(ajd);
res(i,j) = sqrt((coordinate(resi(i,j), 1) - coordinate(resj(i,j), 1)) .^ 2 +...
(coordinate(resi(i,j), 2) - coordinate(resj(i,j), 2)) .^ 2);
and on the smallish network now (~500 vertices) the error:
??? Out of memory.
Type HELP MEMORY for your options.
Error in ==> modularize_graphs at 50res(i,j) = sqrt((coordinate(resi(i,j), 1) - coordinate(resj(i,j), 1)) .^ 2 +...
Is there anyway to compute the distance matrix using the adjacency matrix and a coordinate matrix, (N,2)
of x
,y
values, without falling into memory issues and possibly making it not too slow as well?
The desired output is a matrix of distances, the distances between all points that are connected according to the Adj
adjacency matrix.
distance=pdist(pair,'euclidean'); "distance" will give you the euclidean distance between the first and second coordinates.
The pairwise distance between observations i and j is in D((i-1)*(m-i/2)+j-i) for i≤j. You can convert D into a symmetric matrix by using the squareform function. Z = squareform(D) returns an m-by-m matrix where Z(i,j) corresponds to the pairwise distance between observations i and j.
In mathematics, computer science and especially graph theory, a distance matrix is a square matrix (two-dimensional array) containing the distances, taken pairwise, between the elements of a set. Depending upon the application involved, the distance being used to define this matrix may or may not be a metric.
To calculate pointwise distances with using a minimum amount of memory, you can always iterate through your adjacency matrix on an element-wise basis:
%# assuming a square and symmetric adjacency matrix
nCoords = size(adjMat,1);
%# there are at most (n^2-n) distances
%# if this runs out of memory already, there
%# is a way to only store existing distances to save
%# even more memory
distances = NaN((nCoords^2-nCoords)/2,1);
ct = 0;
for row = 1:nCoords
for col = 1:row-1
ct = ct+1;
if adjacencyMatrix(row,col)
distances(ct) = sum( (coordinate(row,:)-coordinate(col,:)).^2 );
end
end
end
distances = sqrt(distances);
With the sparse approach, you may want to try the following (I don't think you need resi
and resj
, unless I totally misunderstand your problem).
[row,col]=find(adjacencyMatrix);
distances = sqrt(sum( (coordinate(row,:) - coordinate(col,:)).^2 ,2));
sparseDistanceMatrix = sparse(row,col,distances);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With