
scipy cdist with sparse matrices

I need to calculate the distances between two sets of vectors, source_matrix and target_matrix.

I have the following line, when both source_matrix and target_matrix are of type scipy.sparse.csr.csr_matrix:

distances = sp.spatial.distance.cdist(source_matrix, target_matrix)

And I end up getting the following partial exception traceback:

  File "/usr/local/lib/python2.7/site-packages/scipy/spatial/distance.py", line 2060, in cdist
    [XA] = _copy_arrays_if_base_present([_convert_to_double(XA)])
  File "/usr/local/lib/python2.7/site-packages/scipy/spatial/distance.py", line 146, in _convert_to_double
    X = X.astype(np.double)
ValueError: setting an array element with a sequence.

This seems to indicate that the sparse matrices are being treated as dense numpy arrays, which both fails and defeats the purpose of using sparse matrices.

Any advice?

NirIzr asked Oct 04 '16 03:10


1 Answer

I appreciate this post is quite old, but as one of the comments suggested, you could use the sklearn implementation which accepts sparse vectors and matrices.

Take two random vectors, for example:

import scipy.sparse
import sklearn.metrics

a = scipy.sparse.rand(m=1,n=100,density=0.2,format='csr')
b = scipy.sparse.rand(m=1,n=100,density=0.2,format='csr')
sklearn.metrics.pairwise.pairwise_distances(X=a, Y=b, metric='euclidean')
# array([[ 3.14837228]])   (example output; values vary between runs)

Or even if a is a matrix and b is a vector:

a = scipy.sparse.rand(m=500,n=100,density=0.2,format='csr')
b = scipy.sparse.rand(m=1,n=100,density=0.2,format='csr')
sklearn.metrics.pairwise.pairwise_distances(X=a, Y=b, metric='euclidean')
# example output (values vary between runs):
# array([[ 2.9864606 ],
#        [ 3.33862248],
#        [ 3.45803465],
#        [ 3.15453179],
#        ...
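
If you want to convince yourself that the sparse path gives the same numbers cdist would, you can compare against a densified copy on a small input. This is only a rough sketch: the helper name sparse_euclidean, the shapes and the seeds are made up for illustration, and the dense check is only feasible when the matrices fit in memory.

```python
import numpy as np
import scipy.sparse
from scipy.spatial.distance import cdist
from sklearn.metrics import pairwise_distances


def sparse_euclidean(source, target):
    """Pairwise Euclidean distances; inputs may be CSR sparse matrices.

    Hypothetical helper, just a thin wrapper around pairwise_distances.
    """
    return pairwise_distances(X=source, Y=target, metric='euclidean')


# Small illustrative inputs (shapes and seeds are arbitrary assumptions).
source_matrix = scipy.sparse.random(50, 100, density=0.2, format='csr', random_state=0)
target_matrix = scipy.sparse.random(30, 100, density=0.2, format='csr', random_state=1)

# Sparse inputs go straight in; no .toarray() needed here.
distances = sparse_euclidean(source_matrix, target_matrix)

# Reference result: densify and use cdist (small matrices only).
reference = cdist(source_matrix.toarray(), target_matrix.toarray())

print(distances.shape)
print(np.allclose(distances, reference, atol=1e-6))
```

The result has one row per row of source_matrix and one column per row of target_matrix, the same layout cdist returns.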

scipy.spatial.distance does not support sparse matrices, so sklearn is the more practical choice here. You can also pass the n_jobs argument to sklearn.metrics.pairwise.pairwise_distances, which splits the pairwise matrix into slices and computes them in parallel; that can help if your matrices are very large.
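
As a rough illustration of the n_jobs option, here is a sketch that runs the same computation serially and with two workers and checks that they agree. The shapes, seeds and n_jobs=2 are arbitrary assumptions, and whether the parallel run is actually faster depends on your data and machine. Note that only sklearn's own metrics ('cityblock', 'cosine', 'euclidean', 'l1', 'l2', 'manhattan') accept sparse input; the metrics that pairwise_distances borrows from scipy.spatial.distance do not.

```python
import numpy as np
import scipy.sparse
from sklearn.metrics import pairwise_distances

# Illustrative sparse inputs (shapes and seeds are arbitrary assumptions).
source_matrix = scipy.sparse.random(200, 100, density=0.2, format='csr', random_state=0)
target_matrix = scipy.sparse.random(40, 100, density=0.2, format='csr', random_state=1)

# Single-process computation.
serial = pairwise_distances(source_matrix, target_matrix, metric='euclidean')

# Same computation split across two workers.
parallel = pairwise_distances(source_matrix, target_matrix,
                              metric='euclidean', n_jobs=2)

# Another sparse-capable metric from sklearn's own list.
manhattan = pairwise_distances(source_matrix, target_matrix, metric='manhattan')

print(serial.shape, manhattan.shape)
print(np.allclose(serial, parallel))
```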

Hope that helps

PyRsquared answered Oct 06 '22 04:10