
Cosine distance of vector to matrix

In Python, is there an efficient, vectorized way to calculate the cosine distance from a sparse vector u to each row of a sparse matrix v, resulting in an array whose elements are cosine(u, v[0]), cosine(u, v[1]), ..., cosine(u, v[n])?

David asked Oct 19 '22 10:10



2 Answers

Not natively. However, the SciPy library can compute the cosine distance between two vectors for you: http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.spatial.distance.cosine.html. You can use this as a stepping stone to build a version that takes a matrix.
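A minimal sketch of such a matrix version, assuming dense NumPy inputs (scipy.spatial.distance.cosine takes two 1-D arrays, so sparse data would first have to be converted). The helper name cosine_to_rows and the sample data are made up for illustration. Note that this still loops over the rows in Python, so it is not vectorized.

```python
import numpy as np
from scipy.spatial.distance import cosine


def cosine_to_rows(u, m):
    """Cosine distance from the 1-D array u to each row of the 2-D array m."""
    # cosine() compares two 1-D vectors, so it is called once per row.
    return np.array([cosine(u, row) for row in m])


# Hypothetical example data: one vector and a 3 x 2 matrix.
u = np.array([1.0, 0.0])
m = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
distances = cosine_to_rows(u, m)  # one distance per row of m
```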

the blizz answered Oct 30 '22 15:10



Add the vector onto the end of the matrix, calculate a pairwise distance matrix using sklearn.metrics.pairwise_distances() and then extract the relevant column/row.

So for vector v (with shape (D,)) and matrix m (with shape (N,D)) do:

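import numpy as np
from sklearn.metrics import pairwise_distances

# Append v as the last row of m; the result has shape (N + 1, D)
new_m = np.concatenate([m, v[None, :]], axis=0)
# Cosine distance between every pair of rows, shape (N + 1, N + 1)
distance_matrix = pairwise_distances(new_m, metric="cosine")
# Last row is v against every row; drop the final entry (v against itself)
distances = distance_matrix[-1, :-1]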

Not ideal, since it computes the full (N+1) x (N+1) distance matrix just to keep one row, but better than iterating!

This method can be extended to query more than one vector. To do this, concatenate a stack of query vectors onto the matrix instead, then extract the corresponding rows.
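The multi-vector extension might look like the sketch below. The array names q, stacked and direct and the random data are assumptions for illustration. The last two lines use a different route from the one described above: pairwise_distances also accepts a second array Y, which should give the same (K, N) block without the concatenation.

```python
import numpy as np
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(0)
m = rng.random((5, 3))  # made-up matrix: N = 5 rows, D = 3
q = rng.random((2, 3))  # made-up queries: K = 2 vectors

# Stack the queries under m; the result has shape (N + K, D).
stacked = np.concatenate([m, q], axis=0)
distance_matrix = pairwise_distances(stacked, metric="cosine")

k = q.shape[0]
# Keep the last K rows and drop the last K columns (query-to-query distances).
distances = distance_matrix[-k:, :-k]  # shape (K, N)

# Alternative: pass the two arrays separately to compute only the (K, N) block.
direct = pairwise_distances(q, m, metric="cosine")
```

Here distances[i, j] is the cosine distance between query i and row j of m.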

hank answered Oct 30 '22 14:10
