How to compute cosine similarity using two matrices

I have two matrices, A (dimensions M x N) and B (N x P). In fact, they are collections of vectors - row vectors in A, column vectors in B. I want to get cosine similarity scores for every pair a and b, where a is a vector (row) from matrix A and b is a vector (column) from matrix B.

I have started by multiplying the matrices, which results in matrix C (dimensions M x P).

C = A*B

However, to obtain cosine similarity scores, I need to divide each value C(i,j) by the product of the norms of the two corresponding vectors (row i of A and column j of B). Could you suggest the easiest way to do this in MATLAB?

John Manak asked Jan 15 '13 14:01

1 Answer

The simplest solution is to compute the norms first, using element-wise squaring and summation along the desired dimensions:

normA = sqrt(sum(A .^ 2, 2));
normB = sqrt(sum(B .^ 2, 1));

normA and normB are now a column vector and row vector, respectively. To divide corresponding elements in A * B by normA and normB, use bsxfun like so:

C = bsxfun(@rdivide, bsxfun(@rdivide, A * B, normA), normB);
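
As a quick sanity check of the approach above, here is a minimal sketch on made-up data. The matrices, the variable names C1, C2, C3, and ref, and the tolerance are illustrative assumptions, not part of the original answer. It also shows two forms that should be equivalent to the bsxfun call: dividing by the outer product of the norms, and (on R2016b or later, where implicit expansion is available) plain element-wise division.

```matlab
% Made-up example data: rows of A and columns of B are the vectors.
A = [1 0 0; 0 2 0; 1 1 0];      % M = 3 row vectors of length N = 3
B = [1 0; 1 0; 0 5];            % P = 2 column vectors of length N = 3

normA = sqrt(sum(A .^ 2, 2));   % M-by-1 column of row norms
normB = sqrt(sum(B .^ 2, 1));   % 1-by-P row of column norms

% Form used in the answer: divide by row norms, then by column norms.
C1 = bsxfun(@rdivide, bsxfun(@rdivide, A * B, normA), normB);

% Alternative: normA * normB is the M-by-P outer product of the norms.
C2 = (A * B) ./ (normA * normB);

% Alternative for R2016b and later, relying on implicit expansion.
C3 = ((A * B) ./ normA) ./ normB;

% Spot-check one entry against the textbook definition.
i = 3; j = 1;
ref = dot(A(i, :), B(:, j)) / (norm(A(i, :)) * norm(B(:, j)));
assert(abs(C1(i, j) - ref) < 1e-12);

% All three forms should agree up to rounding.
assert(max(abs(C1(:) - C2(:))) < 1e-12);
assert(max(abs(C1(:) - C3(:))) < 1e-12);
```

In this example, row 3 of A and column 1 of B point in the same direction, so C1(3,1) should be 1 up to rounding. Note that a zero vector in A or B has norm 0, so the corresponding entries come out as NaN (0/0); handle that case separately if it can occur in your data.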
Eitan T answered Sep 19 '22 04:09