I have two matrices, A (dimensions M x N) and B (N x P). In fact, they are collections of vectors: row vectors in A, column vectors in B. I want to get cosine similarity scores for every pair a and b, where a is a vector (row) from matrix A and b is a vector (column) from matrix B.
I have started by multiplying the matrices, which results in matrix C (dimensions M x P):
C = A*B
However, to obtain cosine similarity scores, I need to divide each value C(i,j) by the product of the norms of the two corresponding vectors (row i of A and column j of B). Could you suggest the easiest way to do this in Matlab?
Cosine similarity measures the similarity between two vectors of an inner product space. It is measured by the cosine of the angle between two vectors and determines whether two vectors are pointing in roughly the same direction. It is often used to measure document similarity in text analysis.
In Python, the same quantity can be computed with NumPy. Create two arrays x and y of the same length with numpy.array(), then use the dot() function together with norm() from numpy.linalg: dot(x, y)/(norm(x)*norm(y)) gives the cosine similarity between the two vectors x and y.
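A minimal sketch of the NumPy calculation described here, assuming two short illustrative vectors (the values of x and y are made up for the example and are not from the original post):

```python
import numpy as np
from numpy.linalg import norm

# Two example vectors of the same length (illustrative values only)
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

# Cosine similarity: dot product divided by the product of the Euclidean norms
cos_sim = np.dot(x, y) / (norm(x) * norm(y))
print(cos_sim)
```

Because y is a positive multiple of x here, the two vectors point in the same direction and the result should be 1 up to floating-point rounding.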
As discussed in "Python: tf-idf-cosine: to find document similarity", document similarity can be calculated using the cosine similarity of tf-idf vectors.
The simplest solution would be computing the norms first using element-wise multiplication and summation along the desired dimensions:
normA = sqrt(sum(A .^ 2, 2));
normB = sqrt(sum(B .^ 2, 1));
normA and normB are now a column vector and row vector, respectively. To divide corresponding elements in A * B by normA and normB, use bsxfun like so:
C = bsxfun(@rdivide, bsxfun(@rdivide, A * B, normA), normB);
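For readers following the NumPy route mentioned earlier, the same normalization can be sketched with broadcasting playing the role of bsxfun. This is an illustrative translation rather than part of the original answer; the function name pairwise_cosine and the random test matrices are hypothetical, and the sketch assumes no row of A or column of B is all zeros (a zero norm would cause a division by zero).

```python
import numpy as np

def pairwise_cosine(A, B):
    """Cosine similarity between every row of A (M x N) and every column of B (N x P)."""
    # Row norms of A as an (M, 1) column, column norms of B as a (1, P) row
    norm_a = np.sqrt(np.sum(A ** 2, axis=1, keepdims=True))
    norm_b = np.sqrt(np.sum(B ** 2, axis=0, keepdims=True))
    # Broadcasting divides row i of A @ B by norm_a[i] and column j by norm_b[j]
    return (A @ B) / norm_a / norm_b

# Hypothetical example data: M = 3, N = 4, P = 2
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
C = pairwise_cosine(A, B)
print(C.shape)
```

Each entry C[i, j] should then match the two-vector formula applied to row i of A and column j of B.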