I am trying to find the cosine similarity between two vectors (lists of x,y Points), and I am making some silly error that I cannot pin down. Pardon me, I am a newbie, and sorry if I am making a very simple error (which I very likely am).
Thanks for your help
public static double GetCosineSimilarity(List<Point> V1, List<Point> V2)
{
double sim = 0.0d;
int N = 0;
N = ((V2.Count < V1.Count)?V2.Count : V1.Count);
double dotX = 0.0d; double dotY = 0.0d;
double magX = 0.0d; double magY = 0.0d;
for (int n = 0; n < N; n++)
{
dotX += V1[n].X * V2[n].X;
dotY += V1[n].Y * V2[n].Y;
magX += Math.Pow(V1[n].X, 2);
magY += Math.Pow(V1[n].Y, 2);
}
return (dotX + dotY)/(Math.Sqrt(magX) * Math.Sqrt(magY));
}
Edit: Apart from syntax, my question is also about the logic, given that I am dealing with vectors of differing lengths. Also, how can the above be generalized to vectors of m dimensions? Thanks
Cosine similarity measures the similarity between two non-zero vectors of an inner product space. It is the cosine of the angle between them, so it indicates whether the two vectors point in roughly the same direction. Mathematically, it is the dot product of the vectors divided by the product of their Euclidean norms (magnitudes). It is often used to measure document similarity in text analysis.
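Written out, the definition above looks roughly like this. The symbols A, B (the two vectors), θ (the angle between them) and m (the number of dimensions) are names chosen here for illustration:

```latex
\cos\theta
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{i=1}^{m} A_i B_i}
         {\sqrt{\sum_{i=1}^{m} A_i^2} \; \sqrt{\sum_{i=1}^{m} B_i^2}}
```

Note that each norm in the denominator is computed from one vector only: the first from A, the second from B.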
Word2Vec is a model that represents words as vectors. A similarity value for two words can then be computed by applying the cosine similarity formula to the word vectors the model produces.
If you are in 2 dimensions, you can represent the vectors as (V1.X, V1.Y) and (V2.X, V2.Y), then use
public static double GetCosineSimilarity(Point V1, Point V2) {
// dot product divided by the product of the two magnitudes
return (V1.X*V2.X + V1.Y*V2.Y)
/ ( Math.Sqrt( Math.Pow(V1.X,2)+Math.Pow(V1.Y,2))
* Math.Sqrt( Math.Pow(V2.X,2)+Math.Pow(V2.Y,2))
);
}
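As a quick sanity check of the 2-D formula, here is a worked example with two illustrative vectors, (3, 4) and (4, 3). The values are made up for this example and are not taken from the question:

```latex
\cos\theta
  = \frac{3 \cdot 4 + 4 \cdot 3}{\sqrt{3^2 + 4^2} \; \sqrt{4^2 + 3^2}}
  = \frac{24}{5 \cdot 5}
  = 0.96
```

A result close to 1 means the two vectors point in nearly the same direction.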
If you are in higher dimensions, you can represent each vector as a List<double>. So, in 4 dimensions the first vector would have components V1 = (V1[0], V1[1], V1[2], V1[3]).
public static double GetCosineSimilarity(List<double> V1, List<double> V2)
{
// Only the components present in both vectors are compared
int N = Math.Min(V1.Count, V2.Count);
double dot = 0.0d;
double mag1 = 0.0d;
double mag2 = 0.0d;
for (int n = 0; n < N; n++)
{
dot += V1[n] * V2[n];
mag1 += Math.Pow(V1[n], 2);
mag2 += Math.Pow(V2[n], 2);
}
return dot / (Math.Sqrt(mag1) * Math.Sqrt(mag2));
}
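On the question of vectors with differing lengths: the function above truncates both vectors to the shorter length, so any extra components are ignored in the magnitudes as well. One possible alternative, sketched below, is to treat the shorter vector as zero-padded. This is an assumption about what "differing lengths" should mean, not the only reasonable choice. The class name `VectorMath` and the method name `CosineSimilarityPadded` are hypothetical, and returning `double.NaN` for a zero vector is likewise a design choice made for this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class VectorMath
{
    // Hypothetical helper: cosine similarity for vectors of any dimension.
    // Assumption: the shorter vector is treated as if padded with zeros.
    public static double CosineSimilarityPadded(IReadOnlyList<double> v1, IReadOnlyList<double> v2)
    {
        int n = Math.Max(v1.Count, v2.Count);
        double dot = 0.0, mag1 = 0.0, mag2 = 0.0;

        for (int i = 0; i < n; i++)
        {
            // Missing components count as zero
            double a = i < v1.Count ? v1[i] : 0.0;
            double b = i < v2.Count ? v2[i] : 0.0;

            dot  += a * b;
            mag1 += a * a;   // magnitude of v1 uses v1 only
            mag2 += b * b;   // magnitude of v2 uses v2 only
        }

        // The angle is undefined if either vector has zero length
        if (mag1 == 0.0 || mag2 == 0.0)
            return double.NaN;

        return dot / (Math.Sqrt(mag1) * Math.Sqrt(mag2));
    }
}
```

For example, `VectorMath.CosineSimilarityPadded(new List<double> { 1, 0 }, new List<double> { 1, 0, 0 })` returns 1, since the padded vectors are identical. With truncation, by contrast, the trailing components of the longer vector would not contribute to its magnitude at all.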
The last line should be
return (dotX + dotY)/(Math.Sqrt(magX) * Math.Sqrt(magY))