Python pandas: Finding cosine similarity of two columns

Suppose I have two columns in a python pandas.DataFrame:

          col1 col2
item_1    158  173
item_2     25  191
item_3    180   33
item_4    152  165
item_5     96  108

What's the best way to take the cosine similarity of these two columns?

How do you find the cosine similarity between two columns in Python?

First, concatenate the two columns of interest into a new DataFrame, then drop any rows that contain NaN. After that, the two columns hold only rows where both have a value, and you can compare them with cosine distance or any other pairwise distance you wish.
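The steps above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the NaN entries are made up for illustration (the question's data has none), and it relies on scipy.spatial.distance.cosine returning a distance, so the similarity is 1 minus that value.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import cosine

# Hypothetical data: the question's values with two NaN entries added for illustration.
df = pd.DataFrame({"col1": [158, 25, np.nan, 152, 96],
                   "col2": [173, 191, 33, np.nan, 108]})

# Keep only the two columns of interest and drop rows where either value is missing.
pair = pd.concat([df["col1"], df["col2"]], axis=1).dropna()

# cosine() is a distance; subtract it from 1 to get the similarity.
similarity = 1 - cosine(pair["col1"], pair["col2"])
print(similarity)
```

Dropping rows on the combined frame, rather than on each column separately, keeps the two vectors aligned and of equal length.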

How do you find cosine similarity in Python?

Cosine similarity is computed with the formula cos(A, B) = (A.B) / (||A|| * ||B||), where A and B are vectors. A.B is the dot product of A and B, computed as the sum of the element-wise products of A and B. ||A|| is the L2 norm of A, computed as the square root of the sum of the squares of the elements of A (and likewise for ||B||).
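The formula can be written out directly with NumPy. This is a small sketch for illustration; the function name cosine_similarity_1d is made up here and is not part of any library.

```python
import numpy as np

def cosine_similarity_1d(a, b):
    """Cosine similarity of two 1-D vectors: (A . B) / (||A|| * ||B||)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    dot = np.dot(a, b)                 # sum of element-wise products
    norm_a = np.sqrt(np.sum(a ** 2))   # L2 norm of A
    norm_b = np.sqrt(np.sum(b ** 2))   # L2 norm of B
    return dot / (norm_a * norm_b)

# The two columns from the question.
col1 = [158, 25, 180, 152, 96]
col2 = [173, 191, 33, 165, 108]
result = cosine_similarity_1d(col1, col2)
print(result)  # about 0.7498
```

Note that this sketch does not guard against an all-zero vector, for which the denominator is zero and the result is undefined.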

How do you find the difference between two columns in pandas?

The difference between rows or columns of a pandas DataFrame is computed with the diff() method. The axis parameter decides whether the difference is taken between rows (axis=0, the default) or between columns (axis=1). With a positive periods value, each row has the row that many positions before it subtracted from it; with the default periods=1, that is the previous row.
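As a quick illustration of diff() on the question's data (a sketch only; the variable names row_diff, col_diff and direct are chosen here for illustration):

```python
import pandas as pd

df = pd.DataFrame({"col1": [158, 25, 180, 152, 96],
                   "col2": [173, 191, 33, 165, 108]},
                  index=["item_1", "item_2", "item_3", "item_4", "item_5"])

# Row-wise difference (axis=0, periods=1): each row minus the previous row.
# The first row has no predecessor, so it becomes NaN.
row_diff = df.diff()

# Column-wise difference (axis=1): col2 minus col1.
# The first column has no predecessor, so it becomes NaN.
col_diff = df.diff(axis=1)

# Plain subtraction is the most direct way to compare two named columns.
direct = df["col2"] - df["col1"]

print(row_diff)
print(col_diff)
print(direct)
```

This measures element-wise differences, which is a different quantity from cosine similarity: it depends on magnitudes, whereas cosine similarity depends only on the angle between the two vectors.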

Is that what you're looking for?

from scipy.spatial.distance import cosine
from pandas import DataFrame


df = DataFrame({"col1": [158, 25, 180, 152, 96],
                "col2": [173, 191, 33, 165, 108]})

# cosine() returns the cosine distance, so 1 - distance is the similarity.
print(1 - cosine(df["col1"], df["col2"]))

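You can also use cosine_similarity or other similarity metrics from sklearn.metrics.pairwise. It expects 2-D inputs with one row per sample, and recent scikit-learn versions reject 1-D arrays, so reshape each column into a single row first.

from sklearn.metrics.pairwise import cosine_similarity

cosine_similarity(df["col1"].to_numpy().reshape(1, -1),
                  df["col2"].to_numpy().reshape(1, -1))
Out[4]: array([[0.7498213]])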

Tags: python, pandas, dataframe, cosine-similarity

hlin117

2 Answers

xbello

Amir Imani
