Cosine similarity between each row in a Dataframe in Python

Tags: python, pandas, dataframe, scikit-learn

I have a DataFrame containing multiple vectors, each with 3 entries. Each row is one vector in my representation. I need to calculate the cosine similarity between every pair of these vectors. Is it better to convert this to a matrix representation, or is there a cleaner approach within the DataFrame itself?

Here is the code that I have tried.

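import pandas as pd
from scipy import spatial

# X, Y, Z hold the three components of the vectors (defined elsewhere)
df = pd.DataFrame([X, Y, Z]).T
similarities = df.values.tolist()  # list of rows, one vector per row

for x in similarities:
    for y in similarities:
        # cosine similarity = 1 - cosine distance
        result = 1 - spatial.distance.cosine(x, y)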

asked Jul 29 '17 09:07

Jayanth Prakash Kulkarni

1 Answer

You can use sklearn.metrics.pairwise.cosine_similarity directly on the DataFrame. It returns the full matrix of pairwise cosine similarities between the rows.

Demo

import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

df = pd.DataFrame(np.random.randint(0, 2, (3, 5)))

df
##     0  1  2  3  4
##  0  1  1  1  0  0
##  1  0  0  1  1  1
##  2  0  1  0  1  0

cosine_similarity(df)
##  array([[ 1.        ,  0.33333333,  0.40824829],
##         [ 0.33333333,  1.        ,  0.40824829],
##         [ 0.40824829,  0.40824829,  1.        ]])
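If you want to keep the row labels, one option is to wrap the result back in a DataFrame. The following is only a sketch: the data and the labels "a", "b", "c" are made up for illustration, and the last step cross-checks one entry against the SciPy approach from the question.

```python
import numpy as np
import pandas as pd
from scipy import spatial
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical example data: three 3-entry vectors, one per row.
df = pd.DataFrame(
    [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 2.0, 0.0]],
    index=["a", "b", "c"],
)

# Pairwise cosine similarity between rows, relabeled with the row index.
sim = pd.DataFrame(cosine_similarity(df), index=df.index, columns=df.index)

# Cross-check one entry against the SciPy loop approach from the question.
manual = 1 - spatial.distance.cosine(df.loc["a"].to_numpy(), df.loc["b"].to_numpy())

print(sim)
print(manual)
```

With the labels in place, sim.loc["a", "b"] gives the similarity between rows "a" and "b", and it should agree with the SciPy value up to floating-point error.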

answered Oct 11 '22 16:10

miradulo

