Euclidean distance between two pandas dataframes

I have two dataframes:

df1 of the form

user_id  | x_coord  | y_coord
 214         -55.2      22.1
 214         -55.2      22.1
 214         -55.2      22.1
...

and df2, of the same form, but with different users:

user_id  | x_coord  | y_coord
 512         -15.2      19.1
 362          65.1      71.4
 989         -84.8      13.7
...

I want to find the Euclidean distance between the user in df1 and every user in df2. To do this, I need to compute the Euclidean distance between the two dataframes, based on the last two columns, so that I can find out which users in the second dataframe are closest to user 214.

I found this answer but it is not what I need, as my two dataframes have equal shapes and I need the distance computed in a per-row manner:

Euclidean_Distance_i(row_i_df1, row_i_df2)

and save all these distances in a list that is the same length as these dataframes.

Qubix asked Jan 27 '23 07:01



1 Answer

Try taking the row-wise norm of the coordinate differences:

import numpy as np

def Euclidean_Dist(df1, df2, cols=['x_coord','y_coord']):
    # Subtract the coordinate columns row by row, then take the norm of each row
    return np.linalg.norm(df1[cols].values - df2[cols].values,
                          axis=1)

Test:

import pandas as pd

df1 = pd.DataFrame({'user_id':[214,214,214],
                    'x_coord':[-55.2,-55.2,-55.2],
                    'y_coord':[22.1,22.1,22.1]})

df2 = pd.DataFrame({'user_id':[512, 362, 989],
                    'x_coord':[-15.2, 65.1, -84.8],
                    'y_coord':[19.1, 71.4, 13.7]})

Euclidean_Dist(df1, df2)

outputs:

array([ 40.11234224, 130.0099227 ,  30.76881538])
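
Since the goal in the question is to find which users in df2 are closest to user 214, one possible follow-up is to attach the distances to df2 and sort by them. This is only a sketch under the same sample data as the test above; the column name `distance` and the variables `ranked` and `closest_user` are names I made up for illustration, not part of the original data.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'user_id': [214, 214, 214],
                    'x_coord': [-55.2, -55.2, -55.2],
                    'y_coord': [22.1, 22.1, 22.1]})
df2 = pd.DataFrame({'user_id': [512, 362, 989],
                    'x_coord': [-15.2, 65.1, -84.8],
                    'y_coord': [19.1, 71.4, 13.7]})

cols = ['x_coord', 'y_coord']

# Row-wise Euclidean distance between matching rows of df1 and df2
dist = np.linalg.norm(df1[cols].to_numpy() - df2[cols].to_numpy(), axis=1)

# 'distance' is a hypothetical column name chosen for this sketch
ranked = df2.assign(distance=dist).sort_values('distance')

# user_id of the nearest row in df2 (first row after sorting)
closest_user = int(ranked['user_id'].iloc[0])

# Plain Python list of distances, same length as the dataframes
dist_list = dist.tolist()
```

With the sample data, `ranked` lists user 989 first, then 512, then 362. Note that this relies on both dataframes having the same number of rows in the same order, as stated in the question.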
Quang Hoang answered Jan 30 '23 21:01
