I have two pandas DataFrames, d1 and d2, that look like this:
d1 looks like:
output value1 value2 value2
1 100 103 87
1 201 97.5 88.9
1 144 54 85
d2 looks like:
output value1 value2 value2
0 100 103 87
0 201 97.5 88.9
0 144 54 85
0 100 103 87
0 201 97.5 88.9
0 144 54 85
The output column is a grouping variable: it is 1 for every row in d1 and 0 for every row in d2. I need to find the Euclidean distance between each row of d1 and each row of d2 (not within d1 or d2). If d1 has m rows and d2 has n rows, the distance matrix will have m rows and n columns.
You can do this with scipy.spatial.distance.cdist, which computes the distance between every pair of rows taken from two inputs:
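import pandas as pd
from scipy.spatial.distance import cdist
# Skip the first column (output) and compare every row of d1 with every row of d2
ary = cdist(d1.iloc[:, 1:], d2.iloc[:, 1:], metric='euclidean')
pd.DataFrame(ary)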
Out[1274]:
0 1 2 3 4 5
0 0.000000 101.167485 65.886266 0.000000 101.167485 65.886266
1 101.167485 0.000000 71.808495 101.167485 0.000000 71.808495
2 65.886266 71.808495 0.000000 65.886266 71.808495 0.000000
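For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds d1 and d2 from the sample rows in the question, labels the result with the original row indices, and cross-checks cdist against a plain NumPy broadcasting computation. The column name value3 is an assumption (the question shows the last header as a second value2), and the variable names features, dist and manual are only illustrative.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist

# NOTE: "value3" is an assumed name; the question lists "value2" twice.
cols = ["output", "value1", "value2", "value3"]

rows = [[100, 103, 87], [201, 97.5, 88.9], [144, 54, 85]]
d1 = pd.DataFrame([[1] + r for r in rows], columns=cols)
d2 = pd.DataFrame([[0] + r for r in rows * 2], columns=cols)

# Use every column except the grouping column for the distance.
features = cols[1:]
a = d1[features].to_numpy(dtype=float)  # shape (m, 3)
b = d2[features].to_numpy(dtype=float)  # shape (n, 3)

# m x n matrix: rows follow d1, columns follow d2.
dist = pd.DataFrame(cdist(a, b, metric="euclidean"),
                    index=d1.index, columns=d2.index)

# Cross-check: same distances via broadcasting, (m, 1, 3) - (1, n, 3).
manual = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1))

print(dist.shape)
print(np.allclose(dist.to_numpy(), manual))
```

The broadcasting version builds an m x n x k intermediate array, so for large inputs cdist is likely the better choice; it is shown here only as a sanity check.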