Find euclidean distance from a point to rows in pandas dataframe

I have a dataframe:

id    lat      long
1     12.654   15.50
2     14.364   25.51
3     17.636   32.53
5     12.334   25.84
9     32.224   15.74

I want to find the Euclidean distance of these coordinates from a particular location saved in a list L1:

L1 = [11.344,7.234]

I want to create a new column in df that holds the distances:

id    lat      long     distance
1     12.654   15.50
2     14.364   25.51
3     17.636   32.53
5     12.334   25.84
9     32.224   15.74

I know how to find the Euclidean distance between two points using math.hypot():

dist = math.hypot(x2 - x1, y2 - y1)

How do I write a function, using apply or by iterating over rows, that gives me the distances?

Shubham R asked Oct 24 '17 10:10



2 Answers

Option 1: use a vectorized approach.

In [5463]: (df[['lat', 'long']] - np.array(L1)).pow(2).sum(1).pow(0.5)
Out[5463]:
0     8.369161
1    18.523838
2    26.066777
3    18.632320
4    22.546096
dtype: float64

The same expression can be written with .sub() and assigned directly to a new column:

In [5468]: df['distance'] = df[['lat', 'long']].sub(np.array(L1)).pow(2).sum(1).pow(0.5)

In [5469]: df
Out[5469]:
   id     lat   long   distance
0   1  12.654  15.50   8.369161
1   2  14.364  25.51  18.523838
2   3  17.636  32.53  26.066777
3   5  12.334  25.84  18.632320
4   9  32.224  15.74  22.546096

Option 2: use NumPy's built-in vector norm, np.linalg.norm.

In [5473]: np.linalg.norm(df[['lat', 'long']].sub(np.array(L1)), axis=1)
Out[5473]: array([  8.36916101,  18.52383805,  26.06677732,  18.63231966,   22.5460958 ])

In [5485]: df['distance'] = np.linalg.norm(df[['lat', 'long']].sub(np.array(L1)), axis=1)
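
For anyone wanting to run the two options outside an IPython session, here is a minimal self-contained sketch. It assumes pandas and NumPy are imported as pd and np and that the dataframe from the question is rebuilt by hand; neither the imports nor the construction of df appear in the snippets above, and the names diff, dist_pandas and dist_numpy are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the question's dataframe by hand (assumed construction).
df = pd.DataFrame({
    "id": [1, 2, 3, 5, 9],
    "lat": [12.654, 14.364, 17.636, 12.334, 32.224],
    "long": [15.50, 25.51, 32.53, 25.84, 15.74],
})
L1 = [11.344, 7.234]

# Coordinate differences; the 2-element array is broadcast across all rows.
diff = df[["lat", "long"]].sub(np.array(L1))

# Option 1: square, sum across the two columns, take the square root.
dist_pandas = diff.pow(2).sum(axis=1).pow(0.5)

# Option 2: row-wise Euclidean norm of the same differences.
dist_numpy = np.linalg.norm(diff.to_numpy(), axis=1)

# Both options give the same values, so either can fill the new column.
df["distance"] = dist_numpy
print(df)
```

Both options operate on whole columns at once, which is why neither needs an explicit loop over rows.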
Zero answered Sep 22 '22 13:09


Translating [(x2 - x1)^2 + (y2 - y1)^2]^(1/2) into pandas vectorised operations, you have:

df['distance'] = (df.lat.sub(11.344).pow(2).add(df.long.sub(7.234).pow(2))).pow(.5)         
df

       lat   long   distance
id                          
1   12.654  15.50   8.369161
2   14.364  25.51  18.523838
3   17.636  32.53  26.066777
5   12.334  25.84  18.632320
9   32.224  15.74  22.546096

Alternatively, using arithmetic operators:

(((df.lat - 11.344) ** 2) + (df.long - 7.234) ** 2) ** .5
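
The question also asks about apply. A minimal row-wise sketch, reusing math.hypot from the question, might look like the following. The helper name distance_to_point and the hand-built df are assumptions for illustration, and on large frames this is typically slower than the vectorised forms above.

```python
import math

import pandas as pd

# Rebuild the question's dataframe by hand (assumed construction).
df = pd.DataFrame({
    "id": [1, 2, 3, 5, 9],
    "lat": [12.654, 14.364, 17.636, 12.334, 32.224],
    "long": [15.50, 25.51, 32.53, 25.84, 15.74],
})
L1 = [11.344, 7.234]


def distance_to_point(row, point):
    """Euclidean distance from one row's (lat, long) to point (hypothetical helper)."""
    return math.hypot(row["lat"] - point[0], row["long"] - point[1])


# axis=1 calls the helper once per row; extra keyword arguments are forwarded to it.
df["distance"] = df.apply(distance_to_point, axis=1, point=L1)
print(df)
```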
cs95 answered Sep 19 '22 13:09