Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas: finding points within maximum distance

I am trying to find pairs of (x,y) points within a maximum distance of each other. I thought the simplest thing to do would be to generate a DataFrame and go through each point, one by one, calculating if there are points with coordinates (x,y) within distance r of the given point (x_0, y_0). Then, divide the total number of discovered pairs by 2.

%pylab inline
import pandas as pd

def find_nbrs(low, high, num, max_d):
    x = random.uniform(low, high, num)
    y = random.uniform(low, high, num)
    points = pd.DataFrame({'x':x, 'y':y})

    tot_nbrs = 0

    for i in arange(len(points)):
        x_0 = points.x[i]
        y_0 = points.y[i]

        pt_nbrz = points[((x_0 - points.x)**2 + (y_0 - points.y)**2) < max_d**2]
        tot_nbrs += len(pt_nbrz)
        plot (pt_nbrz.x, pt_nbrz.y, 'r-')

    plot (points.x, points.y, 'b.')
    return tot_nbrs

print find_nbrs(0, 1, 50, 0.1)
  1. First of all, it's not always finding the right pairs (I see points that are within the stated distance that are not labeled).

  2. If I write plot(..., 'or'), it highlights all the points. Which means that pt_nbrz = points[((x_0 - points.x)**2 + (y_0 - points.y)**2) < max_d**2] returns at least one (x,y). Why? Shouldn't it return an empty array if the comparison is False?

  3. How do I do all of the above more elegantly in Pandas? For example, without having to loop through each element.

like image 484
Anarcho-Chossid Avatar asked Dec 09 '25 07:12

Anarcho-Chossid


1 Answers

The functionality you're looking for is included in scipy's spatial distance module.

Here's an example of how you could use it. The real magic is in squareform(pdist(points)).

from scipy.spatial.distance import pdist, squareform
import numpy as np
import matplotlib.pyplot as plt

points = np.random.uniform(-.5, .5, (1000,2))

# Compute the distance between each different pair of points in X with pdist.
# Then, just for ease of working, convert to a typical symmetric distance matrix
# with squareform.
dists = squareform(pdist(points))

poi = points[4] # point of interest
dist_min = .1
close_points = dists[4] < dist_min

print("There are {} other points within a distance of {} from the point "
    "({:.3f}, {:.3f})".format(close_points.sum() - 1, dist_min, *poi))

There are 27 other points within a distance of 0.1 from the point (0.194, 0.160)

For visualization purposes:

f,ax = plt.subplots(subplot_kw=
    dict(aspect='equal', xlim=(-.5, .5), ylim=(-.5, .5)))
ax.plot(points[:,0], points[:,1], 'b+ ')
ax.plot(poi[0], poi[1], ms=15, marker='s', mfc='none', mec='g')
ax.plot(points[close_points,0], points[close_points,1],
    marker='o', mfc='none', mec='r', ls='')  # draw all points within distance

t = np.linspace(0, 2*np.pi, 512)
circle = dist_min*np.vstack([np.cos(t), np.sin(t)]).T
ax.plot((circle+poi)[:,0], (circle+poi)[:,1], 'k:') # Add a visual check for that distance
plt.show()

enter image description here

like image 195
Oliver W. Avatar answered Dec 11 '25 20:12

Oliver W.



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!