Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Nearest neighbor 1 dimensional data with a specified range

I have two nested lists A and B:

A = [[50,140],[51,180],[54,500],......]

B = [[50.1, 170], [51,200],[55,510].....]

The 1st element in each inner list runs from 0 to around 1e5, the 0th element runs from around 50 up to around 700, these elements are unsorted. What i want to do, is run through each element in A[n][1] and find the closest element in B[n][1], but when searching for the nearest neighbor i want to only search within an interval defined by A[n][0] plus or minus 0.5.

I have been using the function:

def find_nearest_vector(array, value): 
   idx = np.array([np.linalg.norm(x+y) for (x,y) in array-value]).argmin()
   return array[idx]

Which finds the nearest neighbor between the coordinates A[0][:]and B[0][:], for example. However, this I need to confine the search range to a rectangle around some small shift in the value A[0][0]. Also, I do not want to reuse elements - I want a unique bijection between each value A[n][1] to B[n][1] within the interval A[n][0] +/- 0.5.

I have been trying to use Scipy's KDTree, but this reuses elements and I don't know how to confine the search range. Effectively, I want to do a one dimensional NNN search on a two dimensional nested list along a specific axis where the neighborhood in which the NNN search is within a hyper-rectangle defined by the 0th element in each inner list plus or minus some small shift.

like image 254
Holtz Avatar asked Sep 15 '13 20:09

Holtz


1 Answers

I use numpy.argsort(), numpy.searchsorted(), numpy.argmin() to do the search.

%pylab inline
import numpy as np
np.random.seed(0)
A = np.random.rand(5, 2)
B = np.random.rand(100, 2)
xaxis_range = 0.02
order = np.argsort(B[:, 0])
bx = B[order, 0]
sidx = np.searchsorted(bx, A[:, 0] - xaxis_range, side="right")
eidx = np.searchsorted(bx, A[:, 0] + xaxis_range, side="left")
result = []
for s, e, ay in zip(sidx, eidx, A[:, 1]):
    section = order[s:e]
    by = B[section, 1]
    idx = np.argmin(np.abs(ay-by))
    result.append(B[section[idx]])
result = np.array(result)

I plot the result as following:

plot(A[:, 0], A[:, 1], "o")
plot(B[:, 0], B[:, 1], ".")
plot(result[:, 0], result[:, 1], "x")

the output:

enter image description here

like image 53
HYRY Avatar answered Oct 20 '22 10:10

HYRY