How to find the closest elements in two arrays?

Tags:

python-2.7

I have two numpy arrays, like X=[x1,x2,x3,x4], y=[y1,y2,y3,y4]. Three of the elements are close to each other, and the fourth may or may not be close.

Like:

X   [ 84.04467948  52.42447842  39.13555678  21.99846595]
y   [ 78.86529444  52.42447842  38.74910101  21.99846595]

Or it can be:

X   [ 84.04467948  60  52.42447842  39.13555678]
y   [ 78.86529444  52.42447842  38.74910101  21.99846595]

I want to define a function that finds the corresponding indices in the two arrays. In the first case:

y[0] corresponds to X[0],
y[1] corresponds to X[1],
y[2] corresponds to X[2],
y[3] corresponds to X[3]

And in second case:

y[0] corresponds to X[0],
y[1] corresponds to X[2],
y[2] corresponds to X[3],
and y[3] corresponds to X[1].

I can't write a function that solves the problem completely. Please help.

asked Aug 22 '16 11:08

insomnia

2 Answers

You can start by precomputing the distance matrix as shown in this answer:

import numpy as np

X = np.array([84.04467948,60.,52.42447842,39.13555678])
Y = np.array([78.86529444,52.42447842,38.74910101,21.99846595])

dist = np.abs(X[:, np.newaxis] - Y)
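To make the broadcasting step concrete, here is a minimal sketch on made-up numbers (the arrays a and b are illustrative only, not from the question). Adding np.newaxis turns the first array into a column, so the subtraction produces one row per element of a and one column per element of b:

```python
import numpy as np

# Illustrative data, chosen so the differences are easy to check by hand
a = np.array([10., 20., 30.])
b = np.array([11., 31.])

# a[:, np.newaxis] has shape (3, 1); broadcasting against b (shape (2,))
# yields a (3, 2) matrix with d[i, j] == abs(a[i] - b[j])
d = np.abs(a[:, np.newaxis] - b)

print(d.shape)  # (3, 2)
print(d)
```

Row i of the result holds the distances from a[i] to every element of b, which is why taking the minimum along axis 1 finds the closest element of the second array.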

Now you can compute the index of the minimum along one axis (I chose axis 1, which finds the closest element of Y for every element of X):

potentialClosest = dist.argmin(axis=1)

This may still contain duplicates (as in your second case). To check for that, you can find all Y indices that appear in potentialClosest by using np.unique:

closestFound, closestCounts = np.unique(potentialClosest, return_counts=True)
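For the second case from the question, the intermediate results of these steps look as follows. This is just the code above collected into one runnable snippet with the values printed:

```python
import numpy as np

X = np.array([84.04467948, 60., 52.42447842, 39.13555678])
Y = np.array([78.86529444, 52.42447842, 38.74910101, 21.99846595])

dist = np.abs(X[:, np.newaxis] - Y)

# Index of the closest Y element for every X element
potentialClosest = dist.argmin(axis=1)

# Which Y indices were chosen, and how often
closestFound, closestCounts = np.unique(potentialClosest, return_counts=True)

print(potentialClosest)  # [0 1 1 2]
print(closestFound)      # [0 1 2]
print(closestCounts)     # [1 2 1]
```

Both X[1] = 60 and X[2] = 52.42447842 pick Y[1], so index 1 has a count of 2 and Y[3] is never chosen.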

Now you can check for duplicates by testing whether closestFound.shape[0] == X.shape[0]. If so, you're golden and potentialClosest contains the partner for every element of X. In your second case, though, one element occurs twice, so closestFound has only X.shape[0]-1 elements and closestCounts contains one 2 instead of only 1s. For all elements with count 1 the partner is already found. Of the two candidates that share the element with count 2, you have to choose the closer one; the partner of the one with the larger distance is the one element of Y that is not in closestFound. Its index can be found as:

missingPartnerIndex = np.where(
        ~np.isin(np.arange(Y.shape[0]), closestFound)
        )[0][0]
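As a possible alternative (a sketch, not part of the original answer), np.setdiff1d returns the values of one array that are absent from another, so it also lists every unused Y index when more than one is missing. The contents of closestFound and the name n_y are filled in by hand here for illustration:

```python
import numpy as np

# Y indices that were chosen at least once (values from the question's second case)
closestFound = np.array([0, 1, 2])
n_y = 4  # hypothetical name standing in for Y.shape[0]

# All Y indices that never appear in closestFound
missing = np.setdiff1d(np.arange(n_y), closestFound)

print(missing)  # [3]
```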

You can do the matching in a loop (there might be a nicer way using numpy). This solution is rather ugly but works. Any suggestions for improvement are much appreciated:

partners = np.empty_like(X, dtype=int)
nonClosePartnerFound = False
for i in range(X.shape[0]):
    if closestCounts[closestFound==potentialClosest[i]][0]==1:
        # A unique partner was found
        partners[i] = potentialClosest[i]
    else:
        # Partner is not unique
        if nonClosePartnerFound:
            # The other candidate already received the unused Y index
            partners[i] = potentialClosest[i]
        else:
            if np.argmin(dist[:, potentialClosest[i]]) == i:
                # X[i] is the closer of the two candidates
                partners[i] = potentialClosest[i]
            else:
                # X[i] is the farther candidate: assign the unused Y index
                partners[i] = missingPartnerIndex
                nonClosePartnerFound = True
print(partners)  # [0 3 1 2] for the X and Y above

This answer only works if at most one pair is not close. Otherwise, you will have to define how to find the correct partner for multiple non-close elements. Sadly, it is neither a very generic nor a very nice solution, but hopefully you will find it a helpful starting point.
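One possible way to define the matching for several non-close elements is sketched below. It is an assumption about what "correct partner" should mean, not something the question specifies: pairs are accepted in order of increasing distance, so the closest pairs are fixed first and whatever is left over gets paired last. The function name greedy_match is made up for this example, and it assumes both arrays have the same length.

```python
import numpy as np

def greedy_match(X, Y):
    """Return, for each index i of X, the index of its partner in Y.

    Assumed rule: accept (i, j) pairs from smallest to largest distance,
    skipping any pair whose X or Y element is already matched.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    dist = np.abs(X[:, np.newaxis] - Y)

    partners = np.full(X.shape[0], -1, dtype=int)  # -1 means "not matched yet"
    used_y = np.zeros(Y.shape[0], dtype=bool)

    # Flattened indices of dist, ordered by increasing distance
    for flat in np.argsort(dist, axis=None, kind="stable"):
        i, j = np.unravel_index(flat, dist.shape)
        if partners[i] == -1 and not used_y[j]:
            partners[i] = j
            used_y[j] = True
    return partners

X = [84.04467948, 60., 52.42447842, 39.13555678]
Y = [78.86529444, 52.42447842, 38.74910101, 21.99846595]
print(greedy_match(X, Y))  # [0 3 1 2]
```

For the question's second case this gives the same partners as the loop above. Note that minimizing the total absolute distance would not single out this answer: in one dimension, simply pairing the sorted arrays in order reaches the same total here.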

answered Sep 21 '22 23:09

jotasi

Using this answer https://stackoverflow.com/a/8929827/3627387 and https://stackoverflow.com/a/12141207/3627387

FIXED

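def find_closest(alist, target):
    # Return the element of alist with the smallest absolute difference to target
    return min(alist, key=lambda x: abs(x - target))

X = [ 84.04467948,  52.42447842,  39.13555678,  21.99846595]
Y = [ 78.86529444,  52.42447842,  38.74910101,  21.99846595]

def list_matching(list1, list2):
    # Work on a copy so matched elements can be removed without changing list1
    list1_copy = list1[:]
    pairs = []
    for i, e in enumerate(list2):
        elem = find_closest(list1_copy, e)
        # Store [index in list2, index in list1]; index() returns the first
        # occurrence, so this assumes list1 has no duplicate values
        pairs.append([i, list1.index(elem)])
        # Each element of list1 may be matched only once
        list1_copy.remove(elem)
    return pairs

print(list_matching(X, Y))  # [[0, 0], [1, 1], [2, 2], [3, 3]]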
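Because list_matching looks up the matched value with list1.index, two equal values in list1 would both map to the first occurrence. A variant that tracks indices instead of values avoids this. This is a sketch with a made-up name (match_by_index), shown on the question's second case:

```python
def match_by_index(list1, list2):
    """Greedy matching like list_matching, but tracking indices of list1
    so that duplicate values in list1 are handled correctly."""
    remaining = list(range(len(list1)))  # indices of list1 not yet matched
    pairs = []
    for i, e in enumerate(list2):
        # Closest still-unmatched element of list1, identified by its index
        j = min(remaining, key=lambda k: abs(list1[k] - e))
        pairs.append([i, j])
        remaining.remove(j)
    return pairs

X = [84.04467948, 60.0, 52.42447842, 39.13555678]
Y = [78.86529444, 52.42447842, 38.74910101, 21.99846595]
print(match_by_index(X, Y))  # [[0, 0], [1, 2], [2, 3], [3, 1]]
```

Each pair is [index in Y, index in X], matching the correspondence asked for in the question. Like the original, this processes list2 in order, so an early element can take a partner that a later element is closer to.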

answered Sep 17 '22 23:09

Sardorbek Imomaliev
