Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Round each number of list to most near number in another list

Suppose I have a certain list x with numbers, and another list y with other numbers. Elements of y should be elements of x, but due to noise in measurements, they are kind of different. I want to find, for each value of y, the value of x that is the nearest to it.

I can do this with some loops and check, for each element y[i], which element x[j] minimizes abs(x[j]-y[i]), but I'm pretty sure there is a much easier, cleaner way to do this. The lists could be huge so I'm looking for efficient code here.

The code I've written so far is:

x_in = [1.1, 2.2, 3, 4, 6.2]
y_in = [0.9, 2, 1.9, 6, 5, 6, 6.2, 0.5, 0, 3.1]
desired_output = [1.1, 2.2, 2.2, 6.2, 4, 6.2, 6.2, 1.1, 1.1, 3]

y_out = []

for y in y_in:
    aux = [abs(l - y) for l in x_in]
    mn,idx = min( (aux[i],i) for i in range(len(aux)) )
    y_out.append(x_in[idx])

>>> y_out == desired_output
True

But I don't know if there is a more efficient way to do this...

EDIT:

Due to my ignorance, I forgot to clarify something that may be of relevance according to the comments I've recieved.

  • The x list is sorted.
  • x is the only list that can have a pretty big size: between 500,000 and 1,000,000 elements, in general. y will in general be really small, less than 10 elements.
like image 904
Tendero Avatar asked Jul 18 '18 20:07

Tendero


2 Answers

Given that x is sorted, the most efficient way to do this is using bisect to search for the closest value. Just create a list of mid points between the x values and run bisect on those:

In [69]: mid_points = [(x1+x2)/2 for x1, x2 in zip(x[1:], x[:-1])]

In [70]: mid_points
Out[70]: [1.5, 2.5, 3.5, 4.5]

In [72]: [x[bisect.bisect(mid_points, v)] for v in y]
Out[72]: [1, 1, 4, 5, 2]

This will run in O(Mlog(N)+N) time where `M=len(y), N=len(x)

(For python2 do from __future__ import division or use float(x1+x2)/2 in the mid_points calculation)

like image 163
kuppern87 Avatar answered Oct 24 '22 07:10

kuppern87


You can do this quickly with a lambda function and list comprehension:

[min(x, key=lambda x:abs(x-a)) for a in y]

This will work with floats, integers, etc.

like image 30
dpwilson Avatar answered Oct 24 '22 05:10

dpwilson