known_array
: numpy array; consisting of scalar values only; shape: (m, 1)
test_array
: numpy array; consisting of scalar values only; shape: (n, 1)
indices
: numpy array; shape: (n, 1)
; For each value in test_array
finds the index of the closest value in known_array
residual
: numpy array; shape: (n, 1)
; For each value in test_array
finds the difference from the closest value in known_array
In [17]: known_array = np.array([random.randint(-30,30) for i in range(5)])
In [18]: known_array
Out[18]: array([-24, -18, -13, -30, 29])
In [19]: test_array = np.array([random.randint(-10,10) for i in range(10)])
In [20]: test_array
Out[20]: array([-6, 4, -6, 4, 8, -4, 8, -6, 2, 8])
def find_nearest(known_array, value):
idx = (np.abs(known_array - value)).argmin()
diff = known_array[idx] - value
return [idx, -diff]
In [22]: indices = np.zeros(len(test_array))
In [23]: residual = np.zeros(len(test_array))
In [24]: for i in range(len(test_array)):
....: [indices[i], residual[i]] = find_nearest(known_array, test_array[i])
....:
In [25]: indices
Out[25]: array([ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.])
In [26]: residual
Out[26]: array([ 7., 17., 7., 17., 21., 9., 21., 7., 15., 21.])
What is the best way to speed up this task? Cython is an option, but, I would always prefer to be able to remove the for
loop and let the code remain are pure NumPy.
NB: Following Stack Overflow questions were consulted
I did some small benchmarks for comparing the non-vectorized and vectorized solution (accepted answer).
In [48]: [indices1, residual1] = find_nearest_vectorized(known_array, test_array)
In [53]: [indices2, residual2] = find_nearest_non_vectorized(known_array, test_array)
In [54]: indices1==indices2
Out[54]: array([ True, True, True, True, True, True, True, True, True, True], dtype=bool)
In [55]: residual1==residual2
Out[55]: array([ True, True, True, True, True, True, True, True, True, True], dtype=bool)
In [56]: %timeit [indices2, residual2] = find_nearest_non_vectorized(known_array, test_array)
10000 loops, best of 3: 173 µs per loop
In [57]: %timeit [indices1, residual1] = find_nearest_vectorized(known_array, test_array)
100000 loops, best of 3: 16.8 µs per loop
About a 10-fold speedup!
known_array
is not sorted.
I ran the benchmarks as given in the answer by @cyborg below.
Case 1: If known_array
were sorted
known_array = np.arange(0,1000)
test_array = np.random.randint(0, 100, 10000)
print('Speedups:')
base_time = time_f('base')
for func_name in ['diffs', 'searchsorted1', 'searchsorted2']:
print func_name + ' is x%.1f faster than base.' % (base_time / time_f(func_name))
assert np.allclose(base(known_array, test_array), eval(func_name+'(known_array, test_array)'))
Speedups:
diffs is x0.4 faster than base.
searchsorted1 is x81.3 faster than base.
searchsorted2 is x107.6 faster than base.
Firstly, for large arrays diffs
method is actually slower, it also eats up a lot of RAM and my system hanged when I ran it on actual data.
Case 2 : When known_array
is not sorted; which represents actual scenario
known_array = np.random.randint(0,100,100)
test_array = np.random.randint(0, 100, 100)
Speedups:
diffs is x8.9 faster than base.
AssertionError Traceback (most recent call last)
<ipython-input-26-3170078c217a> in <module>()
5 for func_name in ['diffs', 'searchsorted1', 'searchsorted2']:
6 print func_name + ' is x%.1f faster than base.' % (base_time / time_f(func_name))
----> 7 assert np.allclose(base(known_array, test_array), eval(func_name+'(known_array, test_array)'))
AssertionError:
searchsorted1 is x14.8 faster than base.
I must also comment that the approach should also be memory efficient. Otherwise my 8 GB of RAM is not sufficient. In the base case, it is easily sufficient.
If the array is large, you should use searchsorted
:
import numpy as np
np.random.seed(0)
known_array = np.random.rand(1000)
test_array = np.random.rand(400)
%%time
differences = (test_array.reshape(1,-1) - known_array.reshape(-1,1))
indices = np.abs(differences).argmin(axis=0)
residual = np.diagonal(differences[indices,])
output:
CPU times: user 11 ms, sys: 15 ms, total: 26 ms
Wall time: 26.4 ms
searchsorted
version:
%%time
index_sorted = np.argsort(known_array)
known_array_sorted = known_array[index_sorted]
idx1 = np.searchsorted(known_array_sorted, test_array)
idx2 = np.clip(idx1 - 1, 0, len(known_array_sorted)-1)
diff1 = known_array_sorted[idx1] - test_array
diff2 = test_array - known_array_sorted[idx2]
indices2 = index_sorted[np.where(diff1 <= diff2, idx1, idx2)]
residual2 = test_array - known_array[indices]
output:
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 311 µs
We can check that the results is the same:
assert np.all(residual == residual2)
assert np.all(indices == indices2)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With