I want to interpolate one axis of data inside a 3-dimensional array. The given x-values differ slightly between the individual slices, but they should all be mapped onto the same common x-values.
Since the given x-values are not identical, currently I do the following:
import numpy as np
from scipy import interpolate
axes_have = np.ones( ( 2, 72, 2001 ) )
axes_have *= np.linspace( 0, 100, 2001 )[None,None,:]
axes_have += np.linspace( -0.3, 0.3, 144 ).reshape( ( 2, 72 ) )[:,:,None]
arr = np.sin( axes_have )
arr *= np.random.random( (2, 72 ) )[:,:,None]
axis_want = np.linspace( 0, 100, 201 )
arr_ip = np.zeros( ( 2, 72, 201 ) )
for i in range( arr.shape[0] ):
    for j in range( arr.shape[1] ):
        ip_func = interpolate.PchipInterpolator( axes_have[i,j,:], arr[i,j,:], extrapolate=True )
        arr_ip[i,j,:] = ip_func( axis_want )
Using two nested for-loops is unsurprisingly very slow.
Is there a way to improve the speed, maybe by doing some NumPy array magic or parallelization?
I've done some initial testing and it seems as though the vast bulk of your time is spent generating the actual interpolation functions. It doesn't seem like just vectorization will speed it up a ton, but it does make it easy to parallelize (which does yield speed increases). Here's an example.
import numpy as np
from scipy import interpolate
import timeit
import multiprocessing
def myfunc(arg):
    x, y = arg
    return interpolate.PchipInterpolator(x, y, extrapolate=True)
p = multiprocessing.Pool(processes=8)
axes_have = np.ones((2, 72, 2001))
axes_have *= np.linspace(0, 100, 2001)[None, None, :]
axes_have += np.linspace(-0.3, 0.3, 144).reshape((2, 72))[:, :, None]
arr = np.sin(axes_have)
arr *= np.random.random((2, 72))[:, :, None]
axis_want = np.linspace(0, 100, 201)
arr_ip = np.zeros((2, 72, 201))
s_time1 = timeit.default_timer()
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        ip_func = interpolate.PchipInterpolator(axes_have[i, j, :],
                                                arr[i, j, :],
                                                extrapolate=True)
        arr_ip[i, j, :] = ip_func(axis_want)
elapsed1 = timeit.default_timer() - s_time1
s_time2 = timeit.default_timer()
flatten_have = [y for x in axes_have for y in x]
flatten_arr = [y for x in arr for y in x]
interp_time = timeit.default_timer()
funcs = p.map(myfunc, zip(flatten_have, flatten_arr))
interp_elapsed = timeit.default_timer() - interp_time
arr_ip = np.asarray([func(axis_want) for func in funcs]).reshape(2, 72, 201)
elapsed2 = timeit.default_timer() - s_time2
print("Elapsed 1: {}".format(elapsed1))
print("Elapsed 2: {}".format(elapsed2))
print("Elapsed interp: {}".format(interp_elapsed))
Typical results (note that without the parallelization the vectorized implementation runs at essentially the same speed as the original loop, and that I have 2 cores, so your runtime should be roughly original time / number of cores):
Elapsed 1: 10.4025919437
Elapsed 2: 5.11409401894
Elapsed interp: 5.10987687111
Don't get me wrong, there may be an algorithmic way to do this more quickly, but this seems like the easiest way to yield an immediate speedup, since the problem is embarrassingly parallel.
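One further tweak along the same lines, as a sketch rather than something I've benchmarked: the evaluation step is just as independent per slice as the fitting step, so you can push it into the workers too and only ship the small interpolated arrays back to the parent process instead of the interpolator objects. The helper name fit_and_eval below is mine, and axis_want is assumed to be the same target grid as above.
import numpy as np
from scipy import interpolate
import multiprocessing

def fit_and_eval(arg):
    # Build the PCHIP interpolant for one slice and evaluate it immediately,
    # so only the small (201,) result array is returned to the parent process.
    x, y, x_new = arg
    return interpolate.PchipInterpolator(x, y, extrapolate=True)(x_new)

if __name__ == "__main__":
    axes_have = np.ones((2, 72, 2001))
    axes_have *= np.linspace(0, 100, 2001)[None, None, :]
    axes_have += np.linspace(-0.3, 0.3, 144).reshape((2, 72))[:, :, None]
    arr = np.sin(axes_have) * np.random.random((2, 72))[:, :, None]
    axis_want = np.linspace(0, 100, 201)

    # Flatten the first two axes so each task is one (x, y) pair plus the target grid.
    tasks = [(x, y, axis_want)
             for x, y in zip(axes_have.reshape(-1, 2001), arr.reshape(-1, 2001))]

    with multiprocessing.Pool(processes=8) as p:
        arr_ip = np.asarray(p.map(fit_and_eval, tasks)).reshape(2, 72, 201)
Whether this buys you much over the version above will depend on how expensive pickling the interpolator objects is on your machine, so treat it as a variant to try, not a guaranteed win.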