numpy.interp is very convenient and relatively fast. In certain contexts I'd like to compare its output against a non-interpolated variant where the sparse values are propagated (in the "denser" output) and the result is piecewise constant between the sparse inputs. The function I want could also be called a "sparse -> dense" converter that copies the latest sparse value until it finds a later value (a kind of null interpolation, as if zero time/distance had elapsed since the earlier value).
Unfortunately, it's not easy to tweak the source for numpy.interp because it's just a wrapper around a compiled function. I can write this myself using Python loops, but I hope to find a C-speed way to solve the problem.
Update: the solution below (scipy.interpolate.interp1d with kind='zero') is quite slow, taking more than 10 seconds per call (e.g. for an input 500k in length that's 50% populated). It implements kind='zero' using a zero-order spline, and the call to spleval is very slow. However, the source code for kind='linear' (i.e. the default interpolation) gives an excellent template for solving the problem in plain numpy (the minimal change is to set slope=0). That code shows how to use numpy.searchsorted, and the runtime is similar to calling numpy.interp. So the problem is solved by tweaking the scipy.interpolate.interp1d implementation of linear interpolation to skip the interpolation step (slope != 0 blends the adjacent values).
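To make the slope=0 idea concrete, here is a minimal sketch in plain numpy. It is not the scipy source; the function name interp_or_hold and the hold flag are made up for illustration, and it assumes xp is sorted in ascending order with at least two points.

```python
import numpy as np

def interp_or_hold(x, xp, fp, hold=False):
    """Linear interpolation via searchsorted; hold=True keeps the slope at 0.

    Hypothetical helper for illustration, assuming xp is sorted ascending.
    """
    xp = np.asarray(xp, dtype=float)
    fp = np.asarray(fp, dtype=float)
    # Clamp queries to the data range, mirroring np.interp's default edges.
    x = np.clip(np.atleast_1d(x).astype(float), xp[0], xp[-1])
    n = len(xp)
    # lo is the index of the latest sample at or before each query point.
    lo = np.clip(np.searchsorted(xp, x, side='right') - 1, 0, n - 1)
    hi = np.minimum(lo + 1, n - 1)
    dx = xp[hi] - xp[lo]
    slope = np.zeros_like(x)
    if not hold:
        # Leave the slope at 0 where lo == hi (queries at or past the last sample).
        np.divide(fp[hi] - fp[lo], dx, out=slope, where=dx > 0)
    return fp[lo] + slope * (x - xp[lo])
```

With hold=False this should agree with numpy.interp on in-range and out-of-range queries; with hold=True each output is simply the value of the latest sample at or before the query point.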
The numpy.interp() function returns the one-dimensional piecewise linear interpolant to a function with given discrete data points (xp, fp), evaluated at x. Parameters: x (array_like): the x-coordinates at which to evaluate the interpolated values.
With numpy.interp it is possible to obtain a good approximation from the original data. Note also that this function does not extrapolate beyond the data limits by default; queries outside the range get the nearest end value.
Interpolation is a method for generating points between given points. For example, between the values 1 and 2 we may interpolate and find the points 1.33 and 1.66. Interpolation has many uses; in machine learning we often deal with missing data in a dataset, and interpolation is often used to fill in those values.
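A short sketch of both points, using only numpy.interp and its documented left/right arguments. The sample arrays are made-up illustration data.

```python
import numpy as np

xp = [0.0, 1.0]   # known sample positions (illustrative)
fp = [1.0, 2.0]   # known values at those positions

# Two evenly spaced query points strictly between the samples.
x_between = np.array([1 / 3, 2 / 3])
between = np.interp(x_between, xp, fp)   # roughly 1.33 and 1.67

# Queries outside [xp[0], xp[-1]] are clamped to the end values by default.
outside = np.interp([-5.0, 5.0], xp, fp)

# The optional left/right arguments replace that default fill value.
flagged = np.interp([-5.0, 5.0], xp, fp, left=np.nan, right=np.nan)
```

Passing left=np.nan and right=np.nan is one way to make out-of-range queries stand out instead of silently taking the end values.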
scipy.interpolate.interp1d can do all kinds of interpolation: 'linear', 'nearest', 'zero', 'slinear', 'quadratic', 'cubic'.
Please check the documentation: http://docs.scipy.org/doc/scipy-0.10.1/reference/generated/scipy.interpolate.interp1d.html#scipy.interpolate.interp1d
Just for completeness: the solution to the question is the following code, which I was able to write with the help of the hints given in the updated answer:
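    import numpy as np

    def interpolate_constant(x, xp, yp):
        # For each x, count the xp entries <= x (xp must be sorted ascending).
        indices = np.searchsorted(xp, x, side='right')
        # Pad with 0 so that x values before xp[0] map to 0.
        y = np.concatenate(([0], yp))
        return y[indices]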
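One thing to be aware of: the function above returns 0 for any x that comes before the first sample. Below is a hedged variant with a usage example; the name hold_last_value and the fill parameter are my own additions for illustration, not part of the original answer, and xp is assumed to be sorted in ascending order.

```python
import numpy as np

def hold_last_value(x, xp, yp, fill=0.0):
    """Piecewise-constant lookup of yp at x.

    `fill` is a hypothetical parameter: the value returned before xp[0].
    """
    # Number of xp entries <= each x; 0 means "before the first sample".
    indices = np.searchsorted(xp, x, side='right')
    # Slot 0 of the padded array holds the fill value.
    padded = np.concatenate(([fill], yp))
    return padded[indices]

xp = np.array([1.0, 4.0, 6.0])       # sparse sample positions (illustrative)
yp = np.array([10.0, 40.0, 60.0])    # sparse sample values
x = np.arange(0.0, 8.0)              # dense grid 0, 1, ..., 7

held = hold_last_value(x, xp, yp, fill=np.nan)
linear = np.interp(x, xp, yp)
```

Here held carries 10 forward over x = 1..3, 40 over x = 4..5 and 60 from x = 6 on, with NaN at x = 0, while linear ramps between the same samples, so the two arrays can be compared element by element.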