Replace zeros in a numpy array with linear interpolation between the preceding and succeeding values

Question

Assuming we have an array a = np.array([1,2,0,4,0,5,0,0,11]), how can we get:

array([ 1,  2,  3,  4,  4.5,  5,  7,  9, 11])

What I have tried is:

import numpy as np
from scipy.interpolate import interp1d

a = np.array([1,2,0,4,0,5,0,0,11])
b = a[np.nonzero(a)]
brange = np.arange(b.shape[0])
interp = interp1d(brange, b)

This seems to do the actual job of finding in-between values. For instance:

print (interp(1), interp(1.5), interp(2), interp(2.5), interp(3))
#out: 2.0 3.0 4.0 4.5 5.0

But I can't figure out how to reconstruct the full array, with the zeros filled in, from interp. I also tried the solution to this question, but I had exactly the same problem with it.

UPDATE:

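I ran a quick benchmark of both solutions, one using numpy/scipy and one using pandas. Here are the results:

import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

y = np.array([1,2,0,4,0,5,0,0,11])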

def test1(y):

    x = np.arange(len(y))
    idx = np.nonzero(y)
    interp = interp1d(x[idx],y[idx])

    return interp(x)

def test2(y):
    s = pd.Series(y)
    s.interpolate(inplace=True)
    return s.values

%timeit t1 = test1(y)
%timeit t2 = test2(y)

139 µs ± 1.62 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
158 µs ± 2.01 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

The numpy/scipy version (test1) is about 12% faster. Not as good as I hoped, but since the code is going to be run several million times, it is probably worth the effort.
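
One caveat about test2: Series.interpolate fills only missing values (NaN), and the array above contains zeros rather than NaN, so the zeros would have to be masked first for the pandas version to produce the interpolated result. A minimal sketch, assuming zeros always mark the missing entries (the function name fill_zeros_pandas is hypothetical and was not part of the benchmark):

```python
import numpy as np
import pandas as pd

def fill_zeros_pandas(y):
    # Convert to float so the series can hold NaN.
    s = pd.Series(y, dtype=float)
    # Turn zeros into NaN, then let pandas fill them linearly
    # (the default method treats the samples as equally spaced).
    return s.mask(s == 0).interpolate().to_numpy()

result = fill_zeros_pandas(np.array([1, 2, 0, 4, 0, 5, 0, 0, 11]))
print(result)
```

The extra masking step adds work, so the timings above may not carry over to this variant.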

Thomas Kühn · Accepted Answer

You need to give interp1d a y-array without the zeros and an x-array that contains only the positions of the nonzero values. Then evaluate the resulting interpolation function on an x-array that holds all the original x-values, including the positions where you want the interpolated values. In your case the samples are equally spaced, so you can use np.arange to produce the x-values and np.where to locate the zeros.

Here is an example:

import numpy as np
from scipy.interpolate import interp1d

y = np.array([1,2,0,4,0,5,0,0,11])
xnew = np.arange(len(y))

zero_idx = np.where(y==0)
xold = np.delete(xnew,zero_idx)
yold = np.delete(y, zero_idx)

print('before')
print(xold)
print(yold)

f = interp1d(xold,yold)

ynew = f(xnew)

print()
print('after')
print(xnew)
print(ynew)

The result looks like this:

before
[0 1 3 5 8]
[ 1  2  4  5 11]

after
[0 1 2 3 4 5 6 7 8]
[  1.    2.    3.    4.    4.5   5.    7.    9.   11. ]
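
Note that this works here because the first and last values are nonzero. With default settings, interp1d raises a ValueError when asked for a point outside the range of the x-values it was built from, which is what happens if the array starts or ends with a zero. A sketch of one way to handle that, assuming it is acceptable to repeat the nearest nonzero value at the edges (the input array and variable names here are made up for illustration):

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical input with zeros at both ends.
y_edges = np.array([0, 2, 0, 4, 0], dtype=float)
x_edges = np.arange(len(y_edges))
idx = np.nonzero(y_edges)

# bounds_error=False disables the ValueError; the fill_value tuple gives
# the values used below and above the interpolation range.
f_edges = interp1d(
    x_edges[idx],
    y_edges[idx],
    bounds_error=False,
    fill_value=(y_edges[idx][0], y_edges[idx][-1]),
)
ynew_edges = f_edges(x_edges)
print(ynew_edges)
```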

EDIT:

Actually, you don't need np.delete; you can just index with the positions of the nonzero values:

y = np.array([1,2,0,4,0,5,0,0,11])
x = np.arange(len(y))
idx = np.where(y!=0)        #or np.nonzero(y) -- thanks DanielF
f = interp1d(x[idx],y[idx])
ynew = f(x)
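
Since speed matters in your case, it may also be worth trying np.interp, which does one-dimensional linear interpolation without scipy. This is an alternative to the interp1d approach above, not the same method, and I have not benchmarked it. A minimal sketch, assuming zeros always mark the missing entries (the function name fill_zeros_linear is hypothetical):

```python
import numpy as np

def fill_zeros_linear(y):
    y = np.asarray(y, dtype=float)
    x = np.arange(len(y))
    nonzero = y != 0  # boolean mask of the known samples
    # np.interp(x, xp, fp): evaluate at every position x the piecewise-linear
    # function defined by the known points (xp, fp).
    return np.interp(x, x[nonzero], y[nonzero])

filled = fill_zeros_linear([1, 2, 0, 4, 0, 5, 0, 0, 11])
print(filled)
```

Unlike interp1d with default settings, np.interp does not raise for leading or trailing zeros; it repeats the first or last known value there.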

Tags: python, benchmarking, numpy, scipy, interpolation

Alz

1 Answers

Thomas Kühn
