I have a NumPy array containing zeros, like this:
a = np.array([3., 0., 2., 3., 0., 3., 3., 3., 0., 3., 3., 0., 3., 0., 0., 0., 0.,
3., 3., 0., 3., 3., 0., 3., 0., 3., 0., 0., 0., 3., 0., 3., 3., 0.,
3., 3., 0., 0., 3., 0., 0., 0., 3., 0., 3., 3., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 4., 3., 0., 3., 3., 3., 3., 3., 3., 3., 0.,
0., 0., 0., 3., 0., 0., 3., 0., 0., 0., 3., 3., 3., 3., 3., 3., 3.,
3., 0., 3., 3., 3., 3., 3., 0., 3., 3., 3., 3., 0., 0., 0., 3., 3.,
3., 0., 3., 3., 3., 5., 3., 3., 3., 3., 3., 3., 3., 0., 3., 0., 3.,
3., 0., 0., 0., 3., 3., 3., 3., 0., 3., 3., 3., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 0., 3., 3., 3., 3., 3., 3., 0., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 0., 3., 0., 3.,
3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 0., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 3., 3., 0., 3., 3., 0., 0., 3., 0., 0., 3.,
0., 3., 3., 0., 3., 3., 0., 0., 3., 3., 3., 3., 3., 3., 3., 0., 3.,
3., 3., 3., 3.])
I need to replace zeros with the previous value (forward fill), but only under a condition: if the number of zeros between two non-zero numbers is less than or equal to 2, the zeros should be forward filled.
For example:
1) For 3., 0., 2., the number of zeros between the non-zero numbers is 1, so the zero should be filled with 3.
2) For 3., 0., 0., 0., 0., 3., 3., the number of zeros between the 3s is greater than 2, so the values should be kept as they are.
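To make the rule concrete, here is a minimal plain-Python sketch of it, checked against the two examples above. The names `fill_short_gaps` and `max_gap` are made up for this illustration and do not come from either answer; the loop is meant as a slow but easy-to-read reference, assuming that leading and trailing zeros (which are not between two non-zero values) are left untouched.

```python
import numpy as np

def fill_short_gaps(a, max_gap):
    """Forward fill runs of at most `max_gap` zeros that lie between two non-zero values."""
    a = np.asarray(a)
    out = a.copy()
    run = 0        # length of the current run of zeros
    prev = None    # index of the last non-zero value seen so far
    for i in range(len(a)):
        if a[i] == 0:
            run += 1
            continue
        # a[i] is non-zero: fill the preceding run if it is bounded and short enough
        if prev is not None and 0 < run <= max_gap:
            out[i - run:i] = a[prev]
        run = 0
        prev = i
    return out

print(fill_short_gaps([3., 0., 2.], max_gap=2))                  # [3. 3. 2.]
print(fill_short_gaps([3., 0., 0., 0., 0., 3., 3.], max_gap=2))  # [3. 0. 0. 0. 0. 3. 3.]
```
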
In cases like this, where coming up with a purely vectorised approach is not trivial (to say the least), we can use numba to compile the loop to machine code. Here's one way using numba's nopython mode. Runs of fewer than w zeros are filled, so the question's "at most 2 zeros" corresponds to w=3:
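import numpy as np
import numba

# The signature assumes an int64 array; change accordingly, e.g.
# 'float64[:](float64[:],uintc)' for a float array like the one in the question
@numba.njit('int64[:](int64[:],uintc)')
def conditional_ffill(a, w):
    c = 0                      # length of the current run of zeros
    last_non_zero = a[0]
    out = np.copy(a)
    for i in range(len(a)):
        if a[i] == 0:
            c += 1
        else:
            # fill the preceding run only if it is shorter than w
            if c > 0 and c < w:
                out[i-c:i] = last_non_zero
            # reset on every non-zero value, not only after a fill
            c = 0
            last_non_zero = a[i]
    return out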
Checking on Divakar's test array:
a = np.array([2, 0, 3, 0, 0, 4, 0, 0, 0, 5, 0])
conditional_ffill(a, w=1)
# array([2, 0, 3, 0, 0, 4, 0, 0, 0, 5, 0])
conditional_ffill(a, w=2)
# array([2, 2, 3, 0, 0, 4, 0, 0, 0, 5, 0])
conditional_ffill(a, w=3)
# array([2, 2, 3, 3, 3, 4, 0, 0, 0, 5, 0])
conditional_ffill(a, w=4)
# array([2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 0])
Timings on a larger array, compared with the NumPy-based ffill_windowed defined further down:
a_large = np.tile(a, 10000)
%timeit ffill_windowed(a_large, 3)
# 1.39 ms ± 68.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit conditional_ffill(a_large, 3)
# 150 µs ± 862 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Here's one approach that takes the forward-fill window as a parameter to handle generic cases -
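import numpy as np

# https://stackoverflow.com/a/33893692/ @Divakar
def numpy_binary_closing(mask, W):
    # Define kernel
    K = np.ones(W)
    # Perform dilation and threshold at 1
    dil = np.convolve(mask, K) >= 1
    # Perform erosion on the dilated mask array and threshold at W
    dil_erd = np.convolve(dil, K) >= W
    # Trim the padding added by the two full convolutions; the explicit end
    # index (instead of -W+1) keeps the slice non-empty when W == 1
    return dil_erd[W-1:len(dil_erd)-W+1]

def ffill_windowed(a, W):
    # Non-zero positions
    mask = a != 0
    # Closed mask: also True on zero runs shorter than W between non-zeros
    mask_ext = numpy_binary_closing(mask, W)
    # Positions to fill
    p = mask_ext & ~mask
    # Index of the most recent non-zero at each position
    idx = np.maximum.accumulate(mask*np.arange(len(mask)))
    out = a.copy()
    out[p] = out[idx[p]]
    return out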
Explanation: The first part performs a binary closing, an operation that is well explored in the image-processing domain. In our case, we start off with a mask of the non-zeros and close it based on the window parameter, which additionally marks every run of fewer than W zeros that has a non-zero on both sides. We then get, for every position, the index of the most recent non-zero value by computing forward-filled indices, explored in this post. Finally, we put in new values at the positions that the closed mask added. That's all there is!
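To see what each step of the explanation produces, the intermediate arrays can be printed for the sample array with W=3. This is only an illustrative walk-through, not a replacement for the functions above; the variable names `closed` and `to_fill` are chosen here for readability, and the masks are cast to integers before convolving as a precaution.

```python
import numpy as np

a = np.array([2, 0, 3, 0, 0, 4, 0, 0, 0, 5, 0])
W = 3
K = np.ones(W, dtype=int)

# Step 1: mask of non-zeros
mask = a != 0

# Step 2: binary closing = dilation followed by erosion with a length-W kernel
dil = np.convolve(mask.astype(int), K) >= 1
ero = np.convolve(dil.astype(int), K) >= W
closed = ero[W - 1:len(ero) - W + 1]   # trim back to len(a)

# Step 3: positions added by the closing are the zeros to fill
to_fill = closed & ~mask

# Step 4: forward-filled indices of the most recent non-zero value
idx = np.maximum.accumulate(mask * np.arange(len(a)))

# Step 5: copy the values over
out = a.copy()
out[to_fill] = a[idx[to_fill]]

print(mask.astype(int))    # [1 0 1 0 0 1 0 0 0 1 0]
print(closed.astype(int))  # [1 1 1 1 1 1 0 0 0 1 0]
print(idx)                 # [0 0 2 2 2 5 5 5 5 9 9]
print(out)                 # [2 2 3 3 3 4 0 0 0 5 0]
```

The run of three zeros and the trailing zero stay False in `closed`, which is why they are left untouched.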
Sample runs -
In [142]: a
Out[142]: array([2, 0, 3, 0, 0, 4, 0, 0, 0, 5, 0])
In [143]: ffill_windowed(a, W=2)
Out[143]: array([2, 2, 3, 0, 0, 4, 0, 0, 0, 5, 0])
In [144]: ffill_windowed(a, W=3)
Out[144]: array([2, 2, 3, 3, 3, 4, 0, 0, 0, 5, 0])
In [146]: ffill_windowed(a, W=4)
Out[146]: array([2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 0])