Let's say I have a NumPy array:
x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])
At each index, I want to find the distance to nearest zero value. If the position is a zero itself then return zero as a distance. Afterward, we are only interested in distances to the nearest zero that is to the right of the current position. The super naive approach would be something like:
out = np.full(x.shape[0], x.shape[0]-1)
for i in range(x.shape[0]):
j = 0
while i + j < x.shape[0]:
if x[i+j] == 0:
break
j += 1
out[i] = j
And the output would be:
array([0, 2, 1, 0, 4, 3, 2, 1, 0, 0])
I'm noticing a countdown/decrement pattern in the output in between the zeros. So, I might be able to do use the locations of the zeros (i.e., zero_indices = np.argwhere(x == 0).flatten()
)
What is the fastest way to get the desired output in linear time?
Use the NumPy Module to Find the Euclidean Distance Between Two Points. The numpy module can be used to find the required distance when the coordinates are in the form of an array. It has the norm() function, which can return the vector norm of an array.
Approach #1 : Searchsorted
to the rescue for linear-time in a vectorized manner (before numba guys come in)!
mask_z = x==0
idx_z = np.flatnonzero(mask_z)
idx_nz = np.flatnonzero(~mask_z)
# Cover for the case when there's no 0 left to the right
# (for same results as with posted loop-based solution)
if x[-1]!=0:
idx_z = np.r_[idx_z,len(x)]
out = np.zeros(len(x), dtype=int)
idx = np.searchsorted(idx_z, idx_nz)
out[~mask_z] = idx_z[idx] - idx_nz
Approach #2 : Another with some cumsum
-
mask_z = x==0
idx_z = np.flatnonzero(mask_z)
# Cover for the case when there's no 0 left to the right
if x[-1]!=0:
idx_z = np.r_[idx_z,len(x)]
out = idx_z[np.r_[False,mask_z[:-1]].cumsum()] - np.arange(len(x))
Alternatively, last step of cumsum
could be replaced by repeat
functionality -
r = np.r_[idx_z[0]+1,np.diff(idx_z)]
out = np.repeat(idx_z,r)[:len(x)] - np.arange(len(x))
Approach #3 : Another with mostly just cumsum
-
mask_z = x==0
idx_z = np.flatnonzero(mask_z)
pp = np.full(len(x), -1)
pp[idx_z[:-1]] = np.diff(idx_z) - 1
if idx_z[0]==0:
pp[0] = idx_z[1]
else:
pp[0] = idx_z[0]
out = pp.cumsum()
# Handle boundary case and assigns 0s at original 0s places
out[idx_z[-1]:] = np.arange(len(x)-idx_z[-1],0,-1)
out[mask_z] = 0
You could work from the other side. Keep a counter on how many non zero digits have passed and assign it to the element in the array. If you see 0, reset the counter to 0
Edit: if there is no zero on the right, then you need another check
x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])
out = x
count = 0
hasZero = False
for i in range(x.shape[0]-1,-1,-1):
if out[i] != 0:
if not hasZero:
out[i] = x.shape[0]-1
else:
count += 1
out[i] = count
else:
hasZero = True
count = 0
print(out)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With