Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find Distance to Nearest Zero in NumPy Array

Tags:

python

numpy

Let's say I have a NumPy array:

x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])

At each index, I want to find the distance to nearest zero value. If the position is a zero itself then return zero as a distance. Afterward, we are only interested in distances to the nearest zero that is to the right of the current position. The super naive approach would be something like:

out = np.full(x.shape[0], x.shape[0]-1)
for i in range(x.shape[0]):
    j = 0
    while i + j < x.shape[0]:
        if x[i+j] == 0:
            break
        j += 1
    out[i] = j

And the output would be:

array([0, 2, 1, 0, 4, 3, 2, 1, 0, 0])

I'm noticing a countdown/decrement pattern in the output in between the zeros. So, I might be able to do use the locations of the zeros (i.e., zero_indices = np.argwhere(x == 0).flatten())

What is the fastest way to get the desired output in linear time?

like image 986
slaw Avatar asked Mar 05 '20 21:03

slaw


People also ask

How do you find the distance between two numpy arrays?

Use the NumPy Module to Find the Euclidean Distance Between Two Points. The numpy module can be used to find the required distance when the coordinates are in the form of an array. It has the norm() function, which can return the vector norm of an array.


2 Answers

Approach #1 : Searchsorted to the rescue for linear-time in a vectorized manner (before numba guys come in)!

mask_z = x==0
idx_z = np.flatnonzero(mask_z)
idx_nz = np.flatnonzero(~mask_z)

# Cover for the case when there's no 0 left to the right
# (for same results as with posted loop-based solution)
if x[-1]!=0:
    idx_z = np.r_[idx_z,len(x)]

out = np.zeros(len(x), dtype=int)
idx = np.searchsorted(idx_z, idx_nz)
out[~mask_z] = idx_z[idx] - idx_nz

Approach #2 : Another with some cumsum -

mask_z = x==0
idx_z = np.flatnonzero(mask_z)

# Cover for the case when there's no 0 left to the right
if x[-1]!=0:
    idx_z = np.r_[idx_z,len(x)]

out = idx_z[np.r_[False,mask_z[:-1]].cumsum()] - np.arange(len(x))

Alternatively, last step of cumsum could be replaced by repeat functionality -

r = np.r_[idx_z[0]+1,np.diff(idx_z)]
out = np.repeat(idx_z,r)[:len(x)] - np.arange(len(x))

Approach #3 : Another with mostly just cumsum -

mask_z = x==0
idx_z = np.flatnonzero(mask_z)

pp = np.full(len(x), -1)
pp[idx_z[:-1]] = np.diff(idx_z) - 1
if idx_z[0]==0:
    pp[0] = idx_z[1]
else:
    pp[0] = idx_z[0]
out = pp.cumsum()

# Handle boundary case and assigns 0s at original 0s places
out[idx_z[-1]:] = np.arange(len(x)-idx_z[-1],0,-1)
out[mask_z] = 0
like image 179
Divakar Avatar answered Nov 23 '22 08:11

Divakar


You could work from the other side. Keep a counter on how many non zero digits have passed and assign it to the element in the array. If you see 0, reset the counter to 0

Edit: if there is no zero on the right, then you need another check

x = np.array([0, 1, 2, 0, 4, 5, 6, 7, 0, 0])
out = x 
count = 0 
hasZero = False 
for i in range(x.shape[0]-1,-1,-1):
    if out[i] != 0:
        if not hasZero: 
            out[i] = x.shape[0]-1
        else:
            count += 1
            out[i] = count
    else:
        hasZero = True
        count = 0
print(out)
like image 20
MT756 Avatar answered Nov 23 '22 08:11

MT756