Suppose I have a numpy array of the form:
arr=numpy.array([[1,1,0],[1,1,0],[0,0,1],[0,0,0]])
I want to find, for every column, the index of the first row where the value is non-zero.
So in this instance, I would like the following to be returned:
[0,0,2]
How do I go about this?
One approach uses the nonzero() function, which computes the indices of all the non-zero elements in a NumPy array. It returns a tuple of arrays, one for each dimension of arr, containing the indices of the non-zero elements in that dimension. The corresponding non-zero values can be obtained with arr[nonzero(arr)].
Array elements are accessed by index, and NumPy indices start at 0: the first element has index 0, the second has index 1, and so on.
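As a small illustration of the tuple that nonzero() returns, here is a sketch on the array from the question. The np.unique step that picks the first row per column is my own addition rather than part of the quoted description, and it silently skips columns that contain no non-zero value.

```python
import numpy as np

arr = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0]])

# np.nonzero returns one index array per dimension (rows, then columns),
# listing the non-zero positions in row-major order.
rows, cols = np.nonzero(arr)
print(rows)  # [0 0 1 1 2]
print(cols)  # [0 1 0 1 2]

# Because of the row-major order, the first occurrence of each column
# index carries the smallest row index for that column.
# Assumption: every column of interest has at least one non-zero value.
unique_cols, first_pos = np.unique(cols, return_index=True)
first_rows = rows[first_pos]
print(first_rows)  # [0 0 2]
```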
Use np.argmax along that axis (the zeroth axis for columns here) on the mask of non-zeros to get the indices of the first matches (True values):

    (arr!=0).argmax(axis=0)
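To make the two steps visible, the expression can be split into the boolean mask and the argmax call. This is only a sketch on the array from the question; the variable names mask and first are chosen for illustration.

```python
import numpy as np

arr = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0]])

# Step 1: boolean mask that is True wherever the value is non-zero.
mask = arr != 0

# Step 2: argmax returns the position of the first maximum along the axis.
# For booleans the maximum is True, so this is the first True per column.
first = mask.argmax(axis=0)
print(first)  # [0 0 2]
```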
To support a generic axis argument and to handle slices that contain no non-zero element along that axis, an implementation could look like this:

    import numpy as np

    def first_nonzero(arr, axis, invalid_val=-1):
        mask = arr!=0
        # Where a slice has no non-zero element, report invalid_val instead
        return np.where(mask.any(axis=axis), mask.argmax(axis=axis), invalid_val)
Note that argmax() on all-False values returns 0, so if the invalid_val needed is 0, mask.argmax(axis=axis) already gives the final output directly.
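The reason the any() check matters can be seen with two 1-D inputs that argmax alone cannot tell apart. This is a minimal sketch; the two arrays are made up for illustration.

```python
import numpy as np

all_zero = np.array([0, 0, 0])    # no non-zero element at all
hit_at_zero = np.array([5, 0, 0])  # first non-zero element at index 0

# argmax on an all-False mask falls back to index 0 ...
idx_all_zero = (all_zero != 0).argmax()
# ... which is the same answer as a genuine match at index 0.
idx_hit = (hit_at_zero != 0).argmax()
print(idx_all_zero, idx_hit)  # 0 0

# any() tells the two cases apart.
print((all_zero != 0).any(), (hit_at_zero != 0).any())  # False True
```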
Sample runs -
    In [296]: arr  # Different from given sample for variety
    Out[296]:
    array([[1, 0, 0],
           [1, 1, 0],
           [0, 1, 0],
           [0, 0, 0]])

    In [297]: first_nonzero(arr, axis=0, invalid_val=-1)
    Out[297]: array([ 0,  1, -1])

    In [298]: first_nonzero(arr, axis=1, invalid_val=-1)
    Out[298]: array([ 0,  0,  1, -1])
Extending to cover all comparison operations
To find the first zeros, simply use arr==0 as mask in the function. For the first elements equal to a certain value val, use arr == val, and so on for all the comparisons possible here.
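One way to avoid editing the function for each comparison is to pass the mask in directly. The helper below, first_true, is a hypothetical variant written for illustration and is not part of the answer above; it assumes the caller builds the boolean mask.

```python
import numpy as np

def first_true(mask, axis, invalid_val=-1):
    """Index of the first True along `axis`, or invalid_val if there is none."""
    mask = np.asarray(mask, dtype=bool)
    return np.where(mask.any(axis=axis), mask.argmax(axis=axis), invalid_val)

arr = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 0]])

first_zero_per_col = first_true(arr == 0, axis=0)
first_one_per_row = first_true(arr == 1, axis=1)
first_big_per_col = first_true(arr > 5, axis=0)  # nothing exceeds 5

print(first_zero_per_col)  # [2 0 0]
print(first_one_per_row)   # [ 0  0  1 -1]
print(first_big_per_col)   # [-1 -1 -1]
```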
To find the last elements matching a certain comparison criterion, we flip along that axis, apply the same argmax idea, and then compensate for the flipping by offsetting from the axis length, as shown below:

    def last_nonzero(arr, axis, invalid_val=-1):
        mask = arr!=0
        # Position in the flipped mask, converted back to the original index
        val = arr.shape[axis] - np.flip(mask, axis=axis).argmax(axis=axis) - 1
        return np.where(mask.any(axis=axis), val, invalid_val)
Sample runs -
    In [320]: arr
    Out[320]:
    array([[1, 0, 0],
           [1, 1, 0],
           [0, 1, 0],
           [0, 0, 0]])

    In [321]: last_nonzero(arr, axis=0, invalid_val=-1)
    Out[321]: array([ 1,  2, -1])

    In [322]: last_nonzero(arr, axis=1, invalid_val=-1)
    Out[322]: array([ 0,  1,  1, -1])
Again, all the comparisons possible here are covered by using the corresponding comparator to get mask and then using it within the listed function.
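The same flip-and-offset idea can be sketched with the mask passed in, so that any comparison works. The name last_true and the sample array data are assumptions made for this illustration only.

```python
import numpy as np

def last_true(mask, axis, invalid_val=-1):
    """Index of the last True along `axis`, or invalid_val if there is none."""
    mask = np.asarray(mask, dtype=bool)
    # First True in the flipped mask, mapped back to the original indexing.
    flipped_pos = np.flip(mask, axis=axis).argmax(axis=axis)
    val = mask.shape[axis] - flipped_pos - 1
    return np.where(mask.any(axis=axis), val, invalid_val)

data = np.array([[3, 7, 1], [8, 2, 9], [4, 1, 0]])

last_big_per_col = last_true(data > 5, axis=0)
last_big_per_row = last_true(data > 5, axis=1)

print(last_big_per_col)  # [1 0 1]
print(last_big_per_row)  # [ 1  2 -1]
```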
The problem, although apparently 2-D, can be solved by applying to each row a function that finds the first non-zero element of a 1-D slice (exactly as in the question).
    import numpy as np

    arr = np.array([[1,1,0],[1,1,0],[0,0,1],[0,0,0]])

    def first_nonzero_index(array):
        """Return the index of the first non-zero element of array.
        If all elements are zero, return -1."""
        fnzi = -1  # first non-zero index
        indices = np.flatnonzero(array)
        if (len(indices) > 0):
            fnzi = indices[0]
        return fnzi

    np.apply_along_axis(first_nonzero_index, axis=1, arr=arr)
    # result
    array([ 0,  0,  2, -1])
Explanation
The np.flatnonzero(array) function (as suggested in the comments by Henrik Koberg) returns "indices that are non-zero in the flattened version of array". The function above calculates these indices and returns the first one (or -1 if all elements are zero).
np.apply_along_axis applies a function to 1-D slices along the given axis. Here, since the axis is 1, the function is applied to the rows.
If we can assume that every row of the input array contains at least one non-zero element, the solution can be written in one line:
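For comparison, passing axis=0 applies the same kind of function to the columns instead of the rows. The sketch below repeats a compact version of the helper so that it runs on its own; the compact rewrite is mine, not the answer's.

```python
import numpy as np

def first_nonzero_index(array):
    """Return the index of the first non-zero element, or -1 if there is none."""
    indices = np.flatnonzero(array)
    return indices[0] if len(indices) > 0 else -1

arr = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0]])

# axis=1 hands each row to the function, axis=0 hands each column.
per_row = np.apply_along_axis(first_nonzero_index, axis=1, arr=arr)
per_col = np.apply_along_axis(first_nonzero_index, axis=0, arr=arr)

print(per_row)  # [ 0  0  2 -1]
print(per_col)  # [0 0 2]
```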
np.apply_along_axis(lambda a: np.flatnonzero(a)[0], axis=1, arr=arr)
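The assumption is not optional: on the array from the question the last row is all zeros, so the one-liner raises an IndexError there. A short sketch of that failure, using a flag variable named one_liner_failed that exists only for this illustration:

```python
import numpy as np

arr = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0]])

try:
    # flatnonzero returns an empty array for the all-zero row,
    # so indexing it with [0] fails.
    np.apply_along_axis(lambda a: np.flatnonzero(a)[0], axis=1, arr=arr)
    one_liner_failed = False
except IndexError:
    one_liner_failed = True

print(one_liner_failed)  # True

# Dropping the all-zero row satisfies the assumption.
ok = np.apply_along_axis(lambda a: np.flatnonzero(a)[0], axis=1, arr=arr[:3])
print(ok)  # [0 0 2]
```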
Possible variations
ORIGINAL ANSWER
Here is an alternative using numpy.argwhere, which returns the indices of the non-zero elements of an array:

    import numpy as np

    array = np.array([0,0,0,1,2,3,0,0])
    # argwhere returns shape (N, 1) for 1-D input; squeeze flattens it to (N,)
    nonzero_indx = np.argwhere(array).squeeze()
    start, end = (nonzero_indx[0], nonzero_indx[-1])
    print(array[start], array[end])
gives:
1 3
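One caveat with the squeeze() call: when the array has exactly one non-zero element, the result collapses to a 0-dimensional array that can no longer be indexed with [0]. A hedged sketch of that edge case follows, with np.flatnonzero swapped in as one possible workaround; the array single is made up for illustration.

```python
import numpy as np

single = np.array([0, 0, 7, 0])

# With one match, argwhere gives shape (1, 1) and squeeze() reduces it to 0-d.
squeezed = np.argwhere(single).squeeze()
print(squeezed.ndim)  # 0

# flatnonzero always returns a 1-D array, so [0] and [-1] stay valid.
idx = np.flatnonzero(single)
start, end = idx[0], idx[-1]
print(single[start], single[end])  # 7 7
```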