I am experimenting with the numpy.where(condition[, x, y]) function.
 From the numpy documentation, I learn that if you give just one array as input, it should return the indices where the array is non-zero (i.e. "True"):
If only condition is given, return the tuple condition.nonzero(), the indices where condition is True.
But if I try it, it returns a tuple of two elements, where the first is the wanted list of indices, and the second is a null element:

>>> import numpy as np
>>> array = np.array([1,2,3,4,5,6,7,8,9])
>>> np.where(array>4)
(array([4, 5, 6, 7, 8]),) # notice the comma before the last parenthesis

So the question is: why? What is the purpose of this behaviour? In what situation is this useful? Indeed, to get the wanted list of indices I have to add the indexing, as in np.where(array>4)[0], which seems... "ugly".
ADDENDUM
I understand (from some answers) that it is actually a tuple of just one element. Still, I don't understand why the output is given in this way. To illustrate how this is not ideal, consider the following error (which motivated my question in the first place):

>>> import numpy as np
>>> array = np.array([1,2,3,4,5,6,7,8,9])
>>> pippo = np.where(array>4)
>>> pippo + 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate tuple (not "int") to tuple

so that you need to do some indexing to access the actual array of indices:

>>> pippo[0] + 1
array([5, 6, 7, 8, 9])
numpy.where returns a tuple because each element of the tuple refers to a dimension: the first element holds the indices of the matching elements along the first dimension, the second element holds their indices along the second dimension, and so on.
A numpy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.
Indexing with a boolean condition returns a new numpy array, filtered by that condition, which is an array-like of boolean values. For example, if condition is array([True, True, False]) and our array is a = np.array([[1, 2, 3]]), then applying the condition along the second axis ( a[:, condition] ) gives the array array([[1, 2]]).
numpy.where() in Python: the numpy.where() function returns the indices of elements in an input array where the given condition is satisfied.
In Python, (1) means just 1. Parentheses can be freely added to group numbers and expressions for human readability (compare (1+3)*3 with (1+3,)*3). Thus, to denote a 1-element tuple, Python uses (1,) (and requires you to use it as well).
Thus
(array([4, 5, 6, 7, 8]),)

is a one-element tuple, that element being an array.
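A minimal sketch of the grouping-versus-tuple distinction, using the array from the question. The variable names (grouped, repeated, result, indices) are only illustrative:

```python
import numpy as np

grouped = (1 + 3) * 3      # parentheses only group the sum: the int 12
repeated = (1 + 3,) * 3    # the trailing comma builds a tuple: (4, 4, 4)

result = np.where(np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]) > 4)
print(type(result), len(result))   # a tuple holding exactly one array

(indices,) = result                # unpack the single element
print(indices)                     # [4 5 6 7 8]
```

So the trailing comma in the printed result is just Python's way of showing a tuple of length one; there is no hidden second element.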
If you applied where to a 2d array, the result would be a 2 element tuple.
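As a quick illustration of the 2d case, here is a sketch with a small made-up array (a2 is not from the question; it is assumed only for demonstration):

```python
import numpy as np

# Hypothetical 2-D array, chosen only to illustrate the point.
a2 = np.array([[0, 1, 0],
               [2, 0, 3]])

rows_cols = np.where(a2 > 0)
print(len(rows_cols))      # 2: one index array per dimension

rows, cols = rows_cols     # unpack row indices and column indices
print(rows)                # [0 1 1]
print(cols)                # [1 0 2]
```

Each matching element is described by one entry from rows and the entry at the same position in cols.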
The result of where is such that it can be plugged directly into an indexing slot, e.g.
a[where(a>0)]
a[a>0]

should return the same things, as would

I,J = where(a>0)   # a is 2d
a[I,J]
a[(I,J)]

Or with your example:
In [278]: a=np.array([1,2,3,4,5,6,7,8,9])
In [279]: np.where(a>4)
Out[279]: (array([4, 5, 6, 7, 8], dtype=int32),)  # tuple

In [280]: a[np.where(a>4)]
Out[280]: array([5, 6, 7, 8, 9])

In [281]: I=np.where(a>4)
In [282]: I
Out[282]: (array([4, 5, 6, 7, 8], dtype=int32),)
In [283]: a[I]
Out[283]: array([5, 6, 7, 8, 9])

In [286]: i, = np.where(a>4)   # note the , on LHS
In [287]: i
Out[287]: array([4, 5, 6, 7, 8], dtype=int32)  # not tuple
In [288]: a[i]
Out[288]: array([5, 6, 7, 8, 9])
In [289]: a[(i,)]
Out[289]: array([5, 6, 7, 8, 9])

======================
np.flatnonzero shows the correct way of returning just one array, regardless of the dimensions of the input array.
In [299]: np.flatnonzero(a>4)
Out[299]: array([4, 5, 6, 7, 8], dtype=int32)
In [300]: np.flatnonzero(a>4)+10
Out[300]: array([14, 15, 16, 17, 18], dtype=int32)

Its doc says:
This is equivalent to a.ravel().nonzero()[0]
In fact that is literally what the function does.
Flattening a removes the question of what to do with multiple dimensions. The function then takes the result out of the tuple, giving you a plain array. With flattening, it doesn't have to make a special case for 1d arrays.
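To see the flattening at work on something other than a 1d array, here is a small sketch with an assumed 2d array m (not part of the original example):

```python
import numpy as np

# Hypothetical 2-D array used only for illustration.
m = np.array([[0, 5, 0],
              [7, 0, 9]])

flat_idx = np.flatnonzero(m > 4)
print(flat_idx)                       # [1 3 5]: positions in the flattened array

# The equivalent spelled out by hand, as the doc describes.
by_hand = (m > 4).ravel().nonzero()[0]
print(np.array_equal(flat_idx, by_hand))

# Flat indices address the raveled array, not the 2-D one.
print(m.ravel()[flat_idx])            # [5 7 9]
```

The result is always a single plain array, at the cost of losing the row/column structure of the indices.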
===========================
@Divakar suggests np.argwhere:
In [303]: np.argwhere(a>4)
Out[303]: 
array([[4],
       [5],
       [6],
       [7],
       [8]], dtype=int32)

which does np.transpose(np.where(a>4))
Or if you don't like the column vector, you could transpose it again
In [307]: np.argwhere(a>4).T
Out[307]: array([[4, 5, 6, 7, 8]], dtype=int32)

except now it is a 1xn array.
We could just as well have wrapped where in array:
In [311]: np.array(np.where(a>4))
Out[311]: array([[4, 5, 6, 7, 8]], dtype=int32)

Lots of ways of taking an array out of the where tuple ([0], i,=, transpose, array, etc).
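The relationship between argwhere and where is perhaps easier to see in 2d. A sketch, assuming a small made-up array b:

```python
import numpy as np

# Hypothetical 2-D array used only for illustration.
b = np.array([[0, 5, 0],
              [7, 0, 9]])

pairs = np.argwhere(b > 4)
print(pairs)        # one (row, col) pair per match: [0 1], [1 0], [1 2]

# Stacking the where tuple and transposing gives the same layout.
stacked = np.transpose(np.where(b > 4))
print(np.array_equal(pairs, stacked))
```

argwhere is convenient for looping over matches one at a time, while the where tuple is the form that can be used directly as an index.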
Short answer: np.where is designed to have consistent output regardless of the dimension of the array.
A two-dimensional array has two indices, so the result of np.where is a length-2 tuple containing the relevant indices. This generalizes to a length-3 tuple for 3-dimensions, a length-4 tuple for 4 dimensions, or a length-N tuple for N dimensions. By this rule, it is clear that in 1 dimension, the result should be a length-1 tuple.
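The rule can be checked with a short sketch. The all-True conditions and the array x are assumptions made up for the demonstration:

```python
import numpy as np

# Tuple length returned by np.where for conditions of 1, 2 and 3 dimensions.
lengths = {}
for ndim in (1, 2, 3):
    cond = np.ones((2,) * ndim, dtype=bool)   # hypothetical all-True condition
    lengths[ndim] = len(np.where(cond))
print(lengths)    # tuple length matches the number of dimensions

# Because of this, the result works as an index for any dimensionality.
x = np.arange(24).reshape(2, 3, 4)
via_where = x[np.where(x > 10)]
via_mask = x[x > 10]
print(np.array_equal(via_where, via_mask))
```

If 1d input returned a bare array instead, code written for N dimensions would need a special case for N = 1.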