I want to check if all values in the columns of a numpy array/matrix are the same. I tried to use reduce of the ufunc equal, but it doesn't seem to work in all cases:
In [55]: a = np.array([[1,1,0],[1,-1,0],[1,0,0],[1,1,0]])

In [56]: a
Out[56]:
array([[ 1,  1,  0],
       [ 1, -1,  0],
       [ 1,  0,  0],
       [ 1,  1,  0]])

In [57]: np.equal.reduce(a)
Out[57]: array([ True, False,  True], dtype=bool)

In [58]: a = np.array([[1,1,0],[1,0,0],[1,0,0],[1,1,0]])

In [59]: a
Out[59]:
array([[1, 1, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 1, 0]])

In [60]: np.equal.reduce(a)
Out[60]: array([ True,  True,  True], dtype=bool)
Why does the middle column in the second case also evaluate to True, while it should be False?
Thanks for any help!
The numpy.array_equiv() function can also be used to check whether two arrays are equal. It returns True if the two arrays are shape consistent (the same shape, or one can be broadcast to the shape of the other) and all the elements are equal, and returns False otherwise.
To check if two NumPy arrays A and B are equal: Use a comparison operator (==) to form a comparison array. Check if all the elements in the comparison array are True.
Method 1: We generally use the == operator to compare two NumPy arrays, which generates a new array object. Call ndarray.all() on the new array object to return True if the two NumPy arrays are equivalent.
In NumPy, the shape() function gives the number of rows and columns. Pass it a matrix and it returns a tuple containing the number of rows and the number of columns.
In [45]: a
Out[45]:
array([[1, 1, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 1, 0]])
Compare each value to the corresponding value in the first row:
In [46]: a == a[0,:]
Out[46]:
array([[ True,  True,  True],
       [ True, False,  True],
       [ True, False,  True],
       [ True,  True,  True]], dtype=bool)
A column shares a common value if all the values in that column are True:
In [47]: np.all(a == a[0,:], axis = 0)
Out[47]: array([ True, False,  True], dtype=bool)
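The comparison against the first row can be wrapped in a small helper. This is only a sketch: the name columns_are_constant is made up for illustration, and it assumes a non-empty 2-D input.

```python
import numpy as np

def columns_are_constant(a):
    """Return one boolean per column: True if every value in it is the same."""
    a = np.asarray(a)
    # The first row is broadcast against every row; a column passes only
    # if all of its comparisons come out True.
    return np.all(a == a[0, :], axis=0)

a = np.array([[1, 1, 0], [1, 0, 0], [1, 0, 0], [1, 1, 0]])
print(columns_are_constant(a).tolist())  # [True, False, True]
```

Because it relies on ==, the same helper also accepts floating point arrays, although exact equality on floats may be stricter than you want.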
The problem with np.equal.reduce can be seen by micro-analyzing what happens when it is applied to [1, 0, 0, 1]:
In [49]: np.equal.reduce([1, 0, 0, 1])
Out[49]: True
The first two items, 1 and 0, are tested for equality and the result is False:
In [51]: np.equal.reduce([False, 0, 1])
Out[51]: True
Now False and 0 are tested for equality and the result is True:
In [52]: np.equal.reduce([True, 1])
Out[52]: True
But True and 1 are equal, so the total result is True, which is not the desired outcome.
The problem is that reduce tries to accumulate the result "locally", while we want a "global" test like np.all.
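The pairwise fold walked through above can be reproduced in plain Python. This is a model of the steps as described, not NumPy's internal implementation; itertools.accumulate is used only to expose the intermediate results.

```python
from itertools import accumulate
from operator import eq

column = [1, 0, 0, 1]

# Each step compares the previous *result* with the next item, so after
# the first step the running value is no longer an entry of the column.
running = list(accumulate(column, eq))
print(running)  # [1, False, True, True]

# A "global" test compares every item with the first one instead.
print(all(x == column[0] for x in column))  # False
```

The last element of running is the value the fold returns, which is why the column looks constant even though it is not.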
Given ubuntu's awesome explanation, you can use reduce to solve your problem, but you have to apply it to bitwise_and and bitwise_or rather than equal. As a consequence, this will not work with floating point arrays:
In [60]: np.bitwise_and.reduce(a) == a[0]
Out[60]: array([ True, False,  True], dtype=bool)

In [61]: np.bitwise_and.reduce(b) == b[0]
Out[61]: array([ True, False,  True], dtype=bool)
Basically, you are comparing the bits of each element in the column. Identical bits are unchanged. Different bits are set to zero. This way, any number that has a zero bit instead of a one bit will change the reduced value. However, bitwise_and will not trap the case where bits are introduced rather than removed:
In [62]: c = np.array([[1,0,0],[1,0,0],[1,0,0],[1,1,0]])

In [63]: c
Out[63]:
array([[1, 0, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 1, 0]])

In [64]: np.bitwise_and.reduce(c) == c[0]
Out[64]: array([ True,  True,  True], dtype=bool)
The second column is clearly wrong. We need to use bitwise_or to trap new bits:
In [66]: np.bitwise_or.reduce(c) == c[0]
Out[66]: array([ True, False,  True], dtype=bool)
Final Answer
In [69]: np.logical_and(np.bitwise_or.reduce(a) == a[0], np.bitwise_and.reduce(a) == a[0])
Out[69]: array([ True, False,  True], dtype=bool)

In [70]: np.logical_and(np.bitwise_or.reduce(b) == b[0], np.bitwise_and.reduce(b) == b[0])
Out[70]: array([ True, False,  True], dtype=bool)

In [71]: np.logical_and(np.bitwise_or.reduce(c) == c[0], np.bitwise_and.reduce(c) == c[0])
Out[71]: array([ True, False,  True], dtype=bool)
This method is more restrictive and less elegant than ubunut's suggestion of using all, but it has the advantage of not creating enormous temporary arrays if your input is enormous. The temporary arrays should only be as big as the first row of your matrix.
EDIT
Based on this Q/A and the bug I filed with numpy, the solution provided only works because your array contains zeros and ones. As it happens, the bitwise_and.reduce() operations shown can only ever return zero or one because bitwise_and.identity is 1, not -1. I am keeping this answer in the hope that numpy gets fixed and the answer becomes valid.
Edit
Looks like there will in fact be a change to numpy soon. Certainly to bitwise_and.identity, and also possibly an optional parameter to reduce.
Edit
Good news everyone. The identity for np.bitwise_and has been set to -1 as of version 1.12.0.
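With that fix in place, the combined bitwise check should hold for integers other than zero and one. Here is a rough sketch, assuming NumPy 1.12.0 or later and an integer array; the function name and the dtype guard are my own additions for illustration.

```python
import numpy as np

def columns_are_constant_bitwise(a):
    """Integer-only check built from bitwise_and / bitwise_or reductions."""
    a = np.asarray(a)
    if not np.issubdtype(a.dtype, np.integer):
        raise TypeError("bitwise reductions need an integer array")
    first = a[0]
    # AND clears any bit that is missing somewhere in the column ...
    no_bits_removed = np.bitwise_and.reduce(a) == first
    # ... and OR sets any bit that shows up somewhere in the column.
    no_bits_added = np.bitwise_or.reduce(a) == first
    return np.logical_and(no_bits_removed, no_bits_added)

d = np.array([[5, 2, -1], [5, 3, -1], [5, 2, -1]])
print(columns_are_constant_bitwise(d).tolist())  # [True, False, True]
```

The middle column of d passes the bitwise_and test (2 & 3 & 2 is 2) and is only caught by bitwise_or, which is why both reductions are needed.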