How to check if all values in the columns of a numpy matrix are the same?

I want to check whether all values in each column of a numpy array/matrix are the same. I tried using the reduce method of the equal ufunc, but it doesn't seem to work in all cases:

In [55]: a = np.array([[1,1,0],[1,-1,0],[1,0,0],[1,1,0]])

In [56]: a
Out[56]:
array([[ 1,  1,  0],
       [ 1, -1,  0],
       [ 1,  0,  0],
       [ 1,  1,  0]])

In [57]: np.equal.reduce(a)
Out[57]: array([ True, False,  True], dtype=bool)

In [58]: a = np.array([[1,1,0],[1,0,0],[1,0,0],[1,1,0]])

In [59]: a
Out[59]:
array([[1, 1, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 1, 0]])

In [60]: np.equal.reduce(a)
Out[60]: array([ True,  True,  True], dtype=bool)

Why does the middle column in the second case also evaluate to True, while it should be False?

Thanks for any help!

tobigue asked Feb 13 '13 17:02



2 Answers

In [45]: a
Out[45]:
array([[1, 1, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 1, 0]])

Compare each value to the corresponding value in the first row:

In [46]: a == a[0,:]
Out[46]:
array([[ True,  True,  True],
       [ True, False,  True],
       [ True, False,  True],
       [ True,  True,  True]], dtype=bool)

A column shares a common value if all the values in that column are True:

In [47]: np.all(a == a[0,:], axis = 0)
Out[47]: array([ True, False,  True], dtype=bool)
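The two steps above can be wrapped in a small helper. This is only a minimal sketch: the name columns_all_equal is hypothetical (it is not part of NumPy), and it assumes a 2-D input with at least one row.

```python
import numpy as np

def columns_all_equal(a):
    """Return one boolean per column: True if every value in that
    column equals the value in the first row."""
    a = np.asarray(a)
    # Broadcasting compares every row against the first row element-wise.
    # np.all(..., axis=0) then collapses the rows into one flag per column.
    return np.all(a == a[0, :], axis=0)

a = np.array([[1, 1, 0], [1, 0, 0], [1, 0, 0], [1, 1, 0]])
result = columns_all_equal(a)
print(result)  # True, False, True for the three columns
```

Note that a == a[0, :] builds a boolean array with the same shape as a before it is reduced.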

The problem with np.equal.reduce can be seen by micro-analyzing what happens when it is applied to [1, 0, 0, 1]:

In [49]: np.equal.reduce([1, 0, 0, 1])
Out[49]: True

The first two items, 1 and 0, are tested for equality and the result is False. That result is then carried forward, as if reducing:

In [51]: np.equal.reduce([False, 0, 1])
Out[51]: True

Now False and 0 are tested for equality and the result is True:

In [52]: np.equal.reduce([True, 1])
Out[52]: True

But True and 1 are equal, so the total result is True, which is not the desired outcome.

The problem is that reduce tries to accumulate the result "locally", while we want a "global" test like np.all.
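The chain of comparisons described above can be reproduced in plain Python. This sketch emulates the left-to-right fold with functools.reduce and operator.eq rather than calling NumPy, on the assumption that np.equal.reduce folds in the order the walkthrough describes.

```python
from functools import reduce
import operator

values = [1, 0, 0, 1]

# Left fold, step by step: ((1 == 0) == 0) == 1
step1 = operator.eq(values[0], values[1])  # 1 == 0     -> False
step2 = operator.eq(step1, values[2])      # False == 0 -> True
step3 = operator.eq(step2, values[3])      # True == 1  -> True

# The same chain in a single call.
folded = reduce(operator.eq, values)

# The "global" test: compare every element against the first one.
all_same = all(v == values[0] for v in values)

print(step1, step2, step3, folded, all_same)  # False True True True False
```

Each step compares the previous boolean result with the next element, so the original values are lost after the first comparison.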

unutbu answered Oct 09 '22 06:10


Given unutbu's awesome explanation, you can use reduce to solve your problem, but you have to apply it to bitwise_and and bitwise_or rather than equal. Because these are bitwise operations, this will not work with floating point arrays:

In [60]: np.bitwise_and.reduce(a) == a[0]
Out[60]: array([ True, False,  True], dtype=bool)

In [61]: np.bitwise_and.reduce(b) == b[0]
Out[61]: array([ True, False,  True], dtype=bool)

Basically, you are comparing the bits of each element in the column. Identical bits are unchanged. Different bits are set to zero. This way, any number that has a zero bit where the others have a one will change the reduced value. However, bitwise_and will not catch the case where bits are introduced rather than removed:

In [62]: c = np.array([[1,0,0],[1,0,0],[1,0,0],[1,1,0]])

In [63]: c
Out[63]:
array([[1, 0, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 1, 0]])

In [64]: np.bitwise_and.reduce(c) == c[0]
Out[64]: array([ True,  True,  True], dtype=bool)

The result for the second column is clearly wrong. We need to use bitwise_or to trap new bits:

In [66]: np.bitwise_or.reduce(c) == c[0]
Out[66]: array([ True, False,  True], dtype=bool)
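The same effect can be seen on single Python integers, which may make the bit-level reasoning easier to follow. The values 6 (0b110) and 2 (0b010) are chosen here purely for illustration.

```python
# Case 1: the later value is missing a bit that the first value has.
first, other = 0b110, 0b010
and_catches_removed = (first & other) != first   # AND drops the bit

# Case 2: the later value has an extra bit that the first value lacks.
first, other = 0b010, 0b110
and_catches_added = (first & other) != first     # AND leaves 'first' unchanged
or_catches_added = (first | other) != first      # OR picks up the extra bit

print(and_catches_removed, and_catches_added, or_catches_added)  # True False True
```

AND detects removed bits and OR detects added bits, which is why the two checks are combined.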

Final Answer

In [69]: np.logical_and(np.bitwise_or.reduce(a) == a[0], np.bitwise_and.reduce(a) == a[0])
Out[69]: array([ True, False,  True], dtype=bool)

In [70]: np.logical_and(np.bitwise_or.reduce(b) == b[0], np.bitwise_and.reduce(b) == b[0])
Out[70]: array([ True, False,  True], dtype=bool)

In [71]: np.logical_and(np.bitwise_or.reduce(c) == c[0], np.bitwise_and.reduce(c) == c[0])
Out[71]: array([ True, False,  True], dtype=bool)

This method is more restrictive and less elegant than unutbu's suggestion of using all, but it has the advantage of not creating enormous temporary arrays if your input is enormous. The temporary arrays should only be as big as the first row of your matrix.
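If the size of the temporaries matters and the data may be floating point, a different technique (not the one used in this answer) is to compare each column's minimum with its maximum. The sketch below uses a hypothetical helper name and assumes the array contains no NaN values, since NaN never compares equal to anything.

```python
import numpy as np

def columns_constant_minmax(a):
    """Return one boolean per column: True if the column's minimum
    equals its maximum, i.e. all values in the column are the same."""
    a = np.asarray(a)
    # Both reductions return arrays with one entry per column.
    return a.min(axis=0) == a.max(axis=0)

f = np.array([[0.5, 1.0, -2.0], [0.5, 2.0, -2.0], [0.5, 1.0, -2.0]])
result = columns_constant_minmax(f)
print(result)  # True, False, True for the three columns
```

This makes two passes over the data instead of one comparison plus one reduction.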

EDIT

Based on this Q/A and the bug I filed with numpy, the solution provided only works because your array contains zeros and ones. As it happens, the bitwise_and.reduce() operations shown can only ever return zero or one because bitwise_and.identity is 1, not -1. I am keeping this answer in the hope that numpy gets fixed and the answer becomes valid.

Edit

Looks like there will in fact be a change to numpy soon. Certainly to bitwise_and.identity, and also possibly an optional parameter to reduce.

Edit

Good news everyone. The identity for np.bitwise_and has been set to -1 as of version 1.12.0.
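With that change in place, the combined check should also hold for integers other than zero and one. The following is a minimal sketch, assuming NumPy 1.12.0 or later and an integer or boolean 2-D array; the function name is hypothetical.

```python
import numpy as np

def columns_all_equal_bitwise(a):
    """Return one boolean per column using only bitwise reductions.
    Works for integer and boolean arrays, not for floating point."""
    a = np.asarray(a)
    first = a[0]
    # AND keeps only the bits set in every row: catches bits that go missing.
    and_ok = np.bitwise_and.reduce(a, axis=0) == first
    # OR keeps bits set in any row: catches bits that are added.
    or_ok = np.bitwise_or.reduce(a, axis=0) == first
    return np.logical_and(and_ok, or_ok)

d = np.array([[6, 6, 0], [6, 2, 0], [6, 6, 0]])
result = columns_all_equal_bitwise(d)
print(result)  # True, False, True for the three columns
```

A column passes only if both the AND and the OR of all its values equal its first value, which forces every value to have exactly the same bits.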

Mad Physicist answered Oct 09 '22 07:10