How to check if all values in the columns of a numpy matrix are the same?

I want to check if all values in the columns of a numpy array/matrix are the same. I tried to use reduce of the ufunc equal, but it doesn't seem to work in all cases:

    In [55]: a = np.array([[1,1,0],[1,-1,0],[1,0,0],[1,1,0]])

    In [56]: a
    Out[56]:
    array([[ 1,  1,  0],
           [ 1, -1,  0],
           [ 1,  0,  0],
           [ 1,  1,  0]])

    In [57]: np.equal.reduce(a)
    Out[57]: array([ True, False,  True], dtype=bool)

    In [58]: a = np.array([[1,1,0],[1,0,0],[1,0,0],[1,1,0]])

    In [59]: a
    Out[59]:
    array([[1, 1, 0],
           [1, 0, 0],
           [1, 0, 0],
           [1, 1, 0]])

    In [60]: np.equal.reduce(a)
    Out[60]: array([ True,  True,  True], dtype=bool)

Why does the middle column in the second case also evaluate to True, while it should be False?

Thanks for any help!

743

asked Feb 13 '13 17:02

2 Answers

    In [45]: a
    Out[45]:
    array([[1, 1, 0],
           [1, 0, 0],
           [1, 0, 0],
           [1, 1, 0]])

Compare each value to the corresponding value in the first row:

    In [46]: a == a[0,:]
    Out[46]:
    array([[ True,  True,  True],
           [ True, False,  True],
           [ True, False,  True],
           [ True,  True,  True]], dtype=bool)

A column shares a common value if all the values in that column are True:

    In [47]: np.all(a == a[0,:], axis = 0)
    Out[47]: array([ True, False,  True], dtype=bool)
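The two steps above can be wrapped into a small helper. This is a minimal sketch; the function name `columns_constant` is made up for illustration and is not part of NumPy.

```python
import numpy as np

def columns_constant(arr):
    """Return one boolean per column: True if every entry equals the first row's entry."""
    arr = np.asarray(arr)
    # Broadcasting compares each row against the first row, element by element.
    matches_first_row = (arr == arr[0, :])
    # A column is constant only if every comparison in it is True.
    return np.all(matches_first_row, axis=0)

a = np.array([[1, 1, 0],
              [1, 0, 0],
              [1, 0, 0],
              [1, 1, 0]])
result = columns_constant(a)
print(result)  # [ True False  True]
```

Because the test relies on `==`, a column consisting only of NaN is reported as not constant, since NaN never compares equal to itself.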

The problem with np.equal.reduce can be seen by micro-analyzing what happens when it is applied to [1, 0, 0, 1]:

    In [49]: np.equal.reduce([1, 0, 0, 1])
    Out[49]: True

The first two items, 1 and 0, are tested for equality and the result is False:

    In [51]: np.equal.reduce([False, 0, 1])
    Out[51]: True

Now False and 0 are tested for equality and the result is True:

    In [52]: np.equal.reduce([True, 1])
    Out[52]: True

But True and 1 are equal, so the total result is True, which is not the desired outcome.

The problem is that reduce tries to accumulate the result "locally", while we want a "global" test like np.all.
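The pairwise folding described above can be traced in plain Python. This sketch only mimics the step-by-step reasoning with `functools.reduce` and `operator.eq`; it does not call `np.equal.reduce` itself, and the variable names are illustrative.

```python
from functools import reduce
import operator

column = [1, 0, 0, 1]

# "Local" accumulation: each comparison result is fed into the next comparison,
# i.e. ((1 == 0) == 0) == 1.
steps = []
acc = column[0]
for value in column[1:]:
    acc = (acc == value)
    steps.append(acc)
print(steps)  # [False, True, True]

# The same fold written as a one-liner.
folded = reduce(operator.eq, column)
print(folded)  # True, even though the values are not all equal

# "Global" test: compare every value against the first one, then combine.
all_equal = all(value == column[0] for value in column)
print(all_equal)  # False
```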

137

answered Oct 09 '22 06:10

Given unutbu's awesome explanation, you can use reduce to solve your problem, but you have to apply it to bitwise_and and bitwise_or rather than equal. As a consequence, this will not work with floating point arrays:

    In [60]: np.bitwise_and.reduce(a) == a[0]
    Out[60]: array([ True, False,  True], dtype=bool)

    In [61]: np.bitwise_and.reduce(b) == b[0]
    Out[61]: array([ True, False,  True], dtype=bool)

Basically, you are comparing the bits of each element in the column. Identical bits are unchanged. Different bits are set to zero. This way, any number that has a zero instead of a one bit will change the reduced value. bitwise_and will not trap the case where bits are introduced rather than removed:

    In [62]: c = np.array([[1,0,0],[1,0,0],[1,0,0],[1,1,0]])

    In [63]: c
    Out[63]:
    array([[1, 0, 0],
           [1, 0, 0],
           [1, 0, 0],
           [1, 1, 0]])

    In [64]: np.bitwise_and.reduce(c) == c[0]
    Out[64]: array([ True,  True,  True], dtype=bool)

The result for the second column is clearly wrong. We need to use bitwise_or to trap new bits:

    In [66]: np.bitwise_or.reduce(c) == c[0]
    Out[66]: array([ True, False,  True], dtype=bool)
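To see why both reductions are needed, it may help to look at single values with plain Python integers. The values `0b100`, `0b110` and `0b000` are arbitrary, chosen only to show one case with an added bit and one with a removed bit.

```python
first = 0b100   # value in the first row of a column
other = 0b110   # a later value with one extra bit set

and_result = first & other   # the extra bit is dropped, so AND still equals `first`
or_result = first | other    # the extra bit survives, so OR differs from `first`

print(and_result == first)   # True  (difference missed by AND)
print(or_result == first)    # False (difference caught by OR)

# The opposite case: a later value is missing a bit that `first` has.
fewer = 0b000
print((first & fewer) == first)  # False (difference caught by AND)
print((first | fewer) == first)  # True  (difference missed by OR)
```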

Final Answer

    In [69]: np.logical_and(np.bitwise_or.reduce(a) == a[0], np.bitwise_and.reduce(a) == a[0])
    Out[69]: array([ True, False,  True], dtype=bool)

    In [70]: np.logical_and(np.bitwise_or.reduce(b) == b[0], np.bitwise_and.reduce(b) == b[0])
    Out[70]: array([ True, False,  True], dtype=bool)

    In [71]: np.logical_and(np.bitwise_or.reduce(c) == c[0], np.bitwise_and.reduce(c) == c[0])
    Out[71]: array([ True, False,  True], dtype=bool)

This method is more restrictive and less elegant than unutbu's suggestion of using all, but it has the advantage of not creating enormous temporary arrays if your input is enormous. The temporary arrays should only be as big as the first row of your matrix.

EDIT

Based on this Q/A and the bug I filed with numpy, the solution provided only works because your array contains zeros and ones. As it happens, the bitwise_and.reduce() operations shown can only ever return zero or one because bitwise_and.identity is 1, not -1. I am keeping this answer in the hope that numpy gets fixed and the answer becomes valid.

Edit

Looks like there will in fact be a change to numpy soon. Certainly to bitwise_and.identity, and also possibly an optional parameter to reduce.

Edit

Good news everyone. The identity for np.bitwise_and has been set to -1 as of version 1.12.0.
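With that fix in place, the combined test can be tried on integers other than zero and one. The following is a sketch that assumes NumPy 1.12.0 or later and an integer array; the helper name `columns_constant_bitwise` and the sample array `d` are made up for illustration.

```python
import numpy as np

def columns_constant_bitwise(arr):
    """Per-column equality test using bitwise reductions (integer arrays only)."""
    arr = np.asarray(arr)
    if not np.issubdtype(arr.dtype, np.integer):
        raise TypeError("bitwise reductions need an integer array")
    first = arr[0]
    # AND keeps only bits present in every row: catches rows that lack a bit of `first`.
    no_bits_removed = np.bitwise_and.reduce(arr, axis=0) == first
    # OR collects bits present in any row: catches rows that add a bit to `first`.
    no_bits_added = np.bitwise_or.reduce(arr, axis=0) == first
    return np.logical_and(no_bits_removed, no_bits_added)

d = np.array([[6, 3, -4],
              [6, 7, -4],
              [6, 3, -4]])
result = columns_constant_bitwise(d)
print(result)  # [ True False  True]
```

Floating point input is rejected up front, since the bitwise ufuncs are not defined for floats.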

answered Oct 09 '22 07:10

Mad Physicist


Tags: python, matrix, numpy

tobigue

unutbu

Mad Physicist
