
How exactly does the behavior of Python bool and numpy bool_ differ?

Tags: python, numpy

TLDR: is-comparison works with Python bools and doesn't work with numpy bool_s. Are there any other differences?


I ran into some strange behaviour of booleans a couple of days ago. When I tried to use an is-comparison on this numpy array:

import numpy as np
arr1 = np.array([1,0,2,0], dtype=bool)
arr1

Out[...]: array([ True, False,  True, False])

(These variable names are based on fiction and any similarity to real variable names or production code are purely coincidental)

I saw this result:

arr1 is True

Out[...]: False

That was logical, because arr1 is not True or False; it is a numpy array. I checked this:

arr1 == True

Out[...]: array([ True, False,  True, False])

This worked as expected. I noted this cute behaviour and forgot about it immediately. The next day I checked the truthiness of the array elements:

[elem is False for elem in arr1]

and it returned this!

Out[...]: [False, False, False, False]

I was really confused, because I remembered that with a plain Python list (I thought the problem was in how arrays behave):

arr2 = [True, False, True, False]
[elem is False for elem in arr2]

it works:

Out[...]: [False, True, False, True]

Moreover, it worked with another numpy array of mine:

very_cunning_arr = np.array([1, False, 2, False, []])
[elem is False for elem in very_cunning_arr]

Out[...]: [False, True, False, True, False]

When I dug into the arrays, I discovered that very_cunning_arr had been given the object dtype because of a couple of non-numeric elements, so it contained Python bools, whereas arr1 had the numpy.bool_ dtype. So I checked how the two behave:

numpy_waka = np.bool_(True)
numpy_waka

Out[...]: True

python_waka = True
python_waka

Out[...]: True

[numpy_waka is True, python_waka is True]

And I finally found the difference:

Out[...]: [False, True]

After all of this I have two questions:

  1. Do numpy.bool_ and bool have any other differences in their common behaviour? (I know that numpy.bool_ has many numpy attributes and methods, like .T and others.)
  2. How can one check whether a numpy array contains only numpy booleans, with no Python bools?

(PS: Yes, NOW I know that comparing to True/False with is is bad):

Don't compare boolean values to True or False using ==.

Yes:   if greeting:
No:    if greeting == True:
Worse: if greeting is True:

Edit 1: As mentioned in another question, numpy has its own bool_ type. But the details of this question are a bit different: I found that is-comparisons work differently, but apart from this difference, is there anything else that differs between the common behaviour of bool_ and bool? If yes, what exactly?

vurmux asked Apr 29 '19 14:04


1 Answer

In [119]: np.array([1,0,2,0],dtype=bool)                                             
Out[119]: array([ True, False,  True, False])

In [120]: np.array([1, False, 2, False, []])                                         
Out[120]: array([1, False, 2, False, list([])], dtype=object)

Note the dtype. With object dtype, the elements of the array are Python objects, just like they are in the source list.

In the first case the array dtype is boolean. The elements represent boolean values, but they are not themselves Python True/False objects. Strictly speaking, Out[119] does not contain np.bool_ objects either. Out[119][1] has type bool_, but that is the result of 'unboxing': it's what ndarray indexing produces when you ask for an element. (This 'unboxing' distinction holds for all non-object dtypes.)
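The 'unboxing' can be observed directly. The following is a minimal sketch, not part of the original session; the variable names arr, elem and py_value are made up for illustration.

```python
import numpy as np

arr = np.array([1, 0, 2, 0], dtype=bool)

# The array stores raw one-byte values, not references to Python objects.
bytes_per_element = arr.itemsize

# Indexing creates a NumPy scalar of type np.bool_ on the way out.
elem = arr[1]
elem_is_numpy_bool = isinstance(elem, np.bool_)
elem_is_python_bool = isinstance(elem, bool)

# .item() converts that scalar into the ordinary Python bool.
py_value = elem.item()

# tolist() performs the same conversion for every element.
as_list = arr.tolist()

print(bytes_per_element, elem_is_numpy_bool, elem_is_python_bool)
print(type(py_value), py_value is False, as_list)
```

So an is-comparison against False only succeeds after .item() or tolist() has turned the element into a Python bool.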

Normally we don't create np.bool_ scalars directly, preferring np.array(True), but to follow your example:

In [124]: np.bool_(True)                                                             
Out[124]: True
In [125]: type(np.bool_(True))                                                       
Out[125]: numpy.bool_
In [126]: np.bool_(True) is True                                                     
Out[126]: False
In [127]: type(True)                                                                 
Out[127]: bool

is is a strict test, not just for equality, but for identity. Objects of different classes can't satisfy an is test. Objects can satisfy the == test without satisfying the is test.
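A small sketch of the equality/identity distinction, using the same two kinds of True as above. The names np_true and py_true are illustrative only.

```python
import numpy as np

np_true = np.bool_(True)  # NumPy scalar
py_true = True            # Python's True singleton

# == compares values, so the two agree.
same_value = bool(np_true == py_true)

# is compares object identity, and these are two different objects.
same_object = np_true is py_true

# They are instances of unrelated classes.
np_is_python_bool = isinstance(np_true, bool)

# Truthiness works for both, which is why a plain `if x:` is the safe test.
truthy = [bool(x) for x in (np_true, py_true)]

print(same_value, same_object, np_is_python_bool, truthy)
# prints: True False False [True, True]
```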

Let's play with the object dtype array:

In [129]: np.array([1, False, 2, np.bool_(False), []])                               
Out[129]: array([1, False, 2, False, list([])], dtype=object)
In [130]: [i is False for i in _]                                                    
Out[130]: [False, True, False, False, False]

In the Out[129] display, the two False objects display the same, but the Out[130] test shows they are different.


To focus on your questions.

  • np.bool_(False) is a unique object, but distinct from False. As you note it has many of the same attributes/methods as np.array(False).

  • If the array dtype is bool it does not contain Python bool objects. It doesn't even contain np.bool_ objects. However indexing such an array will produce a bool_. And applying item() to that in turn produces a Python bool.

  • If the array has object dtype, it will most likely contain Python bool objects, unless you've taken special steps to include bool_ objects.
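As a rough illustration of the points above, here is one behavioural difference beyond is (arithmetic), followed by one possible way to check which kind of booleans an array holds. The helper bool_kinds is hypothetical, not a NumPy function, and is only a sketch under the assumption that checking the dtype first and then the element types is good enough for your case.

```python
import numpy as np

# Python's bool is a subclass of int; np.bool_ is not.
assert issubclass(bool, int)
assert not issubclass(np.bool_, int)

py_sum = True + True                      # integer arithmetic
np_sum = np.bool_(True) + np.bool_(True)  # stays boolean (acts like logical OR)


def bool_kinds(arr):
    """Hypothetical helper: return the kinds of values found in arr.

    Possible members of the returned set: "numpy", "python", "other".
    """
    if arr.dtype == np.bool_:
        # A bool-dtype array can only ever hand out np.bool_ scalars.
        return {"numpy"}
    if arr.dtype == object:
        kinds = set()
        for x in arr.flat:
            if isinstance(x, np.bool_):
                kinds.add("numpy")
            elif isinstance(x, bool):
                kinds.add("python")
            else:
                kinds.add("other")
        return kinds
    return {"other"}


bool_arr = np.array([1, 0, 2, 0], dtype=bool)
mixed_arr = np.array([True, np.bool_(False)], dtype=object)

print(py_sum, np_sum)
print(bool_kinds(bool_arr), bool_kinds(mixed_arr))
```

For most code the dtype check alone (arr.dtype == np.bool_) answers the second question; the per-element loop only matters for object arrays.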

hpaulj answered Oct 13 '22 16:10