
What happened to Python's ~ when working with booleans?

In a pandas DataFrame, I have a series of boolean values. In order to filter to rows where the boolean is True, I can use: df[df.column_x]

I thought in order to filter to only rows where the column is False, I could use: df[~df.column_x]. I feel like I have done this before, and have seen it as the accepted answer.

However, this fails because ~df.column_x converts the values to integers. See below.

import pandas as pd  # version 0.24.2

a = pd.Series(['a', 'a', 'a', 'a', 'b', 'a', 'b', 'b', 'b', 'b'])
b = pd.Series([True, True, True, True, True, False, False, False, False, False], dtype=bool)

c = pd.DataFrame(data=[a, b]).T
c.columns = ['Classification', 'Boolean']

print(~c.Boolean)

0    -2
1    -2
2    -2
3    -2
4    -2
5    -1
6    -1
7    -1
8    -1
9    -1
Name: Boolean, dtype: object

print(~b)

0    False
1    False
2    False
3    False
4    False
5     True
6     True
7     True
8     True
9     True
dtype: bool

Basically, I can use c[~b], but not c[~c.Boolean]

Am I just dreaming that this used to work?

K Jones asked May 10 '19 14:05



1 Answer

This happens because you built c with the DataFrame constructor and then transposed it with .T.

First, let us look at what we have before the transpose:

pd.DataFrame([a, b])
Out[610]: 
      0     1     2     3     4      5      6      7      8      9
0     a     a     a     a     b      a      b      b      b      b
1  True  True  True  True  True  False  False  False  False  False

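The constructor treats each Series as a row, so every column of this intermediate frame mixes a string with a boolean. pandas stores one dtype per column, and a column with mixed types falls back to object.

Transposing does not restore the original types, so both columns of your c are still object: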

c.dtypes
Out[608]: 
Classification    object
Boolean           object

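The Boolean column became object dtype, which is why ~c.Boolean gives unexpected output: on an object column, ~ is applied to each Python bool as an integer, so True becomes -2 and False becomes -1.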
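The -2 and -1 in the question's output follow from plain Python integer rules rather than anything specific to pandas. A minimal sketch, assuming the object column holds ordinary Python bool values (the `values` list below is only a stand-in for that column, not pandas code):

```python
# bool is a subclass of int, so True behaves as 1 and False as 0.
assert issubclass(bool, int)

# Bitwise NOT on an integer x is defined as -x - 1.
# int() is used here so the integer rule is applied explicitly.
values = [True] * 5 + [False] * 5      # stand-in for the object column
inverted = [~int(v) for v in values]   # element-by-element bitwise NOT

print(inverted)  # [-2, -2, -2, -2, -2, -1, -1, -1, -1, -1]
```

On a column with bool dtype, ~ is a logical NOT instead, which is why ~b in the question returns the expected mask.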


How to fix it? Use concat:

c = pd.concat([a, b], axis=1)
c.columns = ['Classification', 'Boolean']
~c.Boolean
Out[616]: 
0    False
1    False
2    False
3    False
4    False
5     True
6     True
7     True
8     True
9     True
Name: Boolean, dtype: bool
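If the frame already exists and rebuilding it with concat is inconvenient, two other routes may work. This is a sketch rather than part of the original answer: it assumes the object column contains only True and False, and the names `false_rows` and `d` are made up for illustration.

```python
import pandas as pd

a = pd.Series(['a', 'a', 'a', 'a', 'b', 'a', 'b', 'b', 'b', 'b'])
b = pd.Series([True] * 5 + [False] * 5, dtype=bool)

# Rebuild the question's frame: stacking the Series as rows and
# transposing leaves both columns with object dtype.
c = pd.DataFrame(data=[a, b]).T
c.columns = ['Classification', 'Boolean']

# Option 1: cast the existing object column back to bool, then filter.
c['Boolean'] = c['Boolean'].astype(bool)
false_rows = c[~c['Boolean']]

# Option 2: build the frame from a dict so each column keeps its own dtype.
d = pd.DataFrame({'Classification': a, 'Boolean': b})

print(c['Boolean'].dtype, d['Boolean'].dtype)
print(false_rows)
```

Checking c.dtypes before filtering is a quick way to confirm that the mask column really is bool.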
BENY answered Sep 29 '22 11:09