
Pandas: Appending a row of boolean values to df using `loc` changes to `int`

Consider df:

In [2098]: df = pd.DataFrame({'a': [1,2], 'b':[3,4]})

In [2099]: df
Out[2099]: 
   a  b
0  1  3
1  2  4

Now, I try to append a list of values to df:

In [2102]: df.loc[2] = [3, 4]

In [2103]: df
Out[2103]: 
   a  b
0  1  3
1  2  4
2  3  4

All's good so far.

But when I try to append a row given as a list of boolean values, the values are converted to int:

In [2104]: df.loc[3] = [True, False]

In [2105]: df
Out[2105]: 
   a  b
0  1  3
1  2  4
2  3  4
3  1  0

I know I can convert my df to str and then append boolean values, like this:

In [2131]: df = df.astype(str)
In [2133]: df.loc[3] = [True, False]

In [2134]: df
Out[2134]: 
      a      b
0     1      3
1     2      4
3  True  False

But I want to know the reason behind this behaviour. Why is it not automatically changing the dtypes of columns to object when I append boolean to it?

My Pandas version is:

In [2150]: pd.__version__
Out[2150]: '1.1.0'
Mayank Porwal asked Dec 23 '20 08:12


1 Answer

Why is it not automatically changing the dtypes of columns to object when I append boolean to it?

Because the types are being upcast (see upcasting). From the documentation:

Types can potentially be upcasted when combined with other types, meaning they are promoted from the current type (e.g. int to float).

Upcasting works according to numpy rules:

Upcasting is always according to the numpy rules. If two different dtypes are involved in an operation, then the more general one will be used as the result of the operation.
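
As a rough illustration of the quoted rule (a sketch in plain NumPy, not the code path pandas itself runs), combining an integer array with a boolean array produces the more general integer dtype, and the booleans come out as 1 and 0, just like row 3 in the question. The variable names are made up for this example:

```python
import numpy as np

# An int64 array standing in for an existing integer column
ints = np.array([1, 2, 3], dtype=np.int64)
# A boolean array standing in for the appended values
bools = np.array([True, False])

# Concatenation has to pick one dtype for the result;
# int64 can hold both kinds of values, so it is used
combined = np.concatenate([ints, bools])
print(combined.dtype)  # int64
print(combined)        # [1 2 3 1 0]
```

Since bool fits into int without loss, NumPy has no reason to fall back to object here.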

To understand how the numpy rules are applied you can use the function find_common_type, as below:

import numpy as np
# np.bool_ is the NumPy boolean type; find_common_type requires NumPy < 2.0
res = np.find_common_type([bool, np.bool_], [np.int32, np.int64])
print(res)

Output

int64
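
Note that np.find_common_type was deprecated in NumPy 1.25 and removed in NumPy 2.0. On newer versions, a similar check can be sketched with np.result_type and np.promote_types. The variable names below are only illustrative:

```python
import numpy as np

# bool combined with int64 promotes to int64
bool_int = np.result_type(np.bool_, np.int64)

# int64 combined with float64 promotes to float64
int_float = np.promote_types(np.int64, np.float64)

# object is only the result when one side is already object
bool_obj = np.promote_types(np.bool_, np.object_)

print(bool_int, int_float, bool_obj)  # int64 float64 object
```

The last line shows why the columns do not become object on their own: under these rules, object is reached only if an object dtype is already involved.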
Dani Mesejo answered Oct 21 '22 15:10