
pandas DataFrame set value on boolean mask


I'm trying to set a number of different values in a pandas DataFrame all to the same value. I thought I understood boolean indexing for pandas, but I haven't found any resources on this specific error.

    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
    mask = df.isin([1, 3, 12, 'a'])
    df[mask] = 30

    Traceback (most recent call last):
    ...
    TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

Above, I want to replace all of the True entries in the mask with the value 30.

I could do df.replace instead, but masking feels a bit more efficient and intuitive here. Can someone explain the error, and provide an efficient way to set all of the values?

Michael K asked May 29 '15 00:05





1 Answer

Unfortunately, you can't use a boolean mask to set values in place on a DataFrame with mixed dtypes. You can use pandas where to set the values instead:

    In [59]:
    df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
    mask = df.isin([1, 3, 12, 'a'])
    df = df.where(mask, other=30)
    df

    Out[59]:
        A   B
    0   1   a
    1  30  30
    2   3  30

Note that the above will fail if you pass inplace=True to the where method, so df.where(mask, other=30, inplace=True) will raise:

TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

EDIT

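OK, after a little misunderstanding: where keeps the values where the condition is True and replaces the rest, so the example above replaces the wrong entries. You can still use where by just inverting the mask: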

    In [2]:
    df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
    mask = df.isin([1, 3, 12, 'a'])
    df.where(~mask, other=30)

    Out[2]:
        A   B
    0  30  30
    1   2   b
    2  30   f
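As a side note, pandas also provides DataFrame.mask, which replaces entries where the condition is True, so the inversion can be avoided. The following is only a minimal sketch of that alternative, not part of the original answer; the names via_where and via_mask are made up for illustration, and the exact dtypes of the result may depend on the pandas version.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
mask = df.isin([1, 3, 12, 'a'])

# where() keeps entries where the condition is True and replaces the rest,
# so the mask has to be inverted to overwrite the matched entries.
via_where = df.where(~mask, other=30)

# mask() replaces entries where the condition is True, so no inversion is needed.
via_mask = df.mask(mask, other=30)

print(via_mask)
```

Both calls return a new DataFrame rather than modifying df, so assign the result back (df = df.mask(mask, other=30)) if you want to keep it.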
EdChum answered Sep 19 '22 14:09
