I'm trying to set a number of different values in a pandas DataFrame all to the same value. I thought I understood boolean indexing for pandas, but I haven't found any resources on this specific error.
    import pandas as pd

    df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
    mask = df.isin([1, 3, 12, 'a'])
    df[mask] = 30

    Traceback (most recent call last):
    ...
    TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value
Above, I want to replace all of the True entries in the mask with the value 30.
I could do df.replace instead, but masking feels a bit more efficient and intuitive here. Can someone explain the error, and provide an efficient way to set all of the values?
You can replace values in all or selected columns of a pandas DataFrame, based on a condition, by using the DataFrame.loc[] property. loc[] accesses a group of rows and columns by label(s) or a boolean array, and it can both read and modify the values of the DataFrame.
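As a rough illustration of the loc[] approach described above, here is a minimal sketch. The condition (A greater than 1) and the replacement value 'z' are made up for the example and are not from the question.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})

# Hypothetical condition: rows where column A is greater than 1.
condition = df['A'] > 1

# Set column B to 'z' (an illustrative value) on the rows that match.
df.loc[condition, 'B'] = 'z'

print(df)
```

Because loc[] here targets a single column with a row-wise boolean Series, the assignment only ever touches one dtype at a time.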
The pandas DataFrame mask() method replaces values where the condition evaluates to True. It is the opposite of the where() method, which replaces values where the condition evaluates to False.
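Applied to the DataFrame from the question, mask() might look like the sketch below. It assumes you are fine with getting a new DataFrame back rather than modifying df in place; the name result is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
mask = df.isin([1, 3, 12, 'a'])

# mask() replaces entries where the condition is True and returns a new
# DataFrame; the original df is left unchanged.
result = df.mask(mask, 30)

print(result)
```

Reassign with df = df.mask(mask, 30) if you want to keep the original variable name.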
Boolean indexing selects data from a DataFrame using a boolean vector. The vector needs one True/False entry per row; for element-wise selection, a boolean DataFrame of the same shape is used instead.
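For reference, plain row selection with a boolean vector could be sketched like this. The condition and the names row_mask and selected are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})

# A boolean Series with one True/False entry per row.
row_mask = df['A'] > 1

# Keep only the rows where the Series is True.
selected = df[row_mask]

print(selected)
```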
The pandas DataFrame bool() method returns the boolean value, True or False, of a DataFrame. It only works if the DataFrame holds exactly one value, and that value must be either True or False; otherwise bool() raises an error.
Unfortunately, you can't use a boolean mask for in-place setting on a DataFrame with mixed dtypes. You can use pandas where to set the values instead:
    In [59]: df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
             mask = df.isin([1, 3, 12, 'a'])
             df = df.where(mask, other=30)
             df
    Out[59]:
        A   B
    0   1   a
    1  30  30
    2   3  30
Note that the above will fail if you pass inplace=True to the where method, so df.where(mask, other=30, inplace=True) will raise:
TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value
EDIT
OK, after a little misunderstanding: you can still use where by just inverting the mask:
    In [2]: df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
            mask = df.isin([1, 3, 12, 'a'])
            df.where(~mask, other=30)
    Out[2]:
        A   B
    0  30  30
    1   2   b
    2  30   f
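If you really do need to modify df in place, one possible workaround (not part of the answer above, just a sketch) is to apply the mask one column at a time with loc, so that each assignment only deals with a single dtype:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
mask = df.isin([1, 3, 12, 'a'])

# Walk the columns and set the masked rows of each one separately.
# Every assignment targets a single column, so the mixed dtypes of the
# frame as a whole never come into play.
for col in df.columns:
    df.loc[mask[col], col] = 30

print(df)
```

This trades the single vectorised call for a short loop over columns, which is usually fine unless the frame is very wide.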