Let's say I have a dataframe:
   A  B    C    D    E    F
0  x  R    i    R  nan    h
1  z  g    j    x    a  nan
2  z  h  nan    y  nan  nan
3  x  g  nan  nan  nan  nan
4  x  x    h    x    s    f
I want to replace with np.nan all the cells, from row 2 onwards, that sit in a column where row 0 is 'R' (df.loc[0] == 'R') and whose own value is not 'x' (!= 'x').
Essentially I want to do:
df.loc[2:,df.loc[0]=='R']!='x' = np.nan
I get the error:
SyntaxError: can't assign to comparison
I just don't know how the syntax is supposed to be.
I've tried
df[df.loc[2:,df.loc[0]=='R']!='x']
but this doesn't list the values I want.
The fillna() method fills NA/NaN values. Its value argument is the value used to fill holes (e.g. 0); alternatively it can be a dict/Series/DataFrame specifying which value to use for each index (for a Series) or column (for a DataFrame). Values not in the dict/Series/DataFrame will not be filled.
You can replace values in all or selected columns of a pandas DataFrame based on a condition by using the DataFrame.loc[] property. loc[] accesses a group of rows and columns by label(s) or a boolean array, and it can be used both to read and to modify the values of the DataFrame.
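As a minimal sketch of conditional replacement through loc[], assuming a small made-up frame (the names toy, name and score are hypothetical and not part of the question):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the frame from the question.
toy = pd.DataFrame({"name": ["a", "b", "c"], "score": [10, -5, 7]})

# Cast to float first so the column can hold NaN.
toy["score"] = toy["score"].astype(float)

# Boolean row selector plus a column label: set negative scores to NaN.
toy.loc[toy["score"] < 0, "score"] = np.nan
```

The row selector is a boolean Series and the column selector is a plain label, so only the matching cells in that one column are overwritten.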
You can use the fillna() function to replace NaN values in a pandas DataFrame.
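A short sketch of the two fillna() forms mentioned here, a scalar and a per-column dict. The frame holes and its column names are assumptions made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one hole per row in different columns.
holes = pd.DataFrame(
    {"a": [1.0, np.nan], "b": [np.nan, 4.0], "c": [np.nan, 6.0]}
)

# A scalar fills every NaN in the frame.
filled_scalar = holes.fillna(0)

# A dict fills per column; column "c" is not listed, so its NaN stays.
filled_by_col = holes.fillna({"a": -1, "b": -2})
```

Both calls return new frames and leave holes itself untouched.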
In computing, NaN (/næn/), standing for Not a Number, is a member of a numeric data type that can be interpreted as a value that is undefined or unrepresentable, especially in floating-point arithmetic.
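One consequence worth seeing in code: NaN compares unequal to everything, including itself, so a "not equal to 'x'" test also flags NaN cells. A small sketch (the Series s is a made-up example):

```python
import numpy as np
import pandas as pd

# NaN is not equal to anything, including itself.
nan_equals_itself = np.nan == np.nan
nan_differs_from_x = np.nan != "x"

# Hypothetical object Series with a missing value in the middle.
s = pd.Series(["x", np.nan, "y"], dtype=object)

# ne() marks the NaN cell as "not equal to 'x'" as well.
flags = s.ne("x")

# isna() is the reliable way to test for missing values.
missing = s.isna()
```

For this question that behaviour is harmless: replacing a cell that is already NaN with NaN changes nothing.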
mask = df.ne('x') & df.iloc[0].eq('R')
mask.iloc[:2] = False
df.mask(mask)
   A    B    C    D    E    F
0  x    R    i    R  NaN    h
1  z    g    j    x    a  NaN
2  z  NaN  NaN  NaN  NaN  NaN
3  x  NaN  NaN  NaN  NaN  NaN
4  x    x    h    x    s    f
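To try this end to end, here is a self-contained sketch that rebuilds the frame from the question and applies the same three lines. The dtype=object argument and the variable name result are my own additions, so treat the construction as an assumption about how the original frame was built:

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question, column by column.
df = pd.DataFrame(
    {
        "A": ["x", "z", "z", "x", "x"],
        "B": ["R", "g", "h", "g", "x"],
        "C": ["i", "j", np.nan, np.nan, "h"],
        "D": ["R", "x", "y", np.nan, "x"],
        "E": [np.nan, "a", np.nan, np.nan, "s"],
        "F": ["h", np.nan, np.nan, np.nan, "f"],
    },
    dtype=object,
)

# Cells that are not 'x', restricted to columns whose first row is 'R'.
mask = df.ne("x") & df.iloc[0].eq("R")

# Leave the first two rows alone.
mask.iloc[:2] = False

# mask() returns a new frame with NaN wherever the mask is True.
result = df.mask(mask)
```

Note that df.mask(mask) does not modify df; assign the result back (df = df.mask(mask)) if you want the change to stick.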
Building the mask up step by step, df.ne('x') gives
       A      B      C      D      E      F
0  False   True   True   True   True   True
1   True   True   True  False   True   True
2   True   True   True   True   True   True
3  False   True   True   True   True   True
4  False  False   True  False   True   True
But we want that in conjunction with df.iloc[0].eq('R'), which is a Series. Turns out that if we just & those two together, it will align the Series index with the columns of the mask in step 1.
A    False
B     True
C    False
D     True
E    False
F    False
Name: 0, dtype: bool
# &
       A      B      C      D      E      F
0  False   True   True   True   True   True
1   True   True   True  False   True   True
2   True   True   True   True   True   True
3  False   True   True   True   True   True
4  False  False   True  False   True   True
# GIVES YOU
       A      B      C      D      E      F
0  False   True  False   True  False  False
1  False   True  False  False  False  False
2  False   True  False   True  False  False
3  False   True  False   True  False  False
4  False  False  False  False  False  False
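The alignment goes by label, not by position. A tiny sketch with made-up data (cells and cols are hypothetical names) where the Series is deliberately listed in a different order than the columns:

```python
import pandas as pd

# Hypothetical 2x2 boolean frame.
cells = pd.DataFrame({"A": [True, False], "B": [True, True]})

# Per-column flags, listed in a different order than the columns.
cols = pd.Series({"B": True, "A": False})

# & matches the Series index against the column labels,
# then applies the flag to every row of that column.
combined = cells & cols
```

Column A is switched off in every row because its flag is False, while column B keeps its original values.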
Finally, we want to exclude the first two rows from these shenanigans so...
mask.iloc[:2] = False
Try with:
mask = df.iloc[0] == 'R'
df.loc[2:, mask] = df.loc[2:, mask].where(df.loc[2:, mask] == 'x')
Output:
   A    B    C    D    E    F
0  x    R    i    R  NaN    h
1  z    g    j    x    a  NaN
2  z  NaN  NaN  NaN  NaN  NaN
3  x  NaN  NaN  NaN  NaN  NaN
4  x    x    h    x    s    f
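For reference, a runnable sketch of the where-based approach with the column selector written as df.iloc[0] == 'R', which is the condition stated in the question. The frame construction, dtype=object, and the names cols and block are assumptions added for illustration:

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame(
    {
        "A": ["x", "z", "z", "x", "x"],
        "B": ["R", "g", "h", "g", "x"],
        "C": ["i", "j", np.nan, np.nan, "h"],
        "D": ["R", "x", "y", np.nan, "x"],
        "E": [np.nan, "a", np.nan, np.nan, "s"],
        "F": ["h", np.nan, np.nan, np.nan, "f"],
    },
    dtype=object,
)

# Columns whose first row is 'R' (B and D here).
cols = df.iloc[0] == "R"

# Rows 2 onwards of those columns.
block = df.loc[2:, cols]

# where() keeps cells equal to 'x' and turns the rest into NaN;
# assigning back through loc changes df in place.
df.loc[2:, cols] = block.where(block == "x")
```

Unlike df.mask(mask), this version writes into df directly, so keep a copy first if the original values are still needed.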