
set multiple column values to NaN based on condition

I'm trying to set the values in two columns of my dataframe to null based on a condition applied to one of those columns.

I know how to set the value of one column to null based on a condition; the example below does that for col3. My question is: how can I also set the value in col2 of the same rows to null?

import pandas as pd

df = pd.DataFrame([['a',1, 10],
                   ['b',2, 20],
                   ['c',3, 30],
                   ['d',4, 40],
                   ['e',5, 50]], columns=['col1','col2','col3'])

df
Out[121]: 
  col1  col2  col3
0    a     1    10
1    b     2    20
2    c     3    30
3    d     4    40
4    e     5    50

df['col3'].mask(df['col3']<30,inplace=True)

df
Out[123]: 
  col1  col2  col3
0    a     1   NaN
1    b     2   NaN
2    c     3  30.0
3    d     4  40.0
4    e     5  50.0

I tried the following, but it doesn't work:

df['col2','col3'].mask(df['col3']<30,inplace=True)

My desired output is:

  col1  col2  col3
0    a   NaN   NaN
1    b   NaN   NaN
2    c     3  30.0
3    d     4  40.0
4    e     5  50.0
nebulousman asked Oct 30 '19 21:10



1 Answer

You can use df.loc, documented here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html

It takes a boolean condition that selects the rows, plus a list of the columns you want to change.

Assign the NaN constant from numpy, documented here: https://docs.scipy.org/doc/numpy/reference/constants.html?highlight=nan#numpy.nan

import numpy as np

df.loc[df['col3']<30,['col2','col3']] = np.nan

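The resulting df is shown below. Note that col2 and col3 are now float columns: NaN is a float value, so pandas upcasts the integer columns to hold it.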

  col1  col2  col3
0    a   NaN   NaN
1    b   NaN   NaN
2    c   3.0  30.0
3    d   4.0  40.0
4    e   5.0  50.0
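For reference, here is a minimal self-contained sketch of the df.loc approach, using the data from the question. The names target_cols, low, as_float and as_int are made up for illustration. The first variant casts the columns to float before assigning NaN, which only makes the upcast explicit. The second variant is an optional alternative, assuming you would rather keep integers (closer to the desired output in the question): it uses pandas' nullable Int64 dtype with pd.NA instead of np.nan.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [['a', 1, 10], ['b', 2, 20], ['c', 3, 30], ['d', 4, 40], ['e', 5, 50]],
    columns=['col1', 'col2', 'col3'],
)

target_cols = ['col2', 'col3']  # illustrative name: the columns to blank out
low = df['col3'] < 30           # boolean row mask, computed before any change

# Variant 1: float columns holding NaN.
# astype returns a new frame, so df itself is left untouched.
as_float = df.astype({col: 'float64' for col in target_cols})
as_float.loc[low, target_cols] = np.nan

# Variant 2: nullable integer columns holding pd.NA (values stay integers).
as_int = df.astype({col: 'Int64' for col in target_cols})
as_int.loc[low, target_cols] = pd.NA

print(as_float)
print(as_int)
```

Computing the mask once up front keeps the condition independent of the assignment, which matters here because col3 is both the column being tested and one of the columns being overwritten.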
Michael Bridges answered Nov 19 '22 06:11
