
How to select rows and replace some columns in pandas

Tags: python, pandas

import numpy as np
import pandas as pd

dic = {'A': [np.nan, 4, np.nan, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]}
df = pd.DataFrame(dic)
df

Suppose I have the following data:

     A  B   C
0   NaN 9   0
1   4.0 2   0
2   NaN 5   5
3   4.0 3   3

I want to select the rows where column A is NaN and replace column B's value with np.nan, as follows:

    A   B   C
0   NaN NaN 0
1   4.0 2.0 0
2   NaN NaN 5
3   4.0 3.0 3

I tried df[df.A.isna()]["B"] = np.nan, but it didn't work.
According to this page, I should select the data with df.iloc. The problem is that if df has many rows, I can't select them by typing in their indices.

Dawei asked Dec 05 '22 12:12


1 Answer

Option 1
You were pretty close actually. Use pd.Series.isnull on A and assign values to B using df.loc:

df.loc[df.A.isnull(), 'B'] = np.nan
df

     A    B  C
0  NaN  NaN  0
1  4.0  2.0  0
2  NaN  NaN  5
3  4.0  3.0  3
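
To illustrate why the original attempt had no effect while df.loc works, here is a minimal sketch. It rests on two assumptions that are not part of the answer above: column B is cast to float first, because some pandas versions warn about (or reject) writing NaN into an integer column, and warnings are silenced so the chained assignment runs quietly. The names mask, b_after_chained and b_after_loc are illustrative only.

```python
import warnings

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [np.nan, 4, np.nan, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]})

# Assumption: cast B to float up front so it can hold NaN without an upcast.
df['B'] = df['B'].astype('float64')

mask = df['A'].isna()  # True for the rows where A is NaN

# Chained indexing: df[mask] builds a temporary copy, so the assignment
# lands on that copy and df itself stays unchanged.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # pandas warns about chained assignment
    df[mask]['B'] = np.nan

b_after_chained = df['B'].tolist()  # B is still the original values

# A single .loc call selects rows and column together and writes into df.
df.loc[mask, 'B'] = np.nan
b_after_loc = df['B'].tolist()

print(b_after_chained)
print(b_after_loc)
```

The boolean mask is what removes the need to type row indices by hand, which was the concern with df.iloc in the question.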

Option 2
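Use np.where to rebuild column B, taking np.nan where A is null and keeping the existing value of B everywhere else: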

df['B'] = np.where(df.A.isnull(), np.nan, df.B)
df

     A    B  C
0  NaN  NaN  0
1  4.0  2.0  0
2  NaN  NaN  5
3  4.0  3.0  3
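
As a rough sketch, the np.where approach can be wrapped in a small helper that returns a new frame instead of modifying the input. The function name blank_where_null and its parameters are hypothetical, introduced here only for illustration. Since np.nan is a float, the rebuilt column comes back as float64, which is why B prints as 2.0 and 3.0 in the output above.

```python
import numpy as np
import pandas as pd


def blank_where_null(frame, test_col, target_col):
    """Return a copy of frame with target_col set to NaN where test_col is null.

    Hypothetical helper, not part of pandas.
    """
    out = frame.copy()
    # np.where picks np.nan where the condition holds, else the existing value.
    out[target_col] = np.where(out[test_col].isnull(), np.nan, out[target_col])
    return out


df = pd.DataFrame({'A': [np.nan, 4, np.nan, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]})
result = blank_where_null(df, 'A', 'B')

print(result)
print(result['B'].dtype)
```

Because the helper works on a copy, the original df keeps its integer column B.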
cs95 answered Dec 08 '22 02:12