
Replacing NaNs in a dataframe with a string value

I want to replace the missing values in one column of my df with "missing value". I tried:

result['emp_title'].fillna('missing')

or

result['emp_title'] = result['emp_title'].replace({ np.nan:'missing'})

The second one works, since when I count the missing values after running it:

result['emp_title'].isnull().sum()

it gives me 0. However, the first one does not work as I expected: instead of 0, I still get the previous count of missing values. Why doesn't the first one work? Thank you!

Pumpkin C asked Jan 03 '23 13:01


1 Answer

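By default, fillna returns a new object and leaves the original untouched, so you need to either fill in place or assign the result back: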

result['emp_title'].fillna('missing', inplace=True)

or

result['emp_title'] = result['emp_title'].fillna('missing') 
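To make the difference concrete, here is a minimal sketch on a made-up frame. The `result` DataFrame and its `emp_title` values are hypothetical stand-ins for the asker's data, not the real thing. It only exercises the assignment form, since, as far as I know, recent pandas versions may warn about (or ignore) `inplace=True` on a column selected with `result['emp_title']`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `result` frame.
result = pd.DataFrame({"emp_title": ["Engineer", np.nan, "Teacher", None]})

before = int(result["emp_title"].isnull().sum())

# Without assignment, the filled Series is returned and then thrown away.
result["emp_title"].fillna("missing")
after_discard = int(result["emp_title"].isnull().sum())

# Assigning the returned Series back stores the filled values in the frame.
result["emp_title"] = result["emp_title"].fillna("missing")
after_assign = int(result["emp_title"].isnull().sum())

print(before, after_discard, after_assign)
```

The first two counts stay the same and only the third drops to zero, which matches what the question describes.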

MCVE:

In [1697]: df = pd.DataFrame({'Col1' : [1, 2, 3, np.nan, 4, 5, np.nan]})

In [1702]: df.fillna('missing'); df # changes not seen in the original
Out[1702]: 
   Col1
0   1.0
1   2.0
2   3.0
3   NaN
4   4.0
5   5.0
6   NaN

In [1703]: df.fillna('missing', inplace=True); df
Out[1703]: 
      Col1
0        1
1        2
2        3
3  missing
4        4
5        5
6  missing

If you are applying fillna to a slice, be aware that inplace=True will not update the original DataFrame. Instead, use df.loc/iloc and assign the result back to the sub-slice:

In [1707]: df.Col1.iloc[:5].fillna('missing', inplace=True); df # doesn't work
Out[1707]: 
   Col1
0   1.0
1   2.0
2   3.0
3   NaN
4   4.0
5   5.0
6   NaN

In [1709]: df.Col1.iloc[:5] = df.Col1.iloc[:5].fillna('missing')

In [1710]: df
Out[1710]: 
      Col1
0        1
1        2
2        3
3  missing
4        4
5        5
6      NaN
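As a hedged variation on the slice assignment above: chained forms such as `df.Col1.iloc[:5] = ...` may warn or fail to take effect depending on the pandas version, so a single `.loc` call on the frame itself is usually the safer spelling. The sketch below uses hypothetical string data (an assumption, chosen so the fill value has the same type as the column) and fills only the first three rows.

```python
import pandas as pd

# Hypothetical data; strings are used so the fill value matches the column's type.
df = pd.DataFrame({"Col1": ["a", "b", None, "d", None]})

# Row labels of the sub-slice to fill (here, the first three rows).
rows = df.index[:3]

# One indexing call on the frame itself, with the filled values assigned back.
df.loc[rows, "Col1"] = df.loc[rows, "Col1"].fillna("missing")

print(df)
```

Only the missing value inside the selected rows is replaced; the one in the last row is left alone.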
cs95 answered Jan 14 '23 19:01