
How to replace NaN values by Zeroes in a column of a Pandas Dataframe?

I have a Pandas Dataframe as below:

     itm                 Date  Amount
67   420  2012-09-30 00:00:00   65211
68   421  2012-09-09 00:00:00   29424
69   421  2012-09-16 00:00:00   29877
70   421  2012-09-23 00:00:00   30990
71   421  2012-09-30 00:00:00   61303
72   485  2012-09-09 00:00:00   71781
73   485  2012-09-16 00:00:00     NaN
74   485  2012-09-23 00:00:00   11072
75   485  2012-09-30 00:00:00  113702
76   489  2012-09-09 00:00:00   64731
77   489  2012-09-16 00:00:00     NaN

When I try to apply a function to the Amount column, I get the following error:

ValueError: cannot convert float NaN to integer 

I have tried applying a function using .isnan from the math module, the pandas .replace method, the .sparse data attribute from pandas 0.9, and an if NaN == NaN statement in a function. I have also looked at the article "How do I replace NA values with zeros in an R dataframe?" along with some other articles. None of the methods I have tried worked, or they did not recognise NaN. Any hints or solutions would be appreciated.

asked Nov 08 '12 18:11 by George Thompson


People also ask

Which method is used to replace NaN values with some value?

The fillna() method replaces missing values (NaN or NA) with a specified value.


2 Answers

I believe DataFrame.fillna() will do this for you.

See the pandas docs for DataFrame.fillna and for Series.fillna.

Example:

In [7]: df
Out[7]:
          0         1
0       NaN       NaN
1 -0.494375  0.570994
2       NaN       NaN
3  1.876360 -0.229738
4       NaN       NaN

In [8]: df.fillna(0)
Out[8]:
          0         1
0  0.000000  0.000000
1 -0.494375  0.570994
2  0.000000  0.000000
3  1.876360 -0.229738
4  0.000000  0.000000

To fill the NaNs in only one column, select just that column. In this case I'm using inplace=True to actually change the contents of df.

In [12]: df[1].fillna(0, inplace=True)
Out[12]:
0    0.000000
1    0.570994
2    0.000000
3   -0.229738
4    0.000000
Name: 1

In [13]: df
Out[13]:
          0         1
0       NaN  0.000000
1 -0.494375  0.570994
2       NaN  0.000000
3  1.876360 -0.229738
4       NaN  0.000000

EDIT:

To avoid a SettingWithCopyWarning, use the built-in column-specific functionality by passing a dict that maps column labels to fill values:

df.fillna({1:0}, inplace=True) 
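As a rough sketch of how the dict form might look on data shaped like the question's, the example below fills only the Amount column. The frame contents are a hypothetical subset invented for illustration, and the non-inplace form is used so the original frame can be compared afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical sample loosely mirroring the question's columns.
df = pd.DataFrame({
    "itm": [485, 485, 489],
    "Amount": [71781, np.nan, np.nan],
})

# Dict keys are column labels; only the listed columns are filled.
# Without inplace=True, fillna returns a new DataFrame and leaves df as-is.
filled = df.fillna({"Amount": 0})

print(filled)
```

Columns not named in the dict are left untouched, so NaNs elsewhere in the frame would survive.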
answered Oct 24 '22 18:10 by Aman


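Selecting a column is not guaranteed to return a view rather than a copy, so an in-place fillna on the selection may not change df. Assigning the result back to the column avoids the problem: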

df['column'] = df['column'].fillna(value) 
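Tying this back to the question's error, here is a minimal sketch, assuming the goal is an integer Amount column. Casting to an integer dtype fails while NaNs remain, so the fill comes first. The sample values are hypothetical, and the astype(int) step is an addition not present in the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row sample; Amount is float because it contains NaN.
df = pd.DataFrame({"itm": [485, 485], "Amount": [71781.0, np.nan]})

# Fill first, then cast, and assign the result back to the column.
df["Amount"] = df["Amount"].fillna(0).astype(int)

print(df.dtypes)
```

Assigning back works whether the column selection was a view or a copy, which is why it sidesteps the warning.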
answered Oct 24 '22 18:10 by rakesh