
Using fillna method on multiple columns of a Pandas DataFrame failed

Tags: python, pandas, na

Why would this operation fail? For example:

import numpy as np
import pandas as pd

a = pd.DataFrame({'a': [1, 2, np.nan, np.nan],
                  'b': [5, np.nan, 6, np.nan],
                  'c': [5, 1, 5, 2]})


a[['a', 'b']].fillna(0, inplace=True)

It gave me this warning:

SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

But a still contained the same NaN values as before. However, if I call .fillna() on each column separately, there is no issue. How can I fill NaN values in multiple columns in one shot?

James Wong asked Apr 27 '17 03:04



2 Answers

These answers are guided by the fact that the OP wanted an in-place edit of an existing dataframe. Usually, I would just overwrite the existing dataframe with a new one.
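For context on why the call in the question had no effect: selecting with a list of column labels builds a new DataFrame, so an in-place fillna fills that new object and never touches a. Here is a minimal sketch of that behaviour, using the question's data; the names subset, nans_in_subset and nans_in_a are only for illustration.

```python
import warnings

import numpy as np
import pandas as pd

a = pd.DataFrame({'a': [1, 2, np.nan, np.nan],
                  'b': [5, np.nan, 6, np.nan],
                  'c': [5, 1, 5, 2]})

# Selecting with a list of labels builds a new DataFrame.
subset = a[['a', 'b']]

with warnings.catch_warnings():
    # Older pandas versions may emit SettingWithCopyWarning here.
    warnings.simplefilter('ignore')
    subset.fillna(0, inplace=True)  # fills the new object, not `a`

# Count the remaining NaNs in the new object and in the original.
nans_in_subset = int(subset.isna().sum().sum())
nans_in_a = int(a[['a', 'b']].isna().sum().sum())
print(nans_in_subset, nans_in_a)
```

The subset ends up with no NaNs while a keeps all four of its own, which is why every option below either calls fillna on a itself or assigns the filled result back into a.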


Use pandas.DataFrame.fillna with a dict

Pandas fillna allows us to pass a dictionary that specifies which columns will be filled in and with what.

So this will work:

a.fillna({'a': 0, 'b': 0})

     a    b  c
0  1.0  5.0  5
1  2.0  0.0  1
2  0.0  6.0  5
3  0.0  0.0  2

An in-place edit is possible with:

a.fillna({'a': 0, 'b': 0}, inplace=True)

NOTE: I would have just reassigned the result instead: a = a.fillna({'a': 0, 'b': 0})

It doesn't save any typing, but we could get cute using dict.fromkeys:

a.fillna(dict.fromkeys(['a', 'b'], 0), inplace=True)
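As a self-contained sketch of the dict form, the snippet below rebuilds the question's frame and fills it without inplace, so the original stays untouched. The second call, with a different fill value per column, is an illustrative assumption that goes beyond what the OP asked for.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'a': [1, 2, np.nan, np.nan],
                  'b': [5, np.nan, 6, np.nan],
                  'c': [5, 1, 5, 2]})

# dict.fromkeys maps every listed column to the same fill value.
fill_values = dict.fromkeys(['a', 'b'], 0)   # {'a': 0, 'b': 0}

# Without inplace=True, fillna returns a new DataFrame.
result = a.fillna(fill_values)

# A plain dict also allows a different fill value per column.
per_column = a.fillna({'a': 0, 'b': -1})

print(result)
print(per_column)
```

Columns that are not keys of the dict (here c) are left alone, which is what makes this form a good fit for filling only some columns.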

loc

We can use the same format as the OP, but assign the filled result back into the right columns using loc:

a.loc[:, ['a', 'b']] = a[['a', 'b']].fillna(0)

a

     a    b  c
0  1.0  5.0  5
1  2.0  0.0  1
2  0.0  6.0  5
3  0.0  0.0  2

pandas.DataFrame.update

This method exists specifically for in-place edits: it overwrites values in the calling dataframe with the non-null values of another dataframe.

a.update(a[['a', 'b']].fillna(0))

a

     a    b  c
0  1.0  5.0  5
1  2.0  0.0  1
2  0.0  6.0  5
3  0.0  0.0  2
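To make the "non-null values" part concrete, here is a small sketch with two made-up frames (target and other are hypothetical names, not from the question). It assumes the default overwrite=True behaviour of update.

```python
import numpy as np
import pandas as pd

target = pd.DataFrame({'x': [1.0, np.nan, 3.0]})
other = pd.DataFrame({'x': [np.nan, 20.0, 30.0]})

# update modifies `target` in place and returns None.
# NaN entries in `other` are skipped; non-null entries overwrite.
returned = target.update(other)

print(target['x'].tolist())
```

Row 0 keeps its original value because other holds NaN there, while rows 1 and 2 take the values from other. In the answer above, the frame passed to update has already been filled, so every previously missing cell in a and b receives 0.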

Iterate column by column

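I really don't like this approach because it is unnecessarily verbose. Note that calling fillna(0, inplace=True) on a[col] is itself chained assignment: recent pandas versions warn about it, and with Copy-on-Write enabled it leaves a unchanged, so assign the result back instead.

for col in ['a', 'b']:
    a[col] = a[col].fillna(0)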

a

     a    b  c
0  1.0  5.0  5
1  2.0  0.0  1
2  0.0  6.0  5
3  0.0  0.0  2

fillna with a dataframe

Use the result of a[['a', 'b']].fillna(0) as the input for another fillna. In my opinion, this is silly. Just use the first option.

a.fillna(a[['a', 'b']].fillna(0), inplace=True)

a

     a    b  c
0  1.0  5.0  5
1  2.0  0.0  1
2  0.0  6.0  5
3  0.0  0.0  2
piRSquared answered Oct 03 '22 14:10


EDIT: As @piRSquared pointed out, the first solution should be

a.loc[:, ['a', 'b']] = a[['a', 'b']].fillna(0)

to fill NaN values in the selected columns only, or

a.fillna(0, inplace=True)

to fill NaN values in all the columns.

Vaishali answered Oct 03 '22 14:10