Why would this operation fail? For example:
import numpy as np
import pandas as pd
a = pd.DataFrame({'a': [1, 2, np.nan, np.nan],
                  'b': [5, np.nan, 6, np.nan],
                  'c': [5, 1, 5, 2]})
a[['a', 'b']].fillna(0, inplace=True)
This gave me the following warning:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
But a was still filled with NAs as before. However, if I call .fillna() on each column separately, there is no issue. How can I fill NA values on multiple columns in one shot?
We can use the fillna() function to fill the missing values of a data frame column by column, using a dictionary that maps each column to its fill value. The limitation of this approach is that each column is filled with a single constant value.
The fillna() method replaces NULL (NaN) values with a specified value. It returns a new DataFrame object unless the inplace parameter is set to True, in which case it does the replacing in the original DataFrame instead.
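As a minimal sketch of the difference described above, using the data from the question (the frame is called df here, and the counter names are only for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, np.nan, np.nan],
                   'b': [5, np.nan, 6, np.nan],
                   'c': [5, 1, 5, 2]})

# Without inplace, fillna returns a new DataFrame and leaves df untouched.
filled = df.fillna(0)
remaining_in_original = int(df.isna().sum().sum())    # NaNs still in df
remaining_in_result = int(filled.isna().sum().sum())  # NaNs in the returned frame

# With inplace=True, df itself is modified.
df.fillna(0, inplace=True)
remaining_after_inplace = int(df.isna().sum().sum())
```

Here the original keeps its four NaNs until the inplace call is made on df itself.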
replace() on all columns: you can also use df.replace(np.nan, 0) to replace all NaN values with zero.
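A short sketch of the replace() variant on the question's data, assuming a frame named df. For this particular frame it should give the same result as fillna(0):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, np.nan, np.nan],
                   'b': [5, np.nan, 6, np.nan],
                   'c': [5, 1, 5, 2]})

# replace() swaps every NaN for 0 across all columns and returns a new DataFrame.
replaced = df.replace(np.nan, 0)

# Compare against fillna(0) on the same data.
same_as_fillna = replaced.equals(df.fillna(0))
remaining_nans = int(replaced.isna().sum().sum())
```

Note that this touches every column, so it is not a way to restrict the fill to 'a' and 'b' only.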
These answers are guided by the fact that OP wanted an in place edit of an existing dataframe. Usually, I overwrite the existing dataframe with a new one.
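To see why the call in the question leaves a unchanged, here is a rough sketch. The variable subset is introduced only for illustration; it stands in for the temporary object that a[['a', 'b']] creates.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'a': [1, 2, np.nan, np.nan],
                  'b': [5, np.nan, 6, np.nan],
                  'c': [5, 1, 5, 2]})

# Selecting a list of columns builds a new DataFrame, separate from `a`.
subset = a[['a', 'b']]
subset.fillna(0, inplace=True)  # fills the separate object (pandas may warn here)

nans_in_subset = int(subset.isna().sum().sum())
nans_in_a = int(a.isna().sum().sum())  # `a` still has its NaNs

# Overwriting the name with the returned frame avoids the problem.
a = a.fillna({'a': 0, 'b': 0})
nans_after_overwrite = int(a.isna().sum().sum())
```

The in-place fill lands on the copy, which is what the warning is pointing at.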
pandas.DataFrame.fillna with a dict
Pandas fillna allows us to pass a dictionary that specifies which columns will be filled in and with what.
So this will work
a.fillna({'a': 0, 'b': 0})
     a    b  c
0  1.0  5.0  5
1  2.0  0.0  1
2  0.0  6.0  5
3  0.0  0.0  2
An in-place edit is possible with:
a.fillna({'a': 0, 'b': 0}, inplace=True)
NOTE: I would have just done this: a = a.fillna({'a': 0, 'b': 0})
It doesn't save any typing, but we could get cute using dict.fromkeys:
a.fillna(dict.fromkeys(['a', 'b'], 0), inplace=True)
loc
We can use the same format as the OP but place it in the correct columns using loc
a.loc[:, ['a', 'b']] = a[['a', 'b']].fillna(0)
a
     a    b  c
0  1.0  5.0  5
1  2.0  0.0  1
2  0.0  6.0  5
3  0.0  0.0  2
pandas.DataFrame.update
update is designed to edit a dataframe in place using the non-null values of another dataframe.
a.update(a[['a', 'b']].fillna(0))
a
     a    b  c
0  1.0  5.0  5
1  2.0  0.0  1
2  0.0  6.0  5
3  0.0  0.0  2
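A small sketch of how update treats null and non-null values, assuming the same data as above. The second frame, patch, is hypothetical and only there to show that NaN entries in the other dataframe are skipped:

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'a': [1, 2, np.nan, np.nan],
                  'b': [5, np.nan, 6, np.nan],
                  'c': [5, 1, 5, 2]})

# Fill the two columns in a separate frame, then write it back in place.
a.update(a[['a', 'b']].fillna(0))
col_a_after_fill = a['a'].tolist()
col_b_after_fill = a['b'].tolist()

# Only non-null values of the other frame are used: the NaNs in `patch`
# leave the existing values of `a` alone, while the 9 overwrites row 1.
patch = pd.DataFrame({'a': [np.nan, 9, np.nan, np.nan]})
a.update(patch)
col_a_after_patch = a['a'].tolist()
col_c = a['c'].tolist()  # never part of either update
```

Column 'c' is not in either frame passed to update, so it is left as it was.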
I really don't like this approach because it is unnecessarily verbose:
for col in ['a', 'b']:
    # assign the result back; a[col].fillna(0, inplace=True) can act on a temporary copy in recent pandas
    a[col] = a[col].fillna(0)
a
     a    b  c
0  1.0  5.0  5
1  2.0  0.0  1
2  0.0  6.0  5
3  0.0  0.0  2
fillna with a dataframe
Use the result of a[['a', 'b']].fillna(0) as the input for another fillna. In my opinion, this is silly. Just use the first option.
a.fillna(a[['a', 'b']].fillna(0), inplace=True)
a
     a    b  c
0  1.0  5.0  5
1  2.0  0.0  1
2  0.0  6.0  5
3  0.0  0.0  2
EDIT: As @piRSquared pointed out, the first solution should be
a.loc[:, ['a', 'b']] = a[['a', 'b']].fillna(0)
to fillna in selected columns
or
a.fillna(0, inplace = True)
to fillna in all the columns