How can I apply a function element-wise to a pandas DataFrame and pass a column-wise calculated value (e.g. the quantile of the column)? For example, what if I want to replace all elements in a DataFrame with NaN where the value is lower than the 80th percentile of the column?
    def _deletevalues(x, quantile):
        if x < quantile:
            return np.nan
        else:
            return x

    df.applymap(lambda x: _deletevalues(x, x.quantile(0.8)))
Using applymap only gives access to each value individually, so it (of course) throws an AttributeError: 'float' object has no attribute 'quantile'.
Thank you in advance.
You can replace values in all or selected columns of a pandas DataFrame based on a condition by using the DataFrame.loc[] property. loc[] accesses a group of rows and columns by label(s) or a boolean array, and it can be used to modify those values as well as read them.
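As a minimal sketch of the loc[] approach, the snippet below sets values in one column to NaN where a condition holds. The sample data, the column names, and the `threshold` value are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"a": [1, 8, 5], "b": [7, 2, 4]})

# NaN is a float, so convert the column first to avoid a dtype clash.
df["a"] = df["a"].astype(float)

# Hypothetical cutoff; rows where "a" is below it are set to NaN.
threshold = 5
df.loc[df["a"] < threshold, "a"] = np.nan
```

Only column "a" is touched here; column "b" keeps its original integer values.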
Suppose that you want to replace multiple values with multiple new values in an individual DataFrame column. In that case, you may use this template: df['column name'] = df['column name'].replace(['1st old value', '2nd old value', ...], ['1st new value', '2nd new value', ...])
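The template above might look like this in practice. The column name and the values are invented for the example.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"color": ["red", "green", "blue", "red"]})

# Old values and new values are matched by position:
# "red" -> "R", "green" -> "G"; anything not listed is left alone.
df["color"] = df["color"].replace(["red", "green"], ["R", "G"])
```

Values that do not appear in the first list, such as "blue" here, pass through unchanged.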
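Use DataFrame.mask. Note that df.quantile() defaults to q=0.5 (the median), which is what the output below shows; pass 0.8 to get the 80th percentile: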
    df = df.mask(df < df.quantile())
    print(df)
         a    b    c
    0  NaN  7.0  NaN
    1  NaN  NaN  6.0
    2  NaN  NaN  5.0
    3  8.0  NaN  NaN
    4  7.0  3.0  5.0
    5  6.0  7.0  NaN
    6  NaN  NaN  NaN
    7  8.0  4.0  NaN
    8  NaN  NaN  6.0
    9  7.0  7.0  6.0
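For the 80th percentile the question asks about, the same idea can be sketched as follows. It assumes the sample DataFrame shown elsewhere in this thread; the names `thresholds` and `result` are only illustrative.

```python
import pandas as pd

# Sample data matching the DataFrame shown in this thread.
df = pd.DataFrame({
    "a": [1, 1, 3, 8, 7, 6, 0, 8, 5, 7],
    "b": [7, 2, 0, 2, 3, 7, 2, 4, 0, 7],
    "c": [3, 6, 5, 1, 5, 2, 1, 1, 6, 6],
})

# One threshold per column: a Series indexed by column name.
thresholds = df.quantile(0.8)

# The comparison aligns the Series with the columns, so each value is
# compared against its own column's threshold; matches become NaN.
result = df.mask(df < thresholds)
```

Because the comparison is vectorized over the whole frame, no element-wise function is needed.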
    In [139]: df
    Out[139]:
       a  b  c
    0  1  7  3
    1  1  2  6
    2  3  0  5
    3  8  2  1
    4  7  3  5
    5  6  7  2
    6  0  2  1
    7  8  4  1
    8  5  0  6
    9  7  7  6
For all columns:

    In [145]: df.apply(lambda x: np.where(x < x.quantile(), np.nan, x))
    Out[145]:
         a    b    c
    0  NaN  7.0  NaN
    1  NaN  NaN  6.0
    2  NaN  NaN  5.0
    3  8.0  NaN  NaN
    4  7.0  3.0  5.0
    5  6.0  7.0  NaN
    6  NaN  NaN  NaN
    7  8.0  4.0  NaN
    8  NaN  NaN  6.0
    9  7.0  7.0  6.0
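If you would rather keep a named helper, as in the question, one possible sketch is to have the function take a whole column instead of a single value and pass the quantile level through apply. The helper name `delete_below_quantile` and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [10, 20, 30, 40, 50]})

def delete_below_quantile(col, q=0.8):
    """Return the column with values below its q-th quantile set to NaN."""
    threshold = col.quantile(q)
    # Series.where keeps values where the condition is True, NaN elsewhere.
    return col.where(col >= threshold)

# apply calls the helper once per column; extra keyword arguments
# (here q) are forwarded to it.
result = df.apply(delete_below_quantile, q=0.8)
```

Unlike applymap, apply hands the function a Series, so column-level statistics such as quantile are available inside it.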
Or, with boolean indexing:

    In [149]: df[df < df.quantile()] = np.nan

    In [150]: df
    Out[150]:
         a    b    c
    0  NaN  7.0  NaN
    1  NaN  NaN  6.0
    2  NaN  NaN  5.0
    3  8.0  NaN  NaN
    4  7.0  3.0  5.0
    5  6.0  7.0  NaN
    6  NaN  NaN  NaN
    7  8.0  4.0  NaN
    8  NaN  NaN  6.0
    9  7.0  7.0  6.0