I'm trying to create a new pandas dataframe column by subtracting one existing dataframe column from another. However, if the result is a negative number, the new column value should be set to zero.
import pandas as pd
data = {'A': [1,2,3], 'B': [3,2,1]}
df = pd.DataFrame(data)
In [4]: df
Out[4]:
A B
0 1 3
1 2 2
2 3 1
If I create a new dataframe column 'C' by subtracting 'B' from 'A', I get the right result.
df['C'] = df['A'] - df['B']
In [8]: df
Out[8]:
A B C
0 1 3 -2
1 2 2 0
2 3 1 2
However, if I use the max() function to avoid negative results, I get "ValueError: The truth value of a Series is ambiguous."
>>> df['C'] = max(df['A'] - df['B'], 0)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
The expected output is:
A B C
0 1 3 0
1 2 2 0
2 3 1 2
What am I doing wrong?
You need to use np.maximum to do an element-wise maximum comparison:
>>> import numpy as np
>>> np.maximum(df['A'] - df['B'], 0)
0 0
1 0
2 2
dtype: int64
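To get the expected output from the question, the result can be assigned straight back to the dataframe. The following is a minimal sketch using the question's data; the column name 'C_clip' is made up here purely to show a pandas-only alternative, Series.clip(lower=0), next to np.maximum.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [3, 2, 1]})

# Element-wise maximum of the difference and 0, stored as the new column
df['C'] = np.maximum(df['A'] - df['B'], 0)

# Alternative without NumPy: clip raises every value below 0 up to 0.
# 'C_clip' is a hypothetical column name used only for comparison.
df['C_clip'] = (df['A'] - df['B']).clip(lower=0)

print(df)
```

Both columns should hold the same values; which form to use is mostly a matter of taste.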
The problem with the built-in max is that it essentially checks (df['A'] - df['B']) > 0 and then needs a single True or False from that comparison. The comparison returns a Series of boolean values (not a single boolean), hence the error.
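The ambiguity can be reproduced without max at all. This is only an illustrative sketch with the question's data; the variable names diff, mask and error_message are chosen here for the example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [3, 2, 1]})
diff = df['A'] - df['B']

# Comparing a Series with a scalar gives one boolean per row, not a single bool
mask = diff > 0
print(mask.tolist())  # [False, False, True]

# Asking for the truth value of the whole Series is what raises the error
try:
    bool(mask)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The built-in max hits the same truth-value check internally when it tries to decide which of its two arguments is larger.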