If I have the following dataframe:
date A B M S
20150101 8 7 7.5 0
20150101 10 9 9.5 -1
20150102 9 8 8.5 1
20150103 11 11 11 0
20150104 11 10 10.5 0
20150105 12 10 11 -1
...
If I want to create another column 'cost' by the following rules:
Currently, I am using the following function:
def cost(df):
    if df[3] < 0:
        return np.roll((df[2] - df[1]), 1) * df[3]
    elif df[3] > 0:
        return np.roll((df[2] - df[0]), 1) * df[3]
    else:
        return 0

df['cost'] = df.apply(cost, axis=0)
Is there any other way to do it? Can I somehow use the pandas shift function in user-defined functions? Thanks.
It's generally expensive to do it this way, because you lose the speed advantage of vectorized operations when you apply
a user-defined function. Instead, how about using np.where, the NumPy version of the ternary operator:
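import numpy as np

# Nested np.where acts as a vectorized if / elif / else.
np.where(df[3] < 0,
         np.roll(df[2] - df[1], 1) * df[3],  # times df[3], as in the question's function
         np.where(df[3] > 0,
                  np.roll(df[2] - df[0], 1) * df[3],
                  0))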
(Of course, assign the result to df['cost'].)
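On the question about shift: it can be applied to whole columns, which avoids the user-defined function entirely. The following is only a sketch under assumptions not stated in the thread: that date is the index, so the positional labels 0 to 3 in the question correspond to the columns A, B, M and S, and that the intended rule is "previous row's M - B (or M - A) times the current S". It uses np.select in place of nested np.where calls. Unlike np.roll, which wraps the last value around to the first row, shift(1) leaves NaN in the first row.

```python
import numpy as np
import pandas as pd

# Sample rows from the question. Treating date as the index and naming the
# columns A, B, M, S is an assumption about the asker's frame.
df = pd.DataFrame(
    {
        "A": [8, 10, 9, 11, 11, 12],
        "B": [7, 9, 8, 11, 10, 10],
        "M": [7.5, 9.5, 8.5, 11.0, 10.5, 11.0],
        "S": [0, -1, 1, 0, 0, -1],
    },
    index=pd.Index(
        [20150101, 20150101, 20150102, 20150103, 20150104, 20150105],
        name="date",
    ),
)

# Previous row's differences; the first row has no predecessor, so it is NaN.
prev_m_minus_b = (df["M"] - df["B"]).shift(1)
prev_m_minus_a = (df["M"] - df["A"]).shift(1)

# Vectorized if / elif / else: the first matching condition wins,
# and rows matching neither condition get the default.
df["cost"] = np.select(
    [df["S"] < 0, df["S"] > 0],
    [prev_m_minus_b * df["S"], prev_m_minus_a * df["S"]],
    default=0.0,
)

print(df)
```

If a row other than the first could have a non-zero S with no valid previous value, the NaN from shift would propagate into cost; whether to fill it with 0 depends on the intended rule.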