This might be a very simple problem, but I cannot find the solution: I want to add a new column "col_new" whose calculation depends on a group variable such as a groupID or a date. So depending on the groupID, the calculation should change.
Example:
   Year  col1  col2
0  2019    10     1
1  2019     4     2
2  2019    25     1
3  2018     3     1
4  2017    56     2
5  2017     3     2
- for Year = 2017: col_new = col1-col2
- for Year = 2018: col_new = col1+col2
- for Year = 2019: col_new = col1*col2
Also, I want to wrap this up in a for loop:
year = [2017, 2018, 2019]
for x in year:
    df["new_col"] = ................
import pandas as pd
import numpy as np
d = {'Year': [2019, 2019, 2019, 2018, 2017, 2017],
'col1': [10, 4, 25, 3, 56, 3],
'col2': [1, 2, 1, 1, 2, 2]}
df = pd.DataFrame(data=d) #the example dataframe
df = df.set_index("Year")
print(df)
      col1  col2
Year
2019    10     1
2019     4     2
2019    25     1
2018     3     1
2017    56     2
2017     3     2
Now I need something like:
- if 2017 then col1+col2
- if 2018 then col1-col2
- if 2019 then col1*col2
dict of operators
from operator import sub, add, mul
op = {2019: mul, 2018: add, 2017: sub}
df.assign(new_col=[op[t.Year](t.col1, t.col2) for t in df.itertuples()])
   Year  col1  col2  new_col
0  2019    10     1       10
1  2019     4     2        8
2  2019    25     1       25
3  2018     3     1        4
4  2017    56     2       54
5  2017     3     2        1
If Year is in the index
df.assign(new_col=[op[t.Index](t.col1, t.col2) for t in df.itertuples()])
      col1  col2  new_col
Year
2019    10     1       10
2019     4     2        8
2019    25     1       25
2018     3     1        4
2017    56     2       54
2017     3     2        1
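Since the question asks for a for loop, here is a minimal sketch of one way to do it with the same op dict. It assumes Year is still a regular column and that the index is unique (the default RangeIndex here). Each operator is applied once per year on whole columns instead of once per row. The names parts and rows are only illustrative.

```python
from operator import add, mul, sub

import pandas as pd

df = pd.DataFrame({
    "Year": [2019, 2019, 2019, 2018, 2017, 2017],
    "col1": [10, 4, 25, 3, 56, 3],
    "col2": [1, 2, 1, 1, 2, 2],
})

# Same mapping as above: one operator per year.
op = {2019: mul, 2018: add, 2017: sub}

parts = []  # one result Series per year (illustrative name)
for year, func in op.items():
    rows = df[df["Year"] == year]          # rows belonging to this year
    parts.append(func(rows["col1"], rows["col2"]))

# Assignment aligns on the index, so each value lands back on its own row.
# This relies on the index being unique.
df["new_col"] = pd.concat(parts)
print(df)
```

Rows whose Year has no entry in op would end up as NaN, so it may be worth checking for missing values afterwards.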
You can use numpy.select
cond = [df.index == 2017, df.index == 2018, df.index == 2019]
choice = [df.col1+df.col2, df.col1-df.col2, df.col1*df.col2]
df['new'] = np.select(cond, choice)
      col1  col2  new
Year
2019    10     1   10
2019     4     2    8
2019    25     1   25
2018     3     1    2
2017    56     2   58
2017     3     2    5
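One thing to watch with numpy.select: rows that match none of the conditions get the default value, which is 0 unless you say otherwise. A small sketch, assuming you would rather see NaN for years without a rule (2019 is deliberately left out of cond here purely to illustrate this):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Year": [2019, 2019, 2019, 2018, 2017, 2017],
    "col1": [10, 4, 25, 3, 56, 3],
    "col2": [1, 2, 1, 1, 2, 2],
}).set_index("Year")

# Only 2017 and 2018 have a rule; 2019 is omitted for illustration.
cond = [df.index == 2017, df.index == 2018]
choice = [df.col1 + df.col2, df.col1 - df.col2]

# default=np.nan marks unmatched rows as missing instead of 0.
df["new"] = np.select(cond, choice, default=np.nan)
print(df)
```

Note that the column becomes float because NaN cannot be stored in an integer column.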