Pandas: How to calculate new column based on index or groupID?

Question

This might be a very simple problem, but I cannot find the solution: I want to add a new column "col_new" whose calculation depends on a group variable such as a group ID or a date. In other words, the operation should change depending on the group ID.
Example:

   Year  col1  col2
0  2019    10     1
1  2019     4     2
2  2019    25     1
3  2018     3     1
4  2017    56     2
5  2017     3     2

- for Year = 2017: col_new = col1-col2
- for Year = 2018: col_new = col1+col2
- for Year = 2019: col_new = col1*col2
I would also like to wrap this up in a for loop:

year = [2017, 2018, 2019]
for x in year:
    df["new_col"] = ................

What I tried:
- if statements: these always require an else branch, so each iteration overwrites the values set by the previous one.
- .loc: this works, but it becomes very hard to manage with long and complex conditions.
- Setting Year as the index: this is easy to do, but then I am stuck.

import pandas as pd
import numpy as np

d = {'Year': [2019, 2019, 2019, 2018, 2017, 2017],
     'col1': [10, 4, 25, 3, 56, 3],
     'col2': [1, 2, 1, 1, 2, 2]}
df = pd.DataFrame(data=d) #the example dataframe
df = df.set_index("Year")
print(df)

      col1  col2
Year            
2019    10     1
2019     4     2
2019    25     1
2018     3     1
2017    56     2
2017     3     2

Now I need something like:
- if 2017 then col1+col2
- if 2018 then col1-col2
- if 2019 then col1*col2

piRSquared · Accepted Answer

`dict` of operators

from operator import sub, add, mul

op = {2019: mul, 2018: add, 2017: sub}

df.assign(new_col=[op[t.Year](t.col1, t.col2) for t in df.itertuples()])

   Year  col1  col2  new_col
0  2019    10     1       10
1  2019     4     2        8
2  2019    25     1       25
3  2018     3     1        4
4  2017    56     2       54
5  2017     3     2        1

If Year is in the index

df.assign(new_col=[op[t.Index](t.col1, t.col2) for t in df.itertuples()])

      col1  col2  new_col
Year                     
2019    10     1       10
2019     4     2        8
2019    25     1       25
2018     3     1        4
2017    56     2       54
2017     3     2        1
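The question also asks for a for loop. As a rough sketch (not part of the original answer), the same `op` dictionary can drive a loop that applies each operator to whole column slices through a boolean mask, instead of going row by row with `itertuples`. Initialising the column with NaN is an assumption made here so that years missing from `op` stay empty. It also makes the resulting column float.

```python
from operator import add, mul, sub

import pandas as pd

df = pd.DataFrame({
    "Year": [2019, 2019, 2019, 2018, 2017, 2017],
    "col1": [10, 4, 25, 3, 56, 3],
    "col2": [1, 2, 1, 1, 2, 2],
})

# Same mapping as in the answer above: one operator per year.
op = {2019: mul, 2018: add, 2017: sub}

# Assumption: start with NaN so rows whose year is not in `op` remain NaN.
df["new_col"] = float("nan")

for year, func in op.items():
    mask = df["Year"] == year  # rows belonging to this year
    # Apply the operator to the selected slices of both columns at once.
    df.loc[mask, "new_col"] = func(df.loc[mask, "col1"], df.loc[mask, "col2"])

print(df)
```

If Year is the index rather than a column, the mask would be `df.index == year` instead.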

Vaishali · Answer

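You can use numpy.select, which takes a list of conditions and a matching list of choices and, for each row, picks the value from the first choice whose condition is true. Here Year is the index, and the operations follow the second list in the question (2017 adds, 2018 subtracts, 2019 multiplies):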

cond = [df.index == 2017, df.index == 2018, df.index == 2019]
choice = [df.col1+df.col2, df.col1-df.col2, df.col1*df.col2]
df['new'] = np.select(cond, choice)

      col1  col2  new
Year                 
2019    10     1   10
2019     4     2    8
2019    25     1   25
2018     3     1    2
2017    56     2   58
2017     3     2    5
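One detail worth knowing: when no condition matches a row, `np.select` fills it with its `default` argument, which is 0 unless you say otherwise. The sketch below is self-contained and passes `default=np.nan` so that unmatched rows are easy to spot. The extra 2016 row is hypothetical, added only to show the default, and is not part of the question's data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    # The last row (2016) is a hypothetical addition with no matching rule.
    "Year": [2019, 2019, 2019, 2018, 2017, 2017, 2016],
    "col1": [10, 4, 25, 3, 56, 3, 7],
    "col2": [1, 2, 1, 1, 2, 2, 3],
}).set_index("Year")

# One boolean array per year, and one candidate result per condition.
cond = [df.index == 2017, df.index == 2018, df.index == 2019]
choice = [df.col1 + df.col2, df.col1 - df.col2, df.col1 * df.col2]

# Rows matching none of the conditions get NaN instead of the default 0.
df["new"] = np.select(cond, choice, default=np.nan)

print(df)
```

Because NaN is a float, the new column becomes float. An integer sentinel could be used instead if an integer column is preferred.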

Tags: indexing, pandas, calculated-columns

Martin Flower
