Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

aggregate values based on categories

I have a DataFrame like:

SK_ID_CURR  CREDIT_ACTIVE   CREDIT_DAY_OVERDUE

436084         Sold               0

436084         Active            951

436084         Sold               0

436084         Active             0

436084         Bad debt           0

436084         Active            936

436084         Active            951

I'd like to make new columns for each CREDIT_ACTIVE category, with the corresponding sum of values of CREDIT_DAY_OVERDUE.

The result should look like:

SK_ID_CURR  CREDIT_ACTIVE_OD  CREDIT_BAD_DEBT_OD CREDIT_ACTIVE_SOLD_OD

436084       2838                 0                 0
like image 396
hk_03 Avatar asked Mar 26 '26 06:03

hk_03


1 Answers

Use groupby and aggregate sum, last reshape by unstack:

df = (df.groupby(['SK_ID_CURR','CREDIT_ACTIVE'])['CREDIT_DAY_OVERDUE']
        .sum()
        .unstack(fill_value=0))

Or use pivot_table:

df = df.pivot_table(index='SK_ID_CURR',
                    columns='CREDIT_ACTIVE',
                    values='CREDIT_DAY_OVERDUE',
                    aggfunc='sum',
                    fill_value=0)

Then change columns names:

df.columns = ['CREDIT_{}_OD'.format(x.upper()) for x in df.columns]

And last create column from index:

df = df.reset_index()
print (df)
   SK_ID_CURR  CREDIT_ACTIVE_OD  CREDIT_BAD DEBT_OD  CREDIT_SOLD_OD
0      436084              2838                   0               0
like image 131
jezrael Avatar answered Mar 28 '26 20:03

jezrael



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!