How to sum and to mean one DataFrame to create another DataFrame

Question

After creating a DataFrame with some duplicated values in the Name column:

import pandas as pd

df = pd.DataFrame({'Name': ['Will', 'John', 'John', 'John', 'Alex'],
                   'Payment': [15, 10, 10, 10, 15],
                   'Duration': [30, 15, 15, 15, 20]})

I would like to create another DataFrame in which the duplicated values in the Name column are consolidated, leaving no duplicates. At the same time, I want to sum the payments John made. I proceed with:

df_sum = df.groupby('Name').sum().reset_index()

But since calling sum() on the grouped DataFrame applies the sum function to every column, the Duration column (the duration of the visit in minutes) is summed as well. Instead, I would like to get the average value for the Duration column. So I would need to use the mean() method, like so:

df_mean = df.groupby('Name').mean().reset_index()

But with mean(), the Payment column now shows the average payment John made, not the sum of all his payments.
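A minimal sketch that reproduces both attempts on the sample data, so the two results can be compared as text. It assumes the default grouping axis is intended, so the redundant axis=0 argument is left out.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Will', 'John', 'John', 'John', 'Alex'],
                   'Payment': [15, 10, 10, 10, 15],
                   'Duration': [30, 15, 15, 15, 20]})

# sum() is applied to every non-grouping column:
# John's Payment becomes 30 (wanted), but Duration becomes 45 (not wanted).
df_sum = df.groupby('Name').sum().reset_index()

# mean() is also applied to every non-grouping column:
# John's Duration becomes 15.0 (wanted), but Payment becomes 10.0 (not wanted).
df_mean = df.groupby('Name').mean().reset_index()

print(df_sum)
print(df_mean)
```

Neither call alone gives a summed Payment together with an averaged Duration, which is the gap the question asks about.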

How can I create a DataFrame where the Duration values show the average while the Payment values show the sum?

ayhan · Accepted Answer

You can apply different functions to different columns with groupby.agg:

df.groupby('Name').agg({'Duration': 'mean', 'Payment': 'sum'})
Out: 
      Payment  Duration
Name                   
Alex       15        20
John       30        15
Will       15        30
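As a variation on the same idea, here is a sketch using named aggregation, which assumes pandas 0.25 or later. The result name df_agg is only illustrative, and as_index=False is used so that Name stays a regular column, matching the reset_index() step in the question.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Will', 'John', 'John', 'John', 'Alex'],
                   'Payment': [15, 10, 10, 10, 15],
                   'Duration': [30, 15, 15, 15, 20]})

# Named aggregation: output_column=(input_column, function).
# The keyword order fixes the column order of the result.
df_agg = df.groupby('Name', as_index=False).agg(
    Payment=('Payment', 'sum'),      # total paid per person
    Duration=('Duration', 'mean'),   # average visit length per person
)

print(df_agg)
```

On recent pandas versions the mean is returned as a float, so Duration may display as 20.0, 15.0 and 30.0 rather than the integers shown in the output above.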

Tags: python, pandas, dataframe, alphanumeric
