I have two DataFrames in Python pandas. Let's say the first one is df1. The id column is not necessarily sorted.
    id  B  C
0    1  5  1
1    1  5  1
2    1  6  1
3    1  7  1
4    2  5  1
5    2  6  1
6    2  6  1
7    3  7  1
8    3  7  1
9    4  6  1
10   4  7  1
11   4  7  1
The second DataFrame, df2, has a single column with the unique values of id:
   id
0   1
1   2
2   3
3   4
I want to calculate the min, max and average of column B for each id and add them to the second DataFrame. The result would look like this:
   id  min  max   avg
0   1    5    7  5.75
1   2   ..
2   3   ..
3   4   ..
In this example, I was able to reproduce it by calculating the values for each id manually. That was not a problem, since the example has only 4 ids, but my real data has more than 1000 ids. Is there an automatic way to do it?
Use the agg function on the groups:
In [96]: df1.groupby('id')['B'].agg(['min', 'max', 'mean'])
Out[96]:
    min  max      mean
id
1     5    7  5.750000
2     5    6  5.666667
3     7    7  7.000000
4     6    7  6.666667
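To get these statistics into df2, as the question asks, one option is to merge the aggregated result back on id. The following is a minimal sketch, assuming pandas 0.25 or later (for named aggregation); the variable names stats and result are only illustrative, and the data is the example from the question.

```python
import pandas as pd

# Example data from the question
df1 = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4],
    "B":  [5, 5, 6, 7, 5, 6, 6, 7, 7, 6, 7, 7],
    "C":  [1] * 12,
})
df2 = pd.DataFrame({"id": [1, 2, 3, 4]})

# Named aggregation: each keyword becomes an output column name
stats = (
    df1.groupby("id")["B"]
       .agg(min="min", max="max", avg="mean")
       .reset_index()  # turn the id index back into a regular column
)

# A left merge keeps every id in df2, even one that has no rows in df1
result = df2.merge(stats, on="id", how="left")
print(result)
```

With a left merge, any id that appears in df2 but not in df1 would get NaN in the three new columns rather than being dropped.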