Find min, max and average of an ID in Python Pandas

Tags: python, pandas

I have two DataFrames in pandas. The first one, df1, is shown below. The id column is not necessarily sorted.

   id  B  C
0   1  5  1
1   1  5  1
2   1  6  1
3   1  7  1
4   2  5  1
5   2  6  1
6   2  6  1
7   3  7  1
8   3  7  1
9   4  6  1
10  4  7  1
11  4  7  1

The second DataFrame, df2, has a single column with the unique values of id:

   id
0   1
1   2
2   3
3   4

I want to calculate the min, max and average of column B for each id and add them to the second DataFrame. The result should look like this:

   id  min  max  avg
0   1   5    7   5.75
1   2  ..
2   3  ..
3   4  ..

In this example, I was able to produce the result by calculating the values for each id manually. That was not a problem since the example has only 4 ids, but my real data has more than 1000 ids. Is there an automatic way to do it?

Tasos asked Oct 06 '15 08:10

1 Answer

Group by id and use the agg function on column B of each group:

In [96]: df1.groupby('id')['B'].agg(['min', 'max', 'mean'])  # pd.np was removed in pandas 2.0; string names work in all versions
Out[96]:
    min  max      mean
id
1     5    7  5.750000
2     5    6  5.666667
3     7    7  7.000000
4     6    7  6.666667
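
To get these values into df2 with the column names from the question, one option is to name the aggregations and merge on id. The following is only a sketch: it rebuilds df1 and df2 from the sample data in the question, assumes pandas 0.25 or newer (for keyword-style named aggregation), and the names stats and result are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4],
    "B":  [5, 5, 6, 7, 5, 6, 6, 7, 7, 6, 7, 7],
    "C":  [1] * 12,
})
df2 = pd.DataFrame({"id": [1, 2, 3, 4]})

# Named aggregation: each keyword becomes an output column name.
# 'stats' is a hypothetical helper name, not part of the original answer.
stats = (
    df1.groupby("id")["B"]
       .agg(min="min", max="max", avg="mean")
       .reset_index()  # turn the id index back into a regular column
)

# A left merge keeps every id listed in df2, even one missing from df1
# (such an id would get NaN in the statistic columns).
result = df2.merge(stats, on="id", how="left")
print(result)
```

The groupby handles an unsorted id column and scales to thousands of ids without any per-id loop.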

Zero answered Nov 02 '22 17:11