pandas get minimum of one column in group when groupby another

Question

I have a pandas dataframe that looks like this:

      c     y
0     9     0
1     8     0
2     3     1
3     6     2
4     1     3
5     2     3
6     5     3
7     4     4
8     0     4
9     7     4

I'd like to groupby y and get the min and max of c so that my new dataframe would look like this:

      c     y     min   max
0     9     0     8     9
1     8     0     8     9
2     3     1     3     3   
3     6     2     6     6 
4     1     3     1     5
5     2     3     1     5
6     5     3     1     5
7     4     4     0     7
8     0     4     0     7
9     7     4     0     7

I tried using df['min'] = df.groupby(['y'])['c'].min() but that gave me some weird results. The first 175 rows were populated in the min column but then it went to NaN for all the rest. is that not how you're supposed to use the groupby method?

Zero · Accepted Answer

Option 1 Use transform

In [13]: dfc = df.groupby('y')['c']

In [14]: df.assign(min=dfc.transform(min), max=dfc.transform(max))
Out[14]:
   c  y  max  min
0  9  0    9    8
1  8  0    9    8
2  3  1    3    3
3  6  2    6    6
4  1  3    5    1
5  2  3    5    1
6  5  3    5    1
7  4  4    7    0
8  0  4    7    0
9  7  4    7    0

Or

In [15]: df['min' ] = dfc.transform('min')

In [16]: df['max' ] = dfc.transform('max')

Option 2 Use join and agg

In [30]: df.join(df.groupby('y')['c'].agg(['min', 'max']), on='y')
Out[30]:
   c  y  min  max
0  9  0    8    9
1  8  0    8    9
2  3  1    3    3
3  6  2    6    6
4  1  3    1    5
5  2  3    1    5
6  5  3    1    5
7  4  4    0    7
8  0  4    0    7
9  7  4    0    7

Option 3 Use merge and agg

In [28]: df.merge(df.groupby('y')['c'].agg(['min', 'max']), right_index=True, left_on='y')
Out[28]:
   c  y  min  max
0  9  0    8    9
1  8  0    8    9
2  3  1    3    3
3  6  2    6    6
4  1  3    1    5
5  2  3    1    5
6  5  3    1    5
7  4  4    0    7
8  0  4    0    7
9  7  4    0    7

piRSquared · Answer

With Numpy shenanigans

n = df.y.max() + 1
omax = np.ones(n, df.c.values.dtype) * df.c.values.min()
omin = np.ones(n, df.c.values.dtype) * df.c.values.max()
np.maximum.at(omax, df.y.values, df.c.values)
np.minimum.at(omin, df.y.values, df.c.values)

df.assign(min=omin[df.y], max=omax[df.y])

   c  y  min  max
0  9  0    8    9
1  8  0    8    9
2  3  1    3    3
3  6  2    6    6
4  1  3    1    5
5  2  3    1    5
6  5  3    1    5
7  4  4    0    7
8  0  4    0    7
9  7  4    0    7

pandas get minimum of one column in group when groupby another

Tags:

python

pandas

pandas-groupby

Legit Stack

2 Answers

Zero

piRSquared

Recent Activity

Donate For Us

pandas get minimum of one column in group when groupby another

Tags:

python

pandas

pandas-groupby

Legit Stack

2 Answers

Zero

piRSquared

Related questions

Recent Activity

Donate For Us