pandas: how do I select first row in each GROUP BY group?

Tags:

python

pandas

Basically the same as "Select first row in each GROUP BY group?", only in pandas.

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'],
                   'B': ['3', '1', '2', '4', '2', '4', '1', '3']})

Sorting looks promising:

df.sort_values('B')

     A  B
1  foo  1
6  bar  1
2  foo  2
4  bar  2
0  foo  3
7  bar  3
3  foo  4
5  bar  4

But then first won't give the desired result:

df.groupby('A').first()

     B
A
bar  2
foo  3
ihadanny asked May 27 '15 15:05


1 Answer

Generally, if you want the rows within each group ordered by a column that is not one of the grouping keys, it is better to sort the DataFrame before performing the groupby:

In [5]: df.sort_values('B').groupby('A').first()

Out[5]:
     B
A
bar  1
foo  1
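For completeness, here is a self-contained sketch of the sort-then-group approach using the DataFrame from the question. The variable names first_per_group and first_rows are only illustrative. The second variant, groupby(...).head(1), is an alternative not mentioned above: it keeps the original row labels and returns whole rows, which may be closer to what the SQL version of this question returns.

```python
import pandas as pd

# Same data as in the question; note that B holds strings, not numbers.
df = pd.DataFrame({
    'A': ['foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'],
    'B': ['3', '1', '2', '4', '2', '4', '1', '3'],
})

# Sort by B first, then take the first value per group of A.
# The result is indexed by the group key A.
first_per_group = df.sort_values('B').groupby('A').first()

# Alternative sketch: keep the original row labels and all columns
# by taking the first row of each group with head(1).
first_rows = df.sort_values('B').groupby('A').head(1)

print(first_per_group)
print(first_rows)
```

Because B contains strings here, the sort is lexicographic. That is fine for single digits, but with values such as '10' you would probably want to convert the column to a numeric type before sorting.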
EdChum answered Oct 08 '22 19:10
