Pandas GroupBy: take only the first N groups [duplicate]

I have a DataFrame that I want to group by ID, e.g.:

import pandas as pd
df = pd.DataFrame({'item_id': ['a', 'a', 'b', 'b', 'b', 'c', 'd'], 'user_id': [1,2,1,1,3,1,5]})
print(df)

Which generates:

  item_id  user_id
0       a        1
1       a        2
2       b        1
3       b        1
4       b        3
5       c        1
6       d        5

[7 rows x 2 columns]

I can easily group by the id:

grouped = df.groupby("item_id")

But how can I return only the first N groups? For example, I want only the groups for the first 3 unique item_ids.

Christian Sauer asked Jul 27 '15 14:07



1 Answer

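Here is one way, using list(grouped). Iterating over a GroupBy object yields (key, DataFrame) pairs, so you can turn it into a list, slice off the first 3 pairs, and keep the DataFrame from each: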

result = [g[1] for g in list(grouped)[:3]]

# 1st
result[0]

  item_id  user_id
0       a        1
1       a        2

# 2nd
result[1]

  item_id  user_id
2       b        1
3       b        1
4       b        3
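
Two possible variations on the same idea, offered as a rough sketch rather than as part of the original answer. The first uses itertools.islice to stop iterating after N groups instead of building the full list first. The second uses GroupBy.ngroup() to keep the rows of the first N groups in a single DataFrame. The variable names n, first_groups and first_rows are made up for illustration, and both variants assume the default sort=True, so "first" means first in sorted key order.

```python
from itertools import islice

import pandas as pd

df = pd.DataFrame({
    "item_id": ["a", "a", "b", "b", "b", "c", "d"],
    "user_id": [1, 2, 1, 1, 3, 1, 5],
})

n = 3  # number of groups to keep (illustrative name)

# Variant 1: stop iterating after n groups.
# Iterating a GroupBy yields (key, sub-DataFrame) pairs in sorted key order.
first_groups = [group for _, group in islice(df.groupby("item_id"), n)]

# Variant 2: keep the rows of the first n groups as one DataFrame.
# ngroup() labels each row with its group's number (0, 1, 2, ...),
# using the same sorted key order as the iteration above.
first_rows = df[df.groupby("item_id").ngroup() < n]

print(first_rows)
```

Variant 1 gives a list of separate DataFrames, like the list(grouped) approach; variant 2 is probably more convenient if you want to keep working with one DataFrame afterwards, for example by grouping it again.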

Jianxun Li answered Oct 13 '22 10:10