I have a DataFrame that I want to group by ID, e.g.:
import pandas as pd
df = pd.DataFrame({'item_id': ['a', 'a', 'b', 'b', 'b', 'c', 'd'], 'user_id': [1,2,1,1,3,1,5]})
print(df)
Which generates:
  item_id  user_id
0       a        1
1       a        2
2       b        1
3       b        1
4       b        3
5       c        1
6       d        5
[7 rows x 2 columns]
I can easily group by the id:
grouped = df.groupby("item_id")
But how can I return only the first N groups? E.g. I want only the groups for the first 3 unique item_ids.
The pandas GroupBy.nth() method returns the nth row of each group. To get the first row of each group, pass 0 as the argument to nth(). Note that this selects rows within every group; it does not limit the number of groups.
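As a minimal sketch of nth() on the question's data (the names first_rows and first_users are illustrative, not from the original post; the index layout of the result differs between pandas versions, so only the user_id values are printed):

```python
import pandas as pd

df = pd.DataFrame({'item_id': ['a', 'a', 'b', 'b', 'b', 'c', 'd'],
                   'user_id': [1, 2, 1, 1, 3, 1, 5]})
grouped = df.groupby("item_id")

# nth(0) keeps the first row of every group (a, b, c and d),
# so the result still covers all four item_ids.
first_rows = grouped.nth(0)

# Convert to plain ints so the printed list looks the same on any version.
first_users = [int(u) for u in first_rows['user_id']]
print(first_users)  # [1, 1, 1, 5]
```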
Here is one way, using list(grouped):
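# Iterating over a groupby yields (key, sub-DataFrame) pairs; keep the first 3 sub-DataFrames.
result = [group for _, group in list(grouped)[:3]]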
# 1st
result[0]
  item_id  user_id
0       a        1
1       a        2
# 2nd
result[1]
  item_id  user_id
2       b        1
3       b        1
4       b        3
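Two possible alternatives, sketched under the assumption that "first" means the first three item_ids in the data: itertools.islice stops iterating after three groups instead of building the full list, and an isin filter keeps the result as a single DataFrame. The names first_three, first_ids and subset are illustrative.

```python
from itertools import islice

import pandas as pd

df = pd.DataFrame({'item_id': ['a', 'a', 'b', 'b', 'b', 'c', 'd'],
                   'user_id': [1, 2, 1, 1, 3, 1, 5]})
grouped = df.groupby("item_id")

# Option 1: take only the first three (key, group) pairs from the iterator.
# groupby sorts the keys by default, so these are the groups a, b and c.
first_three = [group for _, group in islice(grouped, 3)]

# Option 2: pick the first three distinct ids in order of appearance,
# then keep only the rows that belong to them.
first_ids = df['item_id'].drop_duplicates().head(3)
subset = df[df['item_id'].isin(first_ids)]
print(subset['item_id'].unique().tolist())  # ['a', 'b', 'c']
```

The two options agree here because the ids already appear in sorted order. If they did not, option 1 would follow the sorted key order and option 2 the order of appearance. The filtered subset can be grouped again with subset.groupby("item_id") if a groupby object is needed.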