Pandas GroupBy: take only the first N groups [duplicate]

Tags: python, pandas, pandas-groupby

I have a DataFrame that I want to group by ID, e.g.:

import pandas as pd
df = pd.DataFrame({'item_id': ['a', 'a', 'b', 'b', 'b', 'c', 'd'], 'user_id': [1,2,1,1,3,1,5]})
print(df)

Which generates:

  item_id  user_id
0       a        1
1       a        2
2       b        1
3       b        1
4       b        3
5       c        1
6       d        5

[7 rows x 2 columns]

I can easily group by the id:

grouped = df.groupby("item_id")

But how can I return only the first N groups? For example, I want only the groups for the first 3 unique item_ids.

asked Jul 27 '15 14:07

Christian Sauer

1 Answer

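Here is one way using list(grouped). Iterating over a GroupBy object yields (key, sub-DataFrame) pairs, so the list can be sliced to the first 3 pairs and g[1] picks out each sub-DataFrame.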

result = [g[1] for g in list(grouped)[:3]]

# 1st
result[0]

  item_id  user_id
0       a        1
1       a        2

# 2nd
result[1]

  item_id  user_id
2       b        1
3       b        1
4       b        3
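
As a possible alternative to building the full list, here is a minimal sketch of two other approaches, assuming the same df as in the question. The names n, first_groups, first_ids, subset and regrouped are made up for illustration. Note that groupby sorts the keys by default while unique() returns them in order of appearance; for this data the two orders happen to coincide.

```python
from itertools import islice

import pandas as pd

df = pd.DataFrame({
    'item_id': ['a', 'a', 'b', 'b', 'b', 'c', 'd'],
    'user_id': [1, 2, 1, 1, 3, 1, 5],
})
n = 3  # number of groups to keep (illustrative name)

# Option 1: take the first n (key, sub-DataFrame) pairs without
# materializing every group into a list first.
grouped = df.groupby('item_id')
first_groups = [frame for _, frame in islice(grouped, n)]

# Option 2: filter the rows down to the first n unique ids, then group.
# The result is a regular GroupBy object that only contains those ids.
first_ids = df['item_id'].unique()[:n]
subset = df[df['item_id'].isin(first_ids)]
regrouped = subset.groupby('item_id')
```

Option 2 may be the more convenient one if you want to keep working with a GroupBy object (for example to call agg on it) rather than with a list of DataFrames.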

answered Oct 13 '22 10:10

Jianxun Li
