Pandas groupby object filtering

I have a pandas DataFrame:

df.columns
Index([u'car_id', u'color', u'make', u'year'])

I would like to create a new filterable object that holds the count of each (color, make, year) group:

grp = df[['color','make','year']].groupby(['color','make','year']).size()

which returns something like this:

color   make    year    count
black   honda   2011    416

I would like to be able to filter it, however when I try this:

grp.filter(lambda x: x['color']=='black')

I receive this error

TypeError: 'function' object is not iterable

How do I leverage a 'groupby' object in order to filter the rows out?

asked Sep 12 '16 19:09

chattrat423

2 Answers

I think you need to add reset_index so that the output is a DataFrame. Then use boolean indexing:

df = (df[['color','make','year']]
      .groupby(['color','make','year'])
      .size()
      .reset_index(name='count'))

df1 = df[df.color == 'black']
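
For anyone who wants to try this end to end, here is a minimal sketch. The sample rows and the names `counts` and `black` are made up for illustration; only the column names come from the question. As far as I can tell, the original attempt fails because `size()` returns a Series, and `Series.filter` selects index labels instead of applying a predicate, so the lambda gets treated as a list of labels.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "car_id": [1, 2, 3, 4, 5],
    "color": ["black", "black", "red", "black", "red"],
    "make": ["honda", "honda", "ford", "ford", "ford"],
    "year": [2011, 2011, 2012, 2011, 2012],
})

cols = ["color", "make", "year"]

# size() yields a Series indexed by (color, make, year);
# reset_index(name="count") turns it into a flat DataFrame.
counts = df.groupby(cols).size().reset_index(name="count")

# Ordinary boolean indexing now works on the count table.
black = counts[counts["color"] == "black"]
print(black)
```

Using a separate name such as `counts` keeps the original `df` around, which is handy if you need the raw rows again later.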

answered Oct 21 '22 21:10

jezrael

Option 1: Filter ahead of time

cols = ['color','make','year']
df.loc[df.color == 'black', cols].groupby(cols).size()
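
A small runnable sketch of filtering before grouping, assuming made-up sample rows (the data and the name `black_counts` are hypothetical). Note that `.loc` is what accepts a row mask together with a column list.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "car_id": [1, 2, 3, 4, 5],
    "color": ["black", "black", "red", "black", "red"],
    "make": ["honda", "honda", "ford", "ford", "ford"],
    "year": [2011, 2011, 2012, 2011, 2012],
})

cols = ["color", "make", "year"]

# Keep only the black cars first, then count each remaining group.
black_counts = df.loc[df["color"] == "black", cols].groupby(cols).size()
print(black_counts)
```

The result is still a Series with a (color, make, year) MultiIndex, but it only contains the groups that survived the row filter.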

Option 2: Use xs for index cross sections

cols = ['color','make','year']
grp = df[cols].groupby(cols).size()

grp.xs('black', level='color', drop_level=False)

or

grp.xs('honda', level='make', drop_level=False)

or

grp.xs(2011, level='year', drop_level=False)
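
Here is a sketch of the cross-section approach on made-up sample rows (the data and the variable names `black`, `honda`, `y2012`, and `black_dropped` are hypothetical). It also shows what `drop_level=False` changes: without it, the selected level is removed from the index.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "car_id": [1, 2, 3, 4, 5],
    "color": ["black", "black", "red", "black", "red"],
    "make": ["honda", "honda", "ford", "ford", "ford"],
    "year": [2011, 2011, 2012, 2011, 2012],
})

cols = ["color", "make", "year"]

# Series of group sizes with a (color, make, year) MultiIndex.
grp = df.groupby(cols).size()

# Cross sections on any index level; drop_level=False keeps all three levels.
black = grp.xs("black", level="color", drop_level=False)
honda = grp.xs("honda", level="make", drop_level=False)
y2012 = grp.xs(2012, level="year", drop_level=False)

# With the default drop_level=True, the 'color' level is removed.
black_dropped = grp.xs("black", level="color")

print(black)
print(black_dropped)
```

This keeps `grp` as the single counted object and lets you slice it by whichever level you need, without recomputing the groupby.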

answered Oct 21 '22 19:10

piRSquared
