I have a pandas DataFrame:
df.columns
Index([u'car_id', u'color', u'make', u'year'])
I would like to create a new, filterable object that holds the count of each group (color, make, year):
grp = df[['color','make','year']].groupby(['color','make','year']).size()
which returns something like this:
color make year count
black honda 2011 416
I would like to be able to filter it. However, when I try this:
grp.filter(lambda x: x['color']=='black')
I receive this error:
TypeError: 'function' object is not iterable
How do I leverage a 'groupby' object in order to filter the rows out?
GROUP BY enables you to use aggregate functions on groups of data returned from a query. FILTER is a modifier used on an aggregate function to limit the values used in an aggregation. All the columns in the select statement that aren't aggregated should be specified in a GROUP BY clause in the query.
If you want to get a single value for each group, use aggregate() (or one of its shortcuts). If you want to get a subset of the original rows, use filter(). And if you want to get a new value for each original row, use transform().
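The difference between the three methods is easier to see side by side. The following is only a minimal sketch on a made-up five-row frame (the data and the variable names counts, big_groups and group_size are assumptions for illustration, not part of the original question):

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration
df = pd.DataFrame({
    "car_id": [1, 2, 3, 4, 5],
    "color": ["black", "black", "red", "red", "red"],
    "make": ["honda", "honda", "ford", "ford", "ford"],
    "year": [2011, 2011, 2012, 2012, 2012],
})
groups = df.groupby("color")

# aggregate: one value per group (here, the number of cars per color)
counts = groups["car_id"].agg("count")

# filter: a subset of the original rows (keep only groups with more than two rows)
big_groups = groups.filter(lambda g: len(g) > 2)

# transform: one value per original row (the size of the group each row belongs to)
group_size = groups["car_id"].transform("count")
```

Note that filter() on a groupby object receives a whole group and must return a single True or False, so it keeps or drops entire groups rather than individual rows.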
groupby() can also be used to iterate over DataFrame groups. The DataFrame.groupby() function splits the data into groups based on some criteria.
I think you need to add reset_index, so that the output is a DataFrame. Then use boolean indexing:
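# Parentheses are needed so the method chain can span several lines
df = (df[['color','make','year']]
      .groupby(['color','make','year'])
      .size()
      .reset_index(name='count'))
# The counts are now ordinary columns, so boolean indexing works
df1 = df[df.color == 'black']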
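For anyone who wants to try this approach end to end, here is a small self-contained sketch. The four sample rows and the names counts and black are made up for illustration; only the column names come from the question:

```python
import pandas as pd

# Hypothetical sample data using the question's column names
df = pd.DataFrame({
    "car_id": [1, 2, 3, 4],
    "color": ["black", "black", "red", "black"],
    "make": ["honda", "honda", "ford", "ford"],
    "year": [2011, 2011, 2012, 2011],
})
cols = ["color", "make", "year"]

# size() returns a Series with a MultiIndex; reset_index turns the
# index levels back into columns and names the values column "count"
counts = df.groupby(cols).size().reset_index(name="count")

# Plain boolean indexing on the resulting DataFrame
black = counts[counts["color"] == "black"]
print(black)
```

Because the result is a regular DataFrame, conditions can be combined as usual, for example counts[(counts["color"] == "black") & (counts["count"] > 1)].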
Option 1
Filter ahead of time
cols = ['color','make','year']
df.loc[df.color == 'black', cols].groupby(cols).size()
Option 2
Use xs for index cross sections
cols = ['color','make','year']
grp = df[cols].groupby(cols).size()
grp.xs('black', level='color', drop_level=False)
or
grp.xs('honda', level='make', drop_level=False)
or
grp.xs(2011, level='year', drop_level=False)
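As a rough check that both options give the same counts, the sketch below runs them on a small made-up frame (the sample rows and the names black, year_2011 and black_first are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data using the question's column names
df = pd.DataFrame({
    "car_id": [1, 2, 3, 4],
    "color": ["black", "black", "red", "black"],
    "make": ["honda", "honda", "ford", "ford"],
    "year": [2011, 2011, 2012, 2011],
})
cols = ["color", "make", "year"]

# Option 2: count first, then take cross sections of the MultiIndexed Series
grp = df[cols].groupby(cols).size()
black = grp.xs("black", level="color", drop_level=False)
year_2011 = grp.xs(2011, level="year", drop_level=False)

# Option 1: filter the rows first, then count
black_first = df.loc[df["color"] == "black", cols].groupby(cols).size()
```

With drop_level=False the selected level stays in the index, so the result keeps the same (color, make, year) shape as grp. Leaving it at the default would drop that level from the result.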