I have a dataset with personal data such as name, height, weight and date of birth. I would like to build a graph of the number of people born in each month and year. I'm using Python pandas for this, and my strategy was to group by year and month and aggregate using count. But the closest I got is the count of people by year or by month, not by both.
df['birthdate'].groupby(df.birthdate.dt.year).agg('count')
Other questions on Stack Overflow point to a grouper called TimeGrouper, but I found nothing when searching the pandas documentation. Any ideas?
Use count() by column name: use pandas DataFrame.groupby() to group the rows by a column, then call the count() method to get the count for each group. count() ignores None and NaN values, and it works with non-floating-point data as well.
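As a minimal sketch of the point about missing values, the snippet below compares count() with size() on a tiny frame. The column names `group` and `value` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: group "a" has one missing value.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": [1.0, None, 3.0, 4.0],
})

# count() skips missing values within each group.
counts = df.groupby("group")["value"].count()

# size() counts rows, missing or not.
sizes = df.groupby("group")["value"].size()

print(counts)
print(sizes)
```

If every row should be counted regardless of missing birthdates, size() is probably the better fit; count() is the right choice when rows without a value should be left out.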
To group on multiple criteria, pass a list of the columns or criteria:
df['birthdate'].groupby([df.birthdate.dt.year, df.birthdate.dt.month]).agg('count')
Example:
In [165]: df = pd.DataFrame({'birthdate': pd.date_range(start=dt.datetime(2015,12,20), end=dt.datetime(2016,3,1))})
          df.groupby([df['birthdate'].dt.year, df['birthdate'].dt.month]).agg({'count'})
Out[165]:
                    birthdate
                        count
birthdate birthdate
2015      12               12
2016      1                31
          2                29
          3                 1
UPDATE
As of version 0.23.0, the above code no longer works because of the restriction that multi-index level names must be unique. You now need to rename the levels for this to work:
In [107]: df.groupby([df['birthdate'].dt.year.rename('year'), df['birthdate'].dt.month.rename('month')]).agg({'count'})
Out[107]:
           birthdate
               count
year month
2015 12           12
2016 1            31
     2            29
     3             1
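An alternative that is not part of the answer above, offered only as a sketch: instead of grouping on two separate keys, the dates can be converted to monthly periods with `Series.dt.to_period("M")` and grouped on that single key. The key name `year_month` is an assumption chosen for illustration, and the sample data is the same date range used in the example.

```python
import pandas as pd

# Same sample data as the example: one row per day.
df = pd.DataFrame({
    "birthdate": pd.date_range(start="2015-12-20", end="2016-03-01"),
})

# One grouping key per row, e.g. Period('2015-12', 'M').
# The name "year_month" is illustrative, not required by pandas.
year_month = df["birthdate"].dt.to_period("M").rename("year_month")

# Number of non-missing birthdates in each year-month.
counts = df.groupby(year_month)["birthdate"].count()

print(counts)
```

Because the result has a single period index rather than a two-level index, it can usually be passed straight to `counts.plot(kind="bar")` for the graph the question asks about, assuming matplotlib is installed.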