
Pandas groupby month and year (date as datetime64[ns]) and summarized by count

I have a pandas data frame that I created by grouping by date and summing the rides.

      date   rides
0   2019-01-01  247279
1   2019-01-02  585996
2   2019-01-03  660631
3   2019-01-04  662011
4   2019-01-05  440848
..         ...     ...
451 2020-03-27  218499
452 2020-03-28  143305
453 2020-03-29  110833
454 2020-03-30  207743
455 2020-03-31  199623

[456 rows x 2 columns]

My date column is in datetime64[ns].

date     datetime64[ns]
rides             int64
dtype: object

Now I would like to create another data frame, grouped by year and month (I have data from 2019 and 2020), with the rides summed for each month.

Ideal output:

Year Month   Rides
2019 January 2000000
2020 March   1000000
Anakin Skywalker asked Dec 23 '22 18:12



1 Answer

You can group by the year and the month name of the date column, which you get from dt.year and dt.month_name(), and then sum the rides.

# Group by the year and month name derived from the datetime column, then sum rides.
print(df.groupby([df['date'].dt.year.rename('year'),
                  df['date'].dt.month_name().rename('month')])
        ['rides'].sum().reset_index())
   year    month    rides
0  2019  January  2596765
1  2020    March   880003
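One thing to watch with the full 456-row frame: grouping on month names sorts the months alphabetically within each year (April before January). A minimal sketch of one way around that, assuming you are happy to group on a monthly period first and derive the labels afterwards. The ten rows below are just the ones visible in the question's preview, and the names `period` and `monthly` are only for illustration:

```python
import pandas as pd

# Only the ten rows shown in the question's preview (the real frame has 456).
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2019-01-01", "2019-01-02", "2019-01-03", "2019-01-04", "2019-01-05",
        "2020-03-27", "2020-03-28", "2020-03-29", "2020-03-30", "2020-03-31",
    ]),
    "rides": [247279, 585996, 660631, 662011, 440848,
              218499, 143305, 110833, 207743, 199623],
})

# Group on a monthly Period so the result sorts chronologically,
# not alphabetically by month name.
monthly = (
    df.groupby(df["date"].dt.to_period("M").rename("period"))["rides"]
      .sum()
      .reset_index()
)

# Derive the display columns from the period after aggregating.
monthly["year"] = monthly["period"].dt.year
monthly["month"] = monthly["period"].dt.strftime("%B")  # full month name
monthly = monthly[["year", "month", "rides"]]

print(monthly)
```

On these ten rows it gives the same two totals as above (2596765 for January 2019 and 880003 for March 2020). On the full data you should get one row per month, in calendar order.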
Ben.T answered Jan 29 '23 11:01
