
How can I Group By Month, Day from a Date field using Python/Pandas

Tags:

python

pandas

I have a DataFrame df that looks like this:

| date      | Revenue | Cost |
|-----------|---------|------|
| 6/1/2017  | 100     | 20   |
| 5/21/2017 | 200     | 40   |
| 5/21/2017 | 300     | 60   |
| 6/20/2017 | 400     | 80   |
| 6/1/2017  | 500     | 100  |

I need to group the above data by month and then by day to get the following output:

| Month | Day | SUM(Revenue) | SUM(Cost) |
|-------|-----|--------------|-----------|
| May   | 21  | 500          | 100       |
| June  | 1   | 600          | 120       |
| June  | 20  | 400          | 80        |

I tried this code but it did not work:

df.groupby(month('date'), day('date')).agg({'Revenue': 'sum', 'Cost': 'sum' })

I want to use only pandas or NumPy, with no additional libraries.

Symphony asked Jul 04 '17 19:07



1 Answer

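Let's derive Month and Day columns and group on them. The older set_index(['Month','Day']).sum(level=[0,1]) form no longer runs on pandas 2.0+, because the level argument of sum was deprecated in 1.3 and then removed:

import pandas as pd

df['date'] = pd.to_datetime(df['date'])      # parse the M/D/YYYY strings into datetimes
df['Month'] = df['date'].dt.strftime('%b')   # abbreviated month name, e.g. 'Jun'
df['Day'] = df['date'].dt.day                # day of the month as an integer
df.groupby(['Month', 'Day'])[['Revenue', 'Cost']].sum().reset_index()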

Output:

  Month  Day  Revenue  Cost
0   Jun    1      600   120
1   Jun   20      400    80
2   May   21      500   100
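
Note that the result above is sorted alphabetically by month abbreviation, so Jun comes before May. If you want calendar order and full month names, as in the table in the question, one possible sketch is to also group on the numeric month and drop it afterwards. The MonthNum helper name and the explicit format string are my own assumptions, not part of the question:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["6/1/2017", "5/21/2017", "5/21/2017", "6/20/2017", "6/1/2017"],
    "Revenue": [100, 200, 300, 400, 500],
    "Cost": [20, 40, 60, 80, 100],
})

# Assumption: the dates are month/day/year, as the sample suggests.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

result = (
    df.groupby([
        df["date"].dt.month.rename("MonthNum"),      # numeric month, used only for sorting
        df["date"].dt.month_name().rename("Month"),  # full month name, e.g. 'June'
        df["date"].dt.day.rename("Day"),
    ])[["Revenue", "Cost"]]
    .sum()
    .reset_index()
    .drop(columns="MonthNum")                        # helper column no longer needed
)
print(result)
```

Because groupby sorts by its keys, putting the numeric month first gives May 21, then June 1, then June 20. This still uses only pandas.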
Scott Boston answered Nov 15 '22 12:11
