
Group a pandas DataFrame by week (starting on Monday)

I have a dataframe with values per day (see df below). I want to group the "Forecast" field per week but with Monday as the first day of the week.

Currently I can do it via pd.TimeGrouper('W') (see df_final below), but the resulting weeks are anchored on Sunday instead of Monday.

import pandas as pd
data = [("W1","G1",1234,pd.to_datetime("2015-07-1"),8),
        ("W1","G1",1234,pd.to_datetime("2015-07-30"),2),
        ("W1","G1",1234,pd.to_datetime("2015-07-15"),2),
        ("W1","G1",1234,pd.to_datetime("2015-07-2"),4),
        ("W1","G2",2345,pd.to_datetime("2015-07-5"),5),
        ("W1","G2",2345,pd.to_datetime("2015-07-7"),1),
        ("W1","G2",2345,pd.to_datetime("2015-07-9"),1),
        ("W1","G2",2345,pd.to_datetime("2015-07-11"),3)]

labels = ["Site","Type","Product","Date","Forecast"]
df = pd.DataFrame(data,columns=labels).set_index(["Site","Type","Product","Date"])
df


                              Forecast
Site Type Product Date                
W1   G1   1234    2015-07-01         8
                  2015-07-30         2
                  2015-07-15         2
                  2015-07-02         4
     G2   2345    2015-07-05         5
                  2015-07-07         1
                  2015-07-09         1
                  2015-07-11         3



df_final = (df
     .reset_index()
     .set_index("Date")
     # pd.TimeGrouper was removed in pandas 1.0; pd.Grouper(freq='W') is its replacement
     .groupby(["Site","Product",pd.Grouper(freq='W')])["Forecast"].sum()
     .astype(int)
     .reset_index())
df_final["DayOfWeek"] = df_final["Date"].dt.dayofweek
df_final

  Site  Product       Date  Forecast  DayOfWeek
0   W1     1234 2015-07-05        12          6
1   W1     1234 2015-07-19         2          6
2   W1     1234 2015-08-02         2          6
3   W1     2345 2015-07-05         5          6
4   W1     2345 2015-07-12         5          6
Nicolas asked Oct 04 '17 10:10


1 Answer

Use W-MON instead of W; see the anchored offsets documentation:

df_final = (df
     .reset_index()
     .set_index("Date")
     .groupby(["Site","Product",pd.Grouper(freq='W-MON')])["Forecast"].sum()
     .astype(int)
     .reset_index())

df_final["DayOfWeek"] = df_final["Date"].dt.dayofweek
print(df_final)
  Site  Product       Date  Forecast  DayOfWeek
0   W1     1234 2015-07-06        12          0
1   W1     1234 2015-07-20         2          0
2   W1     1234 2015-08-03         2          0
3   W1     2345 2015-07-06         5          0
4   W1     2345 2015-07-13         5          0
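
One thing to be aware of: with freq='W-MON' and the default settings, each bin ends on a Monday (it covers Tuesday through Monday) and is labeled with that closing Monday. If the weeks should instead run Monday through Sunday and carry the label of the Monday that opens them, passing closed='left' and label='left' to pd.Grouper may be closer to what is wanted. The following is only a sketch under that assumption; the names monday_weeks and weekly are made up for illustration, and the sample data is rebuilt as a flat frame with "Date" as a column so that key="Date" can be used.

```python
import pandas as pd

# Same sample data as in the question, kept flat with "Date" as a column.
df = pd.DataFrame(
    [("W1", "G1", 1234, "2015-07-01", 8),
     ("W1", "G1", 1234, "2015-07-30", 2),
     ("W1", "G1", 1234, "2015-07-15", 2),
     ("W1", "G1", 1234, "2015-07-02", 4),
     ("W1", "G2", 2345, "2015-07-05", 5),
     ("W1", "G2", 2345, "2015-07-07", 1),
     ("W1", "G2", 2345, "2015-07-09", 1),
     ("W1", "G2", 2345, "2015-07-11", 3)],
    columns=["Site", "Type", "Product", "Date", "Forecast"],
)
df["Date"] = pd.to_datetime(df["Date"])

# closed="left" makes each bin [Monday, next Monday);
# label="left" names the bin after the Monday that opens it.
# (monday_weeks is an illustrative name, not part of the original answer.)
monday_weeks = pd.Grouper(key="Date", freq="W-MON", closed="left", label="left")

weekly = (df
     .groupby(["Site", "Product", monday_weeks])["Forecast"].sum()
     .reset_index())
weekly["DayOfWeek"] = weekly["Date"].dt.dayofweek
print(weekly)
```

With the sample data, the row dated Sunday 2015-07-05 then lands in the week labeled 2015-06-29 rather than 2015-07-06, and every label still has DayOfWeek 0.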
jezrael answered Oct 03 '22 22:10