Evening chaps, hopefully this question is better than my first one earlier this year, which got -7! (For which I was actually grateful, as it helped highlight my ignorance.)
What I'm trying to achieve is to write a cunning line of code that I can call in any dataframe I work in to get the correct week number or day of week.
For now, please consider the following dataframe:
import pandas as pd
import numpy as np
days = pd.date_range('01/01/2018', '01/04/2019', freq='D')
df = pd.DataFrame({'Date': days})
print(df.head(5))
Date
0 2018-01-01
1 2018-01-02
2 2018-01-03
3 2018-01-04
4 2018-01-05
Please note that my company uses a financial calendar, so April is Week 1, but Day 1 is a Saturday (the astute amongst you have probably spotted my issue already)! So for 2018, 31/03/18 is actually Day 1, Week 1 of Financial Year 2019, as it's a Saturday, and we can have at most 53 weeks.
Now, I want to create a company week number with the following line of code:
df['Week'] = np.where(df['Date'].dt.month >= 4, (df['Date'] + pd.Timedelta(days=2)).dt.week - 13,
(df['Date'] + pd.Timedelta(days=2)).dt.week + 39)
print(df)
Date Week
0 2018-01-31 44
1 2018-02-01 44
2 2018-02-02 44
All good so far, right?
So let us inspect December:
dec = df.Date.dt.month == 12
print(df.loc[dec].tail(5))
Date Week
330 2018-12-27 39
331 2018-12-28 39
332 2018-12-29 -12
333 2018-12-30 -12
334 2018-12-31 -12
The issue, if I understand it correctly, is that adding the pd.Timedelta pushes the last days of December into week 1 of the following year, and because my formula subtracts 13 whenever month >= 4, they end up as -12.
Additionally, 31 March should be in Week 1 and not Week 53, as it's a Saturday.
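To make the arithmetic behind those two odd values concrete, here is a small check on the two dates in question. It only reproduces the question's own formula for single timestamps; the helper name `shifted_iso_week` is made up for illustration.

```python
import pandas as pd

def shifted_iso_week(date_str):
    """ISO week number of the date after adding two days, as in the formula above."""
    shifted = pd.Timestamp(date_str) + pd.Timedelta(days=2)
    return shifted.isocalendar()[1]

# 2018-12-29 + 2 days = 2018-12-31, which already belongs to ISO week 1 of 2019.
dec_week = shifted_iso_week("2018-12-29")
# 2018-03-31 + 2 days = 2018-04-02, which is ISO week 14 of 2018.
mar_week = shifted_iso_week("2018-03-31")

print(dec_week - 13)  # month >= 4 branch -> -12
print(mar_week + 39)  # month < 4 branch  -> 53
```

So both symptoms come from the same place: the branch is chosen from the unshifted month, while the week number comes from the shifted date.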
Assuming that this is a very common task that many people have worked on, does anyone have a method to deal with this in a pythonic manner?
My own solution in my head is to manually create a dataframe, fix the Week, Day and Fiscal Year by hand, place it in a NoSQL or SQL DB, then call that from each dataframe and merge the week number into my reporting.
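That lookup-table idea could look roughly like the sketch below. The start date comes from the 31/03/18 example above; the 52-week length, the column names and the `report` frame are assumptions for illustration, and the table lives in memory here rather than in a database.

```python
import pandas as pd

# Hypothetical lookup table for one fiscal year.
# Assumption: FY2019 starts on 2018-03-31 and runs for 52 full weeks.
fy_start = pd.Timestamp("2018-03-31")
calendar = pd.DataFrame({"Date": pd.date_range(fy_start, periods=52 * 7, freq="D")})

day_number = (calendar["Date"] - fy_start).dt.days  # 0 for the first Saturday
calendar["FiscalYear"] = 2019
calendar["Week"] = day_number // 7 + 1  # weeks 1..52
calendar["Day"] = day_number % 7 + 1    # 1 = Saturday ... 7 = Friday

# Any reporting frame with a Date column picks up the week via a merge.
report = pd.DataFrame(
    {"Date": pd.to_datetime(["2018-03-31", "2018-12-29", "2019-03-29"])}
)
report = report.merge(calendar, on="Date", how="left")
print(report)
```

The merge keeps the week logic in one place, at the cost of having to maintain the table for every new fiscal year.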
IIUC, make use of fiscalyear as input for your range:
import fiscalyear  # plain import so the fiscalyear.* names below resolve
fiscalyear.START_MONTH = 4  # fiscal year starts in April
a = fiscalyear.FiscalYear(2018)
days = pd.date_range(a.start, a.end, freq='D')
df = pd.DataFrame({'Date': days})
Then use the code you wrote yourself for assigning week numbers:
df['Week'] = np.where(df['Date'].dt.month >= 4, (df['Date'] + pd.Timedelta(days=2)).dt.week - 13,
(df['Date'] + pd.Timedelta(days=2)).dt.week + 39)
The only way I can think of is changing the values of those three days at the end of December:
df['Week'] = np.where(df['Week'] <= 0, 40, df['Week'])
print(df.iloc[270:280])
Date Week
270 2018-12-27 39
271 2018-12-28 39
272 2018-12-29 40
273 2018-12-30 40
274 2018-12-31 40
275 2019-01-01 40
276 2019-01-02 40
277 2019-01-03 40
278 2019-01-04 40
279 2019-01-05 41
Note: regarding the week numbers for 30 and 31 March, I would do the same:
df['Week'] = np.where(df['Week'] > 52, 1, df['Week'])
print(df.iloc[358:365])
Date Week
358 2019-03-25 52
359 2019-03-26 52
360 2019-03-27 52
361 2019-03-28 52
362 2019-03-29 52
363 2019-03-30 1
364 2019-03-31 1
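As an alternative to patching individual values afterwards, the week could be derived directly from the start of the fiscal year. This is only a sketch: it assumes the fiscal year begins on the last Saturday on or before 1 April, which matches 31/03/18 and 30/03/19 above but is not confirmed as the company rule. The helper names `fiscal_year_start` and `add_fiscal_week` are made up for illustration.

```python
import pandas as pd

def fiscal_year_start(calendar_year: pd.Series) -> pd.Series:
    """Last Saturday on or before 1 April of each calendar year (assumed rule)."""
    apr1 = pd.to_datetime(pd.DataFrame({"year": calendar_year, "month": 4, "day": 1}))
    # dayofweek: Monday=0 ... Saturday=5, so this is the number of days since Saturday.
    days_since_saturday = (apr1.dt.dayofweek - 5) % 7
    return apr1 - pd.to_timedelta(days_since_saturday, unit="D")

def add_fiscal_week(df: pd.DataFrame, date_col: str = "Date") -> pd.DataFrame:
    """Return a copy of df with FiscalYear, Week and DayOfWeek columns added."""
    dates = df[date_col]
    start_this = fiscal_year_start(dates.dt.year)
    start_prev = fiscal_year_start(dates.dt.year - 1)
    # Dates before this calendar year's start still belong to the fiscal
    # year that began in the previous calendar year.
    fy_start = start_this.where(dates >= start_this, start_prev)
    elapsed = (dates - fy_start).dt.days
    out = df.copy()
    out["FiscalYear"] = fy_start.dt.year + 1  # FY is named after its ending year
    out["Week"] = elapsed // 7 + 1
    out["DayOfWeek"] = elapsed % 7 + 1        # 1 = Saturday ... 7 = Friday
    return out

days = pd.date_range("2018-01-01", "2019-04-01", freq="D")
df = add_fiscal_week(pd.DataFrame({"Date": days}))
print(df[df["Date"].between("2018-12-27", "2019-01-05")])
```

Because the week is counted in whole 7-day blocks from the fiscal year start, there is no calendar-year rollover to correct, and a year comes out as 52 or 53 weeks depending on where the next start date falls.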