
Pandas: how to create a year-week variable?

I have a dataframe with a datetime column:

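import pandas as pd

dates = pd.date_range('9/25/2010', periods=10, freq='D')

df = pd.DataFrame({'col': dates})
df['col'] = pd.to_datetime(df['col'])
df['dow'] = df.col.dt.dayofweek  # Monday=0 ... Sunday=6
df['week'] = df.col.dt.to_period('W')
# Series.dt.week was removed in pandas 2.0; isocalendar().week returns the same ISO week number
df['week_alt'] = df.col.dt.year.astype(str) + '-w' + df.col.dt.isocalendar().week.astype(str)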

df
Out[21]: 
         col  dow                  week  week_alt
0 2010-09-25    5 2010-09-20/2010-09-26  2010-w38
1 2010-09-26    6 2010-09-20/2010-09-26  2010-w38
2 2010-09-27    0 2010-09-27/2010-10-03  2010-w39
3 2010-09-28    1 2010-09-27/2010-10-03  2010-w39
4 2010-09-29    2 2010-09-27/2010-10-03  2010-w39
5 2010-09-30    3 2010-09-27/2010-10-03  2010-w39
6 2010-10-01    4 2010-09-27/2010-10-03  2010-w39
7 2010-10-02    5 2010-09-27/2010-10-03  2010-w39
8 2010-10-03    6 2010-09-27/2010-10-03  2010-w39
9 2010-10-04    0 2010-10-04/2010-10-10  2010-w40

Here you can see that a week starts on Monday and ends on Sunday.

I would like to have control over when a week starts. For instance, if weeks started on Sunday instead, 2010-09-26 would be 2010-w39 and 2010-10-03 would be 2010-w40.

How can I do that in Pandas?

Asked Dec 15 '22 at 04:12 by ℕʘʘḆḽḘ


1 Answer

UPDATE: you can choose between these three strftime format codes: %U, %V, %W:

%U week number of year, with Sunday as first day of week (00..53).

%V ISO week number, with Monday as first day of week (01..53).

%W week number of year, with Monday as first day of week (00..53).

In [189]: df.col.dt.strftime('%U-%V-%W')
Out[189]:
0    38-38-38
1    39-38-38
2    39-39-39
3    39-39-39
4    39-39-39
5    39-39-39
6    39-39-39
7    39-39-39
8    40-39-39
9    40-40-40
Name: col, dtype: object

%U week number of year, with Sunday as first day of week (00..53).

In [190]: df.col.dt.strftime('%Y-w%U')
Out[190]:
0    2010-w38
1    2010-w39
2    2010-w39
3    2010-w39
4    2010-w39
5    2010-w39
6    2010-w39
7    2010-w39
8    2010-w40
9    2010-w40
Name: col, dtype: object
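%U only covers weeks that start on Sunday, and %W only weeks that start on Monday. For any other start day, the same counting rule can be written by hand. The sketch below is one possible way to do it, not a built-in pandas feature: `year_week` and its `week_start` parameter are hypothetical names introduced here, and the helper assumes pandas' weekday numbering (Monday=0 ... Sunday=6). It also shows `to_period('W-SAT')`, which gives Sunday-to-Saturday periods like the `week` column in the question.

```python
import pandas as pd


def year_week(dates: pd.Series, week_start: int = 6) -> pd.Series:
    """Label each date as 'YYYY-wNN' with weeks starting on `week_start`.

    `week_start` follows pandas' convention: Monday=0 ... Sunday=6.
    Days before the first `week_start` day of the year fall in week 00,
    which mirrors how %U and %W count weeks.
    """
    # Days elapsed since the most recent week-start day (0..6).
    offset = (dates.dt.dayofweek - week_start) % 7
    # Zero-based day of year, shifted so integer division counts whole weeks.
    week = (dates.dt.dayofyear - 1 + 7 - offset) // 7
    return dates.dt.year.astype(str) + '-w' + week.astype(str).str.zfill(2)


dates = pd.Series(pd.date_range('9/25/2010', periods=10, freq='D'))

sunday_weeks = year_week(dates, week_start=6)  # same convention as %U
monday_weeks = year_week(dates, week_start=0)  # same convention as %W

# Weekly periods anchored to end on Saturday, i.e. starting on Sunday.
sunday_periods = dates.dt.to_period('W-SAT')

print(sunday_weeks.tolist())
print(sunday_periods.astype(str).tolist())
```

With `week_start=6` the labels should match the `'%Y-w%U'` output above; other values, such as `week_start=2` for weeks starting on Wednesday, have no strftime equivalent.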

%V ISO week number, with Monday as first day of week (01..53).

In [191]: df.col.dt.strftime('%Y-w%V')
Out[191]:
0    2010-w38
1    2010-w38
2    2010-w39
3    2010-w39
4    2010-w39
5    2010-w39
6    2010-w39
7    2010-w39
8    2010-w39
9    2010-w40
Name: col, dtype: object
Answered Dec 28 '22 at 11:12 by MaxU - stop WAR against UA