How to create a calendar table (date dimension) in pandas

Tags:

pandas

A table of dates with primary keys (a calendar table, or date dimension) is sometimes used in database design.

| date_id |     Date       |    Record_timestamp |  Day      |  Week |  Month |     Quarter |   Year_half |     Year |
|---------+----------------+---------------------+-----------+-------+--------+-------------+-------------+----------|
|       0 |     2000-01-01 |    NaN              |  Saturday |  52   |  1     |     1       |   1         |     2000 |
|       1 |     2000-01-02 |    NaN              |  Sunday   |  52   |  1     |     1       |   1         |     2000 |
|       2 |     2000-01-03 |    NaN              |  Monday   |  1    |  1     |     1       |   1         |     2000 |

How can I create such a table in pandas?


asked Nov 07 '17 05:11

3 Answers

This is a little cleaner with the dt accessor:

In [10]: import pandas as pd

In [11]: def create_date_table2(start='2000-01-01', end='2050-12-31'):
    ...:     df = pd.DataFrame({"Date": pd.date_range(start, end)})
    ...:     df["Day"] = df.Date.dt.day_name()  # weekday_name was removed in pandas 1.0
    ...:     df["Week"] = df.Date.dt.isocalendar().week  # replaces the removed weekofyear
    ...:     df["Quarter"] = df.Date.dt.quarter
    ...:     df["Year"] = df.Date.dt.year
    ...:     df["Year_half"] = (df.Quarter + 1) // 2  # quarters 1-2 -> 1, quarters 3-4 -> 2
    ...:     return df

In [12]: create_date_table2().head()
Out[12]:
        Date        Day  Week  Quarter  Year  Year_half
0 2000-01-01   Saturday    52        1  2000          1
1 2000-01-02     Sunday    52        1  2000          1
2 2000-01-03     Monday     1        1  2000          1
3 2000-01-04    Tuesday     1        1  2000          1
4 2000-01-05  Wednesday     1        1  2000          1

In [13]: create_date_table2().tail()
Out[13]:
            Date        Day  Week  Quarter  Year  Year_half
18623 2050-12-27    Tuesday    52        4  2050          2
18624 2050-12-28  Wednesday    52        4  2050          2
18625 2050-12-29   Thursday    52        4  2050          2
18626 2050-12-30     Friday    52        4  2050          2
18627 2050-12-31   Saturday    52        4  2050          2

Note: you may like to calculate these on the fly rather than store them as columns!
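
As a rough illustration of the on-the-fly approach, the sketch below derives the quarter from the date column at query time instead of storing it. The `sales` frame and its `amount` column are made up for the example.

```python
import pandas as pd

# Hypothetical fact table that stores only the raw dates (illustration only).
sales = pd.DataFrame({
    "Date": pd.to_datetime(["2000-01-01", "2000-01-03", "2000-07-15"]),
    "amount": [10.0, 20.0, 5.0],
})

# Derive the calendar attribute when it is needed instead of storing a column.
per_quarter = sales.groupby(sales["Date"].dt.quarter)["amount"].sum()
print(per_quarter)
```

The same pattern works for `dt.year`, `dt.month`, `dt.day_name()` and the other attributes used above.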


answered Oct 23 '22 08:10

Andy Hayden

I liked Andy and Robin's approaches and modified their create_date_table functions slightly for my needs, in case you are interested in having a deterministic date_id. I find this helpful because other ETL processes can then work out the key directly from a date, without an extra look-up step.

import pandas as pd

def create_date_table3(start='1990-01-01', end='2080-12-31'):
    df = pd.DataFrame({"date": pd.date_range(start, end)})
    df["week_day"] = df.date.dt.day_name()  # weekday_name was removed in pandas 1.0
    df["day"] = df.date.dt.day
    df["month"] = df.date.dt.month
    df["week"] = df.date.dt.isocalendar().week  # replaces the removed weekofyear
    df["quarter"] = df.date.dt.quarter
    df["year"] = df.date.dt.year
    # Deterministic key: the date itself written as the integer YYYYMMDD
    df.insert(0, 'date_id', df.date.dt.strftime('%Y%m%d').astype(int))
    return df
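
To show why a deterministic key avoids the look-up, here is a minimal sketch of how another process might compute the same YYYYMMDD key straight from a date. The helper `date_to_id` and the `events` frame are hypothetical names used only for illustration.

```python
import pandas as pd

def date_to_id(value):
    """Return the YYYYMMDD integer key for a date-like value (hypothetical helper)."""
    return int(pd.Timestamp(value).strftime("%Y%m%d"))

# Hypothetical table from another ETL step (illustration only).
events = pd.DataFrame({"event_date": pd.to_datetime(["1990-01-01", "2024-02-29"])})

# Vectorised equivalent: no join against the calendar table is needed.
events["date_id"] = events["event_date"].dt.strftime("%Y%m%d").astype(int)

print(date_to_id("2000-01-03"))
print(events)
```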

answered Oct 23 '22 09:10

Jon

Use this function

import pandas as pd

def create_date_table(start='2000-01-01', end='2050-12-31'):
    # Drop any time-of-day component from the bounds
    start_ts = pd.to_datetime(start).date()
    end_ts = pd.to_datetime(end).date()

    # Record_timestamp is left empty (NaN) for now
    dates = pd.DataFrame(columns=['Record_timestamp'],
                         index=pd.date_range(start_ts, end_ts))
    dates.index.name = 'Date'

    dates['Day'] = dates.index.day_name()
    # DatetimeIndex.week was removed in pandas 2.0; isocalendar() gives the ISO week
    dates['Week'] = dates.index.isocalendar().week
    dates['Month'] = dates.index.month
    dates['Quarter'] = dates.index.quarter
    dates['Year_half'] = dates.index.month.map(lambda mth: 1 if mth < 7 else 2)
    dates['Year'] = dates.index.year
    # Move Date into a column; the new 0..n-1 index becomes the primary key
    dates.reset_index(inplace=True)
    dates.index.name = 'date_id'
    return dates
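
With a sequential date_id like this one, other tables need a look-up to find the key for a given date. One possible way to do that is a left merge on the date, sketched below. The `calendar` frame is a small stand-in with the same date_id/Date layout, and `orders` is a hypothetical fact table.

```python
import pandas as pd

# Small stand-in for the calendar table: a sequential date_id plus the date.
calendar = pd.DataFrame({"Date": pd.date_range("2000-01-01", "2000-01-10")})
calendar.index.name = "date_id"
calendar = calendar.reset_index()  # date_id becomes an ordinary column

# Hypothetical fact table (illustration only).
orders = pd.DataFrame({
    "order_no": [101, 102, 103],
    "Date": pd.to_datetime(["2000-01-03", "2000-01-01", "2000-01-03"]),
})

# Attach the surrogate key; validate guards against duplicate calendar dates.
orders_keyed = orders.merge(calendar, on="Date", how="left", validate="many_to_one")
print(orders_keyed)
```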

answered Oct 23 '22 08:10

redacted
