
How to add a year to a column of dates in pandas

I am trying to add a year to a column of dates in a pandas DataFrame, but when I use pd.to_timedelta I get additional hours and minutes. I know I could truncate the time from the result, but I feel there must be a way to add exactly one calendar year. My attempt is as follows:

import pandas as pd
dates = pd.DataFrame({'date':['20170101','20170102','20170103']})
dates['date'] = pd.to_datetime(dates['date'], format='%Y%m%d')
dates['date2'] = dates['date'] +  pd.to_timedelta(1, unit='y')
dates

yields:

Out[1]: 
    date        date2
0   2017-01-01  2018-01-01 05:49:12
1   2017-01-02  2018-01-02 05:49:12
2   2017-01-03  2018-01-03 05:49:12

How can I add a year without adding 05:49:12 HH:mm:ss?

Pdubbs asked Sep 12 '25 16:09


2 Answers

In [99]: dates['date'] + pd.offsets.DateOffset(years=1)
Out[99]:
0   2018-01-01
1   2018-01-02
2   2018-01-03
Name: date, dtype: datetime64[ns]

Leap year check (Feb 29 is clamped to Feb 28 when the following year is not a leap year):

In [100]: pd.to_datetime(['2011-02-28', '2012-02-29']) + pd.offsets.DateOffset(years=1)
Out[100]: DatetimeIndex(['2012-02-28', '2013-02-28'], dtype='datetime64[ns]', freq=None)
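
For completeness, here is a minimal sketch of the same idea applied to the DataFrame from the question and assigned to a new column. The sample dates (including a Feb 29 row) and the `result` variable are illustrative additions, not part of the original post:

```python
import pandas as pd

# Illustrative data: one ordinary date, one leap day, one year-end date.
dates = pd.DataFrame({'date': ['20170101', '20160229', '20151231']})
dates['date'] = pd.to_datetime(dates['date'], format='%Y%m%d')

# DateOffset(years=1) moves the calendar year forward and keeps month/day,
# falling back to Feb 28 when the target year has no Feb 29.
dates['date2'] = dates['date'] + pd.DateOffset(years=1)

result = dates['date2'].dt.strftime('%Y-%m-%d').tolist()
print(result)
```

Because the offset works on calendar fields rather than a fixed duration, the time component stays at midnight.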
MaxU - stop WAR against UA answered Sep 15 '25 06:09


You can normalize the result via pd.Series.dt.normalize, which resets the time component to midnight:

dates['date2'] = (dates['date'] +  pd.to_timedelta(1, unit='y')).dt.normalize()
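
One caveat is worth checking before relying on this approach: a year expressed as a timedelta is a fixed span (365 days 05:49:12 in the question's output), so the normalized result can land a day early when the span crosses a Feb 29. The sketch below builds that span explicitly with `pd.Timedelta`, on the assumption that `unit='y'` may not be accepted by the installed pandas version, and compares it with a calendar offset. The names `one_mean_year`, `normalized`, and `by_offset` are illustrative:

```python
import pandas as pd

# The fixed span that appears in the question's output: 365 days 05:49:12.
one_mean_year = pd.Timedelta(days=365, hours=5, minutes=49, seconds=12)

s = pd.to_datetime(pd.Series(['2017-01-01', '2016-01-01']))

# Fixed span plus normalize: drops the time, but 2016 contains Feb 29,
# so the second date ends up on Dec 31 of the same year.
normalized = (s + one_mean_year).dt.normalize().dt.strftime('%Y-%m-%d').tolist()

# Calendar offset: same month and day, next year.
by_offset = (s + pd.DateOffset(years=1)).dt.strftime('%Y-%m-%d').tolist()

print(normalized)
print(by_offset)
```

For dates whose following twelve months contain no leap day, both approaches give the same result.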
jpp answered Sep 15 '25 06:09