I would like to create a new column in my dataset that holds the difference in years between today and another column already in the dataset, which contains dates.
The code below:
from datetime import datetime
df['diff_years'] = datetime.today() - df['some_date']
df['diff_years']
gives me the following output (example):
1754 days 11:44:28.971615
and I need to get something like this (the output above expressed in years):
4.8
(or 5)
I appreciate any help!
P.S.: I would like to avoid looping over the series. I believe a loop would give me the desired result, but because the series is large I would rather not go that way.
Create a relativedelta object that represents the interval between two given dates. Call relativedelta(end_date, start_date) from the dateutil.relativedelta module to build it, then read its years attribute to get the number of whole years.
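A minimal sketch of the relativedelta approach described above. The two dates are made-up sample values, and dateutil is assumed to be installed (it ships as a pandas dependency). Note that relativedelta works on one pair of dates at a time, so on a large Series it would have to be applied element by element.

```python
from datetime import date

from dateutil.relativedelta import relativedelta

# Hypothetical sample dates, chosen only for illustration.
start_date = date(2009, 6, 15)
end_date = date(2018, 3, 2)

# relativedelta(end, start) breaks the interval into calendar units.
delta = relativedelta(end_date, start_date)

print(delta.years)   # whole years elapsed
print(delta.months)  # remaining whole months after the years
print(delta.days)    # remaining days after the months
```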
To find the difference between two dates in Python, you can use the timedelta class from the datetime module. A timedelta object stores the difference between two datetime objects.
datetime.now() returns a datetime object containing the year, month, day, hour, minute, second, and microsecond (displayed as YYYY-MM-DD hh:mm:ss.ffffff). It also accepts an optional tz parameter, which is None by default.
Just subtract one from the other. You get a timedelta object with the difference.
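As a rough illustration of the subtraction, the sketch below uses two made-up datetimes picked so that the difference matches the 1754 days from the question. Dividing by 365.25 is an approximation of the average year length, not an exact calendar calculation.

```python
from datetime import datetime

# Hypothetical sample values, chosen to reproduce a 1754-day gap.
start = datetime(2020, 1, 1)
end = datetime(2024, 10, 20, 11, 44, 28)

# Subtracting two datetimes yields a timedelta.
diff = end - start

# timedelta has no "years" attribute, so convert from days.
# 365.25 is an assumed average year length (accounts for leap years).
years = diff.days / 365.25

print(diff)
print(round(years, 1))
```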
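Here is one way, vectorized so no loop is needed:
import pandas as pd
df = pd.DataFrame({'date': ['2009-06-15 00:00:00']})
# Recent pandas versions reject np.timedelta64(1, 'Y'), so divide by an
# average Gregorian year (365.2425 days) expressed as a Timedelta instead.
df['years'] = (pd.Timestamp.now() - pd.to_datetime(df['date'])) / pd.Timedelta(days=365.2425)
# Example output (the value depends on when the code is run):
#                   date     years
# 0  2009-06-15 00:00:00  8.713745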
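To get the rounded value (4.8) or the whole number of years the question asks for, the same vectorized idea can be extended as sketched below. The column name some_date comes from the question; the sample rows and the fixed reference date are assumptions made so the result is reproducible. In practice the reference date would be something like pd.Timestamp.now().normalize().

```python
import pandas as pd

# Hypothetical sample data using the column name from the question.
df = pd.DataFrame({"some_date": ["2009-06-15", "2015-01-01"]})
dates = pd.to_datetime(df["some_date"])

# Fixed reference date for reproducibility (an assumption for this sketch).
today = pd.Timestamp("2018-03-02")

# Approximate fractional years, rounded to one decimal place.
df["diff_years"] = ((today - dates).dt.days / 365.25).round(1)

# Calendar-accurate whole years: subtract one if this year's
# anniversary has not been reached yet.
had_anniversary = (dates.dt.month < today.month) | (
    (dates.dt.month == today.month) & (dates.dt.day <= today.day)
)
df["whole_years"] = today.year - dates.dt.year - (~had_anniversary).astype(int)

print(df)
```

Both columns are computed with Series operations only, so no explicit loop over the rows is involved.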