How can I calculate the elapsed months using pandas? I have written the following, but this code is not elegant. Could you tell me a better way?
    import pandas as pd

    df = pd.DataFrame([pd.Timestamp('20161011'),
                       pd.Timestamp('20161101')], columns=['date'])
    df['today'] = pd.Timestamp('20161202')

    df = df.assign(
        elapsed_months=(12 * (df["today"].map(lambda x: x.year) -
                              df["date"].map(lambda x: x.year)) +
                        (df["today"].map(lambda x: x.month) -
                         df["date"].map(lambda x: x.month))))

    # Out[34]:
    #         date      today  elapsed_months
    # 0 2016-10-11 2016-12-02               2
    # 1 2016-11-01 2016-12-02               1
The pd.Timestamp() function converts a datetime-like, str, int, or float value to a timestamp. We then extract the year and month from each timestamp. Because each year has 12 months, we multiply the year difference by 12 and add the month difference.
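To make the arithmetic concrete, here is a minimal sketch on scalar timestamps. The helper name `months_between` is hypothetical and not part of pandas. Note that this formula counts calendar-month boundaries crossed rather than full months elapsed, so two dates one day apart can still differ by one month.

```python
import pandas as pd


def months_between(start, end):
    """Count calendar-month boundaries crossed between two timestamps."""
    return 12 * (end.year - start.year) + (end.month - start.month)


# Same dates as the first row of the question's DataFrame.
same_year = months_between(pd.Timestamp("2016-10-11"), pd.Timestamp("2016-12-02"))

# Only one day apart, but the month changes, so the result is 1.
one_day_apart = months_between(pd.Timestamp("2016-10-31"), pd.Timestamp("2016-11-01"))

# Crossing a year boundary: 12 * 1 + (1 - 12) = 1.
across_year = months_between(pd.Timestamp("2015-12-31"), pd.Timestamp("2016-01-01"))

print(same_year, one_day_apart, across_year)
```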
Use df.dates1 - df.dates2 to find the difference between the two dates, and then convert the result to months.
Timedelta represents a duration: the difference between two dates or times. It is the pandas equivalent of Python's datetime.timedelta and is interchangeable with it in most cases.
Converting a timedelta to days is easier, and less confusing, than converting it to seconds. According to the docs, only days, seconds and microseconds are stored internally. To get the number of days in a timedelta, just use its days attribute (timedelta.days).
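As a rough sketch of the day-based route, the snippet below rebuilds the question's DataFrame, reads the whole days from the timedelta column with `.dt.days`, and divides by an average month length. The constant `AVG_DAYS_PER_MONTH` and the column names `elapsed_days` and `approx_months` are assumptions introduced here for illustration; the result is a fractional approximation, not a calendar-month count.

```python
import pandas as pd

df = pd.DataFrame({"date": pd.to_datetime(["2016-10-11", "2016-11-01"])})
df["today"] = pd.Timestamp("2016-12-02")

# Subtracting two datetime columns yields a timedelta column.
delta = df["today"] - df["date"]

# .dt.days exposes the whole-day component of each timedelta.
df["elapsed_days"] = delta.dt.days

# Assumption: average Gregorian month length (365.2425 / 12 days).
AVG_DAYS_PER_MONTH = 365.2425 / 12
df["approx_months"] = df["elapsed_days"] / AVG_DAYS_PER_MONTH

print(df)
```

Because months vary in length, any day-based conversion is only approximate; use the year and month arithmetic when whole calendar months are needed.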
Update for pandas 0.24.0:
Since pandas 0.24.0, subtracting periods returns a MonthEnd offset object instead of an integer, so you can do the calculation manually to get the whole-month difference:
    12 * (df.today.dt.year - df.date.dt.year) + (df.today.dt.month - df.date.dt.month)

    # 0    2
    # 1    1
    # dtype: int64
Wrap in a function:
    def month_diff(a, b):
        return 12 * (a.dt.year - b.dt.year) + (a.dt.month - b.dt.month)

    month_diff(df.today, df.date)

    # 0    2
    # 1    1
    # dtype: int64
Prior to pandas 0.24.0, you can convert each date to a monthly period with to_period() and then subtract the results:
    df['elapsed_months'] = df.today.dt.to_period('M') - df.date.dt.to_period('M')
    df

    #         date      today  elapsed_months
    # 0 2016-10-11 2016-12-02               2
    # 1 2016-11-01 2016-12-02               1
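On pandas 0.24.0 or later, the same period subtraction yields offset objects rather than integers. One possible way to recover the integer count, sketched here under that version assumption, is to read each offset's `n` attribute:

```python
import pandas as pd

df = pd.DataFrame({"date": pd.to_datetime(["2016-10-11", "2016-11-01"])})
df["today"] = pd.Timestamp("2016-12-02")

# On pandas >= 0.24 this is an object column of MonthEnd offsets.
diff = df["today"].dt.to_period("M") - df["date"].dt.to_period("M")

# Each offset stores its multiple in the .n attribute.
df["elapsed_months"] = diff.apply(lambda offset: offset.n)

print(df)
```

This goes through Python objects row by row, so on large frames the plain year and month arithmetic is likely to be faster.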
You could also try the following, which gives a fractional number of months:
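    import numpy as np

    # np.timedelta64(1, 'M') is an average month (30.436875 days), so the result is fractional.
    # Note: newer pandas versions may reject the 'M' unit for timedeltas.
    df['months'] = (df['today'] - df['date']) / np.timedelta64(1, 'M')
    df

    #         date      today    months
    # 0 2016-10-11 2016-12-02  1.708454
    # 1 2016-11-01 2016-12-02  1.018501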