
Pandas Timedelta in Days

I have a pandas DataFrame called 'munged_data' with two columns, 'entry_date' and 'dob', which I have converted to Timestamps using pd.to_timestamp. I am trying to calculate people's ages from the time difference between 'entry_date' and 'dob'. To do this I need the difference in days between the two columns, so that I can then do something like round(days/365.25). I cannot seem to find a way to do this using a vectorized operation. When I do munged_data.entry_date - munged_data.dob I get the following:

internal_quote_id
2     15685977 days, 23:54:30.457856
3     11651985 days, 23:49:15.359744
4      9491988 days, 23:39:55.621376
7      11907004 days, 0:10:30.196224
9     15282164 days, 23:30:30.196224
15    15282227 days, 23:50:40.261632

However, I do not seem to be able to extract the days as an integer so that I can continue with my calculation. Any help appreciated.

luckyfool asked Apr 19 '13 11:04


People also ask

How do I convert Timedelta to days?

Converting a timedelta to days is easier, and less confusing, than converting it to seconds. According to the docs, only days, seconds and microseconds are stored internally. To get the number of whole days in a timedelta, use its days attribute (timedelta.days).
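As a small illustration of the point above, here is a sketch using Python's standard-library datetime.timedelta. The duration is an arbitrary value made up for the example, not something from the question.

```python
from datetime import timedelta

# An arbitrary duration chosen for illustration: 63 days and 6 hours.
delta = timedelta(days=63, hours=6)

# .days holds only the whole-day component; the remaining 6 hours are in .seconds.
whole_days = delta.days
leftover_seconds = delta.seconds

# total_seconds() covers the full duration, so dividing by the number of
# seconds in a day gives the length in fractional days.
fractional_days = delta.total_seconds() / 86400

print(whole_days, leftover_seconds, fractional_days)
```

Note that .days drops any partial day, so use total_seconds() when the fractional part matters.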

What is Timedelta in pandas?

Timedelta represents a duration, the difference between two dates or times. It is the pandas equivalent of Python's datetime.timedelta and is interchangeable with it in most cases.

How do I convert datetime to Timedelta in pandas?

The to_timedelta() function converts its argument to a timedelta, not to a datetime. Timedeltas are absolute differences in times, expressed in different units (e.g. days, hours, minutes, seconds). The function converts an argument from a recognized timedelta format or value into a Timedelta type.
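A minimal sketch of pd.to_timedelta in use, assuming a reasonably recent pandas. The input strings and the 90-minute value are invented for illustration.

```python
import pandas as pd

# Strings in formats pandas recognizes as durations (values invented for illustration).
raw = pd.Series(["1 days", "2 days 12:00:00", "36 hours"])

# Converting the Series yields a timedelta64 Series.
durations = pd.to_timedelta(raw)

# A number plus a unit also works: 90 minutes as a single Timedelta.
ninety_minutes = pd.to_timedelta(90, unit="min")

# The .dt accessor exposes the whole-day component of each duration.
whole_days = durations.dt.days.tolist()

print(durations)
print(whole_days)
print(ninety_minutes)
```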


1 Answer

Using the pandas Timedelta type, available since v0.15.0, you can also do:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame([pd.Timestamp('20150111'),
                           pd.Timestamp('20150301')], columns=['date'])

In [3]: df['today'] = pd.Timestamp('20150315')

In [4]: df
Out[4]:
        date      today
0 2015-01-11 2015-03-15
1 2015-03-01 2015-03-15

In [5]: (df['today'] - df['date']).dt.days
Out[5]:
0    63
1    14
dtype: int64
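Applied to the columns named in the question, the same .dt.days approach might look like the sketch below. The sample rows are hypothetical stand-ins for 'munged_data', and flooring rather than rounding is a choice made here, not something the question or the answer specifies.

```python
import pandas as pd

# Hypothetical sample standing in for the asker's 'munged_data'.
munged_data = pd.DataFrame({
    "entry_date": pd.to_datetime(["2013-04-19", "2013-04-19", "2013-04-19"]),
    "dob": pd.to_datetime(["1970-05-10", "1981-06-02", "1987-04-20"]),
})

# Subtracting two datetime64 columns yields a timedelta64 column;
# .dt.days extracts the whole-day count as integers, fully vectorized.
days = (munged_data["entry_date"] - munged_data["dob"]).dt.days

# Approximate age in completed years, using the 365.25-day year from the question.
# Floor division is an assumption here: it avoids rounding someone up to the next
# year shortly before their birthday.
munged_data["age"] = (days // 365.25).astype(int)

print(munged_data)
```

The last sample row is one day short of a 26th birthday, so flooring gives 25 where round(days/365.25) would give 26. Because 365.25 is only an average year length, results can still be off by one for dates very close to a birthday.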
DanT answered Sep 21 '22 23:09