 

From TimeDelta to float days in Pandas


I have a TimeDelta column with values that look like this:

2 days 21:54:00.000000000

I would like a float representing the number of days, here for example 2 + 21/24 = 2.875, neglecting the minutes. Is there a simple way to do this? I saw an answer suggesting:

res['Ecart_lacher_collecte'].apply(lambda x: float(x.item().days+x.item().hours/24.)) 

But I get "AttributeError: 'str' object has no attribute 'item' "

NumPy version is '1.10.4'; pandas version is u'0.17.1'.

The columns were originally obtained with:

lac['DateHeureLacher'] = pd.to_datetime(lac['Date lacher'] + ' ' + lac['Heure lacher'], format='%d/%m/%Y %H:%M:%S')
cap['DateCollecte'] = pd.to_datetime(cap['Date de collecte'] + ' ' + cap['Heure de collecte'], format='%d/%m/%Y %H:%M:%S')

in a first script. Then in a second one:

res = pd.merge(lac, cap, how='inner', on=['Loc'])
res['DateHeureLacher'] = pd.to_datetime(res['DateHeureLacher'], format='%Y-%m-%d %H:%M:%S')
res['DateCollecte'] = pd.to_datetime(res['DateCollecte'], format='%Y-%m-%d %H:%M:%S')
res['Ecart_lacher_collecte'] = res['DateCollecte'] - res['DateHeureLacher']

Maybe saving the result to CSV changes the types back to strings? The transformation I'm trying to do is in a third script.

Sexe_x    PiegeLacher    latL    longL    Loc    Col_x    DateHeureLacher    Nb envolees    PiegeCapture    latC    longC    Col_y    Sexe_y    Effectif    DateCollecte    DatePose    Ecart_lacher_collecte    Dist_m
M    Q0-002    1629238    237877    H    Rouge    2011-02-04 17:15:00    928    Q0-002    1629238    237877    Rouge    M    1    2011-02-07 15:09:00    2011-02-07 12:14:00    2 days 21:54:00.000000000    0
M    Q0-002    1629238    237877    H    Rouge    2011-02-04 17:15:00    928    Q0-002    1629238    237877    Rouge    M    4    2011-02-07 12:14:00    2011-02-07 09:42:00    2 days 18:59:00.000000000    0
M    Q0-002    1629238    237877    H    Rouge    2011-02-04 17:15:00    928    Q0-003    1629244    237950    Rouge    M    1    2011-02-07 15:10:00    2011-02-07 12:16:00    2 days 21:55:00.000000000    75

res.info():

Sexe_x                   922 non-null object
PiegeLacher              922 non-null object
latL                     922 non-null int64
longL                    922 non-null int64
Loc                      922 non-null object
Col_x                    922 non-null object
DateHeureLacher          922 non-null object
Nb envolees              922 non-null int64
PiegeCapture             922 non-null object
latC                     922 non-null int64
longC                    922 non-null int64
Col_y                    922 non-null object
Sexe_y                   922 non-null object
Effectif                 922 non-null int64
DateCollecte             922 non-null object
DatePose                 922 non-null object
Ecart_lacher_collecte    922 non-null object
Dist_m                   922 non-null int64
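
The object dtypes above are consistent with the CSV guess. As a minimal sketch (the one-row frame is a hypothetical stand-in for res, and an in-memory buffer stands in for the CSV file), a CSV round trip leaves the timedelta column as plain strings, which would explain the AttributeError, and pd.to_timedelta parses them back:

```python
import io

import pandas as pd

# Hypothetical stand-in for the merged frame `res` (one row from the sample above).
res = pd.DataFrame({
    "DateHeureLacher": pd.to_datetime(["2011-02-04 17:15:00"]),
    "DateCollecte": pd.to_datetime(["2011-02-07 15:09:00"]),
})
res["Ecart_lacher_collecte"] = res["DateCollecte"] - res["DateHeureLacher"]

# Round trip through CSV; a StringIO buffer is used here instead of a file on disk.
buffer = io.StringIO()
res.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

# read_csv does not parse timedeltas, so the values come back as strings.
is_string_after_reload = isinstance(reloaded["Ecart_lacher_collecte"].iloc[0], str)

# Parse the strings back into timedeltas, then divide by one day to get float days.
reloaded["Ecart_lacher_collecte"] = pd.to_timedelta(reloaded["Ecart_lacher_collecte"])
days = reloaded["Ecart_lacher_collecte"] / pd.Timedelta(days=1)
print(is_string_after_reload, days.iloc[0])
```

If this matches what happens in the third script, converting the column with pd.to_timedelta right after reading the CSV should be enough before applying either conversion to days.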
asked Feb 19 '16 at 10:02 by alpagarou




2 Answers

You can define a one-day duration with pd.to_timedelta or np.timedelta64 and divide the column by it:

# set up as per @EdChum
df['total_days_td'] = df['time_delta'] / pd.to_timedelta(1, unit='D')
df['total_days_td'] = df['time_delta'] / np.timedelta64(1, 'D')
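
For a version that runs without the other answer's setup, here is a small self-contained sketch. The Series and its two values are made up for illustration (the second mirrors the value in the question); it shows that both divisors give the same float days and that the same idea works for other units such as hours:

```python
import numpy as np
import pandas as pd

# Illustrative data only; the second value is the one from the question.
time_delta = pd.Series(pd.to_timedelta(["56 days 04:05:03", "2 days 21:54:00"]))

# Dividing a timedelta Series by a one-day duration yields float days.
days_pd = time_delta / pd.to_timedelta(1, unit="D")
days_np = time_delta / np.timedelta64(1, "D")

# The same pattern with a one-hour duration yields float hours.
hours = time_delta / pd.Timedelta(hours=1)

print(days_pd.tolist())
print(hours.tolist())
```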
answered Oct 19 '22 at 15:10 by jpp


You can use dt.total_seconds and divide the result by the number of seconds in a day. Example:

# assumes: import datetime as dt; import pandas as pd

In [25]: df = pd.DataFrame({'dates': pd.date_range(dt.datetime(2016, 1, 1, 12, 15, 3), periods=10)})
         df

Out[25]:
                dates
0 2016-01-01 12:15:03
1 2016-01-02 12:15:03
2 2016-01-03 12:15:03
3 2016-01-04 12:15:03
4 2016-01-05 12:15:03
5 2016-01-06 12:15:03
6 2016-01-07 12:15:03
7 2016-01-08 12:15:03
8 2016-01-09 12:15:03
9 2016-01-10 12:15:03

In [26]: df['time_delta'] = df['dates'] - dt.datetime(2015, 11, 6, 8, 10)
         df

Out[26]:
                dates       time_delta
0 2016-01-01 12:15:03 56 days 04:05:03
1 2016-01-02 12:15:03 57 days 04:05:03
2 2016-01-03 12:15:03 58 days 04:05:03
3 2016-01-04 12:15:03 59 days 04:05:03
4 2016-01-05 12:15:03 60 days 04:05:03
5 2016-01-06 12:15:03 61 days 04:05:03
6 2016-01-07 12:15:03 62 days 04:05:03
7 2016-01-08 12:15:03 63 days 04:05:03
8 2016-01-09 12:15:03 64 days 04:05:03
9 2016-01-10 12:15:03 65 days 04:05:03

In [27]: df['total_days_td'] = df['time_delta'].dt.total_seconds() / (24 * 60 * 60)
         df

Out[27]:
                dates       time_delta  total_days_td
0 2016-01-01 12:15:03 56 days 04:05:03      56.170174
1 2016-01-02 12:15:03 57 days 04:05:03      57.170174
2 2016-01-03 12:15:03 58 days 04:05:03      58.170174
3 2016-01-04 12:15:03 59 days 04:05:03      59.170174
4 2016-01-05 12:15:03 60 days 04:05:03      60.170174
5 2016-01-06 12:15:03 61 days 04:05:03      61.170174
6 2016-01-07 12:15:03 62 days 04:05:03      62.170174
7 2016-01-08 12:15:03 63 days 04:05:03      63.170174
8 2016-01-09 12:15:03 64 days 04:05:03      64.170174
9 2016-01-10 12:15:03 65 days 04:05:03      65.170174
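
Note that this keeps the minutes and seconds in the fraction. The question asked to neglect the minutes (2 days 21:54 should give 2.875), so one possible variation, offered only as a sketch, is to count whole hours first and then divide by 24. The Series below reuses the three values from the question's sample; the variable names are made up for illustration:

```python
import pandas as pd

# The three Ecart_lacher_collecte values shown in the question.
ecart = pd.Series(pd.to_timedelta([
    "2 days 21:54:00",
    "2 days 18:59:00",
    "2 days 21:55:00",
]))

# Floor division by a one-hour duration counts complete hours and drops the minutes.
whole_hours = ecart // pd.Timedelta(hours=1)

# Whole hours divided by 24 gives days as a float, ignoring minutes and seconds.
days_ignoring_minutes = whole_hours / 24
print(days_ignoring_minutes.tolist())
```

For the first row this counts 69 whole hours, and 69 / 24 = 2.875, the value the question was after.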
answered Oct 19 '22 at 15:10 by EdChum