
How to convert a (possibly negative) pandas Timedelta to minutes (float)?

I have a dataframe like this

df[['timestamp_utc','minute_ts','delta']].head()
Out[47]: 
            timestamp_utc           minute_ts                    delta
0 2015-05-21 14:06:33.414 2015-05-21 12:06:00 -1 days +21:59:26.586000
1 2015-05-21 14:06:33.414 2015-05-21 12:07:00 -1 days +22:00:26.586000
2 2015-05-21 14:06:33.414 2015-05-21 12:08:00 -1 days +22:01:26.586000
3 2015-05-21 14:06:33.414 2015-05-21 12:09:00 -1 days +22:02:26.586000
4 2015-05-21 14:06:33.414 2015-05-21 12:10:00 -1 days +22:03:26.586000

Where df['delta']=df.minute_ts-df.timestamp_utc

timestamp_utc     datetime64[ns]
minute_ts         datetime64[ns]
delta            timedelta64[ns]

The problem is that I would like to get the number of (possibly negative) minutes between timestamp_utc and minute_ts, disregarding the seconds component.

So for the first row I would like to get -120. Indeed, 2015-05-21 12:06:00 is 120 minutes before 2015-05-21 14:06:33.414.

What is the most pandaesque way to do it?

Many thanks!

ℕʘʘḆḽḘ asked Jan 05 '23 14:01


2 Answers

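You can divide the timedelta column by np.timedelta64(1, 'm'), which gives the difference in minutes as a float: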

import numpy as np

df['a'] = df['delta'] / np.timedelta64(1, 'm')
print (df)
            timestamp_utc           minute_ts                    delta  \
0 2015-05-21 14:06:33.414 2015-05-21 12:06:00 -1 days +21:59:26.586000   
1 2015-05-21 14:06:33.414 2015-05-21 12:07:00 -1 days +22:00:26.586000   
2 2015-05-21 14:06:33.414 2015-05-21 12:08:00 -1 days +22:01:26.586000   
3 2015-05-21 14:06:33.414 2015-05-21 12:09:00 -1 days +22:02:26.586000   
4 2015-05-21 14:06:33.414 2015-05-21 12:10:00 -1 days +22:03:26.586000   

          a  
0 -120.5569  
1 -119.5569  
2 -118.5569  
3 -117.5569  
4 -116.5569  

Then convert the float to int. The cast truncates toward zero, so the seconds component is dropped and -120.5569 becomes -120:

df['a'] = (df['delta'] / np.timedelta64(1, 'm')).astype(int)
print (df)
            timestamp_utc           minute_ts                    delta    a
0 2015-05-21 14:06:33.414 2015-05-21 12:06:00 -1 days +21:59:26.586000 -120
1 2015-05-21 14:06:33.414 2015-05-21 12:07:00 -1 days +22:00:26.586000 -119
2 2015-05-21 14:06:33.414 2015-05-21 12:08:00 -1 days +22:01:26.586000 -118
3 2015-05-21 14:06:33.414 2015-05-21 12:09:00 -1 days +22:02:26.586000 -117
4 2015-05-21 14:06:33.414 2015-05-21 12:10:00 -1 days +22:03:26.586000 -116
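
For readers who want to run the approach above end to end, here is a minimal self-contained sketch. It rebuilds only the first three rows of the question's data, and the variable names (minutes_float, minutes_alt, minutes_int) are made up for illustration. It also shows .dt.total_seconds() / 60 as an equivalent way to get float minutes, and uses np.trunc to make the truncation toward zero explicit before casting.

```python
import numpy as np
import pandas as pd

# First three rows of the question's data, rebuilt for illustration.
df = pd.DataFrame({
    "timestamp_utc": pd.to_datetime(["2015-05-21 14:06:33.414"] * 3),
    "minute_ts": pd.to_datetime([
        "2015-05-21 12:06:00",
        "2015-05-21 12:07:00",
        "2015-05-21 12:08:00",
    ]),
})
df["delta"] = df["minute_ts"] - df["timestamp_utc"]

# Float minutes, as in the answer above.
minutes_float = df["delta"] / np.timedelta64(1, "m")

# Equivalent route through total seconds.
minutes_alt = df["delta"].dt.total_seconds() / 60

# Truncate toward zero explicitly, then cast to int.
minutes_int = np.trunc(minutes_float).astype(int)

print(minutes_int.tolist())
```

Truncating toward zero is what gives -120 rather than -121 for the first row; np.floor would round negative values away from zero instead.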
jezrael answered Jan 16 '23 22:01


You can work with pandas Timedelta objects and use floor division in a list comprehension to calculate the minutes. Note that the seconds property of a Timedelta returns only the seconds component (>= 0 and less than one day), so you must explicitly convert the days component to minutes and add it.

import pandas as pd

df = pd.DataFrame({'minute_ts': [pd.Timestamp('2015-05-21 12:06:00'), 
                                 pd.Timestamp('2015-05-21 12:07:00'), 
                                 pd.Timestamp('2015-05-21 12:08:00'), 
                                 pd.Timestamp('2015-05-21 12:09:00'), 
                                 pd.Timestamp('2015-05-21 12:10:00')], 
                   'timestamp_utc': [pd.Timestamp('2015-05-21 14:06:33.414')] * 5})

df['minutes_neg'] = [td.days * 24 * 60 + td.seconds//60 
                 for td in [pd.Timedelta(delta) 
                            for delta in df.minute_ts - df.timestamp_utc]]

df['minutes_pos'] = [td.days * 24 * 60 + td.seconds//60 
                 for td in [pd.Timedelta(delta) 
                            for delta in df.timestamp_utc - df.minute_ts]]

>>> df
            minute_ts           timestamp_utc  minutes_neg  minutes_pos
0 2015-05-21 12:06:00 2015-05-21 14:06:33.414         -121          120
1 2015-05-21 12:07:00 2015-05-21 14:06:33.414         -120          119
2 2015-05-21 12:08:00 2015-05-21 14:06:33.414         -119          118
3 2015-05-21 12:09:00 2015-05-21 14:06:33.414         -118          117
4 2015-05-21 12:10:00 2015-05-21 14:06:33.414         -117          116

Note that the negative minutes are off by one because floor division rounds toward negative infinity. For example, 90 // 60 = 1, but -90 // 60 = -2. You could add one to the result when it is negative, but then a negative delta that is an exact whole number of minutes would be off by one.
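
The off-by-one and its edge case can be checked on scalar Timedelta values. The sketch below is only an illustration: minutes_floor mirrors the days/seconds arithmetic used above, while minutes_trunc is a hypothetical helper (not part of pandas) that truncates toward zero via total_seconds().

```python
import pandas as pd

def minutes_floor(td):
    # td.seconds is always >= 0, so this floors toward negative infinity.
    return td.days * 24 * 60 + td.seconds // 60

def minutes_trunc(td):
    # Hypothetical helper: int() truncates toward zero.
    return int(td.total_seconds() / 60)

early = pd.Timestamp("2015-05-21 12:06:00")
late = pd.Timestamp("2015-05-21 14:06:33.414")

deltas = [
    early - late,               # negative, with a seconds remainder
    pd.Timedelta(minutes=-120), # negative, exact whole minutes
    late - early,               # positive, with a seconds remainder
]

floored = [minutes_floor(td) for td in deltas]
truncated = [minutes_trunc(td) for td in deltas]

print(floored)
print(truncated)
```

The two helpers agree on the exact -120 minute delta, which is why blindly adding one to every negative floored result would break that case.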

Alexander answered Jan 16 '23 20:01