I have a dataframe like this
df[['timestamp_utc','minute_ts','delta']].head()
Out[47]:
timestamp_utc minute_ts delta
0 2015-05-21 14:06:33.414 2015-05-21 12:06:00 -1 days +21:59:26.586000
1 2015-05-21 14:06:33.414 2015-05-21 12:07:00 -1 days +22:00:26.586000
2 2015-05-21 14:06:33.414 2015-05-21 12:08:00 -1 days +22:01:26.586000
3 2015-05-21 14:06:33.414 2015-05-21 12:09:00 -1 days +22:02:26.586000
4 2015-05-21 14:06:33.414 2015-05-21 12:10:00 -1 days +22:03:26.586000
Where df['delta']=df.minute_ts-df.timestamp_utc
timestamp_utc datetime64[ns]
minute_ts datetime64[ns]
delta timedelta64[ns]
The problem is that I would like to get the (possibly negative) number of minutes between timestamp_utc and minute_ts, disregarding the seconds component.
So for the first row I would like to get -120. Indeed, 2015-05-21 12:06:00 is 120 minutes before 2015-05-21 14:06:33.414.
What is the most pandaesque way to do it?
Many thanks!
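You can divide the timedelta column by a one-minute np.timedelta64 to get the difference in fractional minutes:
import numpy as np
df['a'] = df['delta'] / np.timedelta64(1, 'm')
print (df)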
timestamp_utc minute_ts delta \
0 2015-05-21 14:06:33.414 2015-05-21 12:06:00 -1 days +21:59:26.586000
1 2015-05-21 14:06:33.414 2015-05-21 12:07:00 -1 days +22:00:26.586000
2 2015-05-21 14:06:33.414 2015-05-21 12:08:00 -1 days +22:01:26.586000
3 2015-05-21 14:06:33.414 2015-05-21 12:09:00 -1 days +22:02:26.586000
4 2015-05-21 14:06:33.414 2015-05-21 12:10:00 -1 days +22:03:26.586000
a
0 -120.5569
1 -119.5569
2 -118.5569
3 -117.5569
4 -116.5569
Then convert the float to int. Note that astype(int) truncates toward zero, so -120.5569 becomes -120:
df['a'] = (df['delta'] / np.timedelta64(1, 'm')).astype(int)
print (df)
timestamp_utc minute_ts delta a
0 2015-05-21 14:06:33.414 2015-05-21 12:06:00 -1 days +21:59:26.586000 -120
1 2015-05-21 14:06:33.414 2015-05-21 12:07:00 -1 days +22:00:26.586000 -119
2 2015-05-21 14:06:33.414 2015-05-21 12:08:00 -1 days +22:01:26.586000 -118
3 2015-05-21 14:06:33.414 2015-05-21 12:09:00 -1 days +22:02:26.586000 -117
4 2015-05-21 14:06:33.414 2015-05-21 12:10:00 -1 days +22:03:26.586000 -116
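The -120 above depends on the integer conversion truncating toward zero rather than flooring. A minimal sketch of the difference, assuming a small two-row series built only for illustration (the names delta, truncated and floored are not from the original post):

```python
import numpy as np
import pandas as pd

# Illustrative data: one negative and one positive difference of the same size.
earlier = pd.Timestamp("2015-05-21 12:06:00")
later = pd.Timestamp("2015-05-21 14:06:33.414")
delta = pd.Series([earlier - later, later - earlier])

# Fractional minutes: roughly -120.5569 and 120.5569.
minutes = delta / np.timedelta64(1, "m")

# Truncation drops the fractional part toward zero.
truncated = np.trunc(minutes).astype("int64")
# Flooring rounds toward negative infinity, so negatives move one further away.
floored = np.floor(minutes).astype("int64")

print(truncated.tolist())  # [-120, 120]
print(floored.tolist())    # [-121, 120]
```

Truncation matches the "disregard the seconds" reading for both signs, whereas flooring only does so for non-negative differences.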
You can use the Timedelta object in pandas, and then use floor division in a list comprehension to calculate the minutes. Note that the seconds property of a Timedelta returns only the seconds within the day (>= 0 and less than 1 day), so you must explicitly convert the days component to the corresponding minutes.
import pandas as pd

df = pd.DataFrame({'minute_ts': [pd.Timestamp('2015-05-21 12:06:00'),
                                 pd.Timestamp('2015-05-21 12:07:00'),
                                 pd.Timestamp('2015-05-21 12:08:00'),
                                 pd.Timestamp('2015-05-21 12:09:00'),
                                 pd.Timestamp('2015-05-21 12:10:00')],
                   'timestamp_utc': [pd.Timestamp('2015-05-21 14:06:33.414')] * 5})
# Whole minutes = days converted to minutes + floor of the in-day seconds.
df['minutes_neg'] = [td.days * 24 * 60 + td.seconds//60
                     for td in [pd.Timedelta(delta)
                                for delta in df.minute_ts - df.timestamp_utc]]
df['minutes_pos'] = [td.days * 24 * 60 + td.seconds//60
                     for td in [pd.Timedelta(delta)
                                for delta in df.timestamp_utc - df.minute_ts]]
>>> df
minute_ts timestamp_utc minutes_neg minutes_pos
0 2015-05-21 12:06:00 2015-05-21 14:06:33.414 -121 120
1 2015-05-21 12:07:00 2015-05-21 14:06:33.414 -120 119
2 2015-05-21 12:08:00 2015-05-21 14:06:33.414 -119 118
3 2015-05-21 12:09:00 2015-05-21 14:06:33.414 -118 117
4 2015-05-21 12:10:00 2015-05-21 14:06:33.414 -117 116
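To see where the -121 in the first row comes from, it may help to look at the components of a single negative Timedelta. This is only a sketch for the first row; the variable names are illustrative:

```python
import pandas as pd

# Difference for the first row: minute_ts - timestamp_utc.
td = pd.Timestamp("2015-05-21 12:06:00") - pd.Timestamp("2015-05-21 14:06:33.414")

# A negative Timedelta is stored as a negative day count plus a
# non-negative remainder, i.e. "-1 days +21:59:26.586000".
print(td.days)     # -1
print(td.seconds)  # 79166, which is 21h 59m 26s

# days-to-minutes plus floored in-day minutes: -1440 + 1319
minutes_floor = td.days * 24 * 60 + td.seconds // 60
print(minutes_floor)  # -121

# Truncating the total instead gives the value asked for in the question.
minutes_trunc = int(td.total_seconds() / 60)
print(minutes_trunc)  # -120
```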
Note that the negative minutes are off by one because floor division rounds toward negative infinity. For example, 90 // 60 = 1, but -90 // 60 = -2. You could add one to the result if it is negative, but then a difference of exactly a whole number of minutes (measured at millisecond precision) would be off by one minute.
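One way to sidestep the rounding question entirely is to drop the seconds from timestamp_utc before subtracting. The sketch below assumes, as in the question, that minute_ts already sits on whole-minute boundaries; floored_utc and the minutes column are names introduced here for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp_utc": [pd.Timestamp("2015-05-21 14:06:33.414")] * 5,
    "minute_ts": pd.date_range("2015-05-21 12:06:00", periods=5, freq="min"),
})

# Discard seconds and milliseconds: 14:06:33.414 becomes 14:06:00.
floored_utc = df["timestamp_utc"].dt.floor("min")

# Both operands are now whole minutes, so the division is exact
# and the sign of the difference no longer affects the result.
df["minutes"] = ((df["minute_ts"] - floored_utc) / pd.Timedelta(minutes=1)).astype("int64")

print(df["minutes"].tolist())  # [-120, -119, -118, -117, -116]
```

If minute_ts could also carry seconds, flooring both columns first would be the analogous step, though whether that matches the intended meaning depends on the data.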