I have a Python pandas DataFrame that contains two columns, time1 and time2:
time1 time2
13:00:07.294234 13:00:07.294234
14:00:07.294234 14:00:07.394234
15:00:07.294234 15:00:07.494234
16:00:07.294234 16:00:07.694234
How can I generate a third column containing the difference between time1 and time2 in microseconds, as an integer if possible?
If you prepend these with an actual date, you can convert them to datetime64 columns:
In [11]: '2014-03-19 ' + df
Out[11]:
time1 time2
0 2014-03-19 13:00:07.294234 2014-03-19 13:00:07.294234
1 2014-03-19 14:00:07.294234 2014-03-19 14:00:07.394234
2 2014-03-19 15:00:07.294234 2014-03-19 15:00:07.494234
3 2014-03-19 16:00:07.294234 2014-03-19 16:00:07.694234
[4 rows x 2 columns]
In [12]: df = ('2014-03-19 ' + df).astype('datetime64[ns]')
Out[12]:
time1 time2
0 2014-03-19 20:00:07.294234 2014-03-19 20:00:07.294234
1 2014-03-19 21:00:07.294234 2014-03-19 21:00:07.394234
2 2014-03-19 22:00:07.294234 2014-03-19 22:00:07.494234
3 2014-03-19 23:00:07.294234 2014-03-19 23:00:07.694234
Now you can subtract these columns:
In [13]: delta = df['time2'] - df['time1']
In [14]: delta
Out[14]:
0 00:00:00
1 00:00:00.100000
2 00:00:00.200000
3 00:00:00.400000
dtype: timedelta64[ns]
To get the number of microseconds, just divide the underlying nanoseconds by 1000:
In [15]: delta.astype(np.int64) // 10**3
Out[15]:
0 0
1 100000
2 200000
3 400000
dtype: int64
As Jeff points out, on recent versions of numpy you can divide by 1 microsecond:
In [16]: delta / np.timedelta64(1, 'us')
Out[16]:
0 0
1 100000
2 200000
3 400000
dtype: float64
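If an integer column is what you are after, floor division by a one-microsecond timedelta is another option. The sketch below is not from the original answer: it assumes the columns are strings in HH:MM:SS.ffffff form and parses them with pd.to_timedelta, so no dummy date is needed, and the column name diff_us is made up for illustration.

```python
import pandas as pd

# Sample data copied from the question; both columns are plain strings.
df = pd.DataFrame({
    "time1": ["13:00:07.294234", "14:00:07.294234",
              "15:00:07.294234", "16:00:07.294234"],
    "time2": ["13:00:07.294234", "14:00:07.394234",
              "15:00:07.494234", "16:00:07.694234"],
})

# Parse the time-of-day strings as offsets from midnight.
t1 = pd.to_timedelta(df["time1"])
t2 = pd.to_timedelta(df["time2"])

# Floor-dividing a timedelta Series by a 1 microsecond Timedelta
# yields whole microseconds as integers instead of floats.
df["diff_us"] = (t2 - t1) // pd.Timedelta(microseconds=1)
```

Floor division truncates any sub-microsecond remainder, which does not matter for this data since the inputs only carry microsecond precision.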