I have a DataFrame that contains two columns of datetime.time objects, something like:
col1 col2
02:10:00.008209 02:08:38.053145
02:10:00.567054 02:08:38.053145
02:10:00.609842 02:08:38.053145
02:10:00.728153 02:08:38.053145
02:10:02.394408 02:08:38.053145
How can I generate a col3 that holds the difference between col1 and col2 (preferably in microseconds)?
I searched around but could not find a solution. Does anyone know?
Thanks!
Don't use datetime.time, use timedelta:
import pandas as pd
import io
data = """col1 col2
02:10:00.008209 02:08:38.053145
02:10:00.567054 02:08:38.053145
02:10:00.609842 02:08:38.053145
02:10:00.728153 02:08:38.053145
02:10:02.394408 02:08:38.053145"""
# data is a str, so wrap it in StringIO (BytesIO needs bytes and fails on Python 3)
df = pd.read_table(io.StringIO(data), sep=r"\s+")
df2 = df.apply(pd.to_timedelta)
diff = df2.col1 - df2.col2
diff.dt.total_seconds()
The output is the difference in seconds:
0 81.955064
1 82.513909
2 82.556697
3 82.675008
4 84.341263
dtype: float64
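Since the question asks for microseconds rather than seconds, here is a minimal sketch of one way to get them as integers. It assumes a reasonably recent pandas and uses floor division by a one-microsecond pd.Timedelta; the name diff_us is only illustrative and the sample is shortened to two rows.

```python
import io

import pandas as pd

data = """col1 col2
02:10:00.008209 02:08:38.053145
02:10:02.394408 02:08:38.053145"""

# Parse the whitespace-separated text, then turn every column into timedeltas
df = pd.read_csv(io.StringIO(data), sep=r"\s+")
td = df.apply(pd.to_timedelta)

diff = td["col1"] - td["col2"]

# Floor-dividing by a one-microsecond Timedelta yields whole microseconds
diff_us = diff // pd.Timedelta(microseconds=1)
print(diff_us)
```

Dividing by a Timedelta instead of casting to an integer dtype keeps the result independent of the resolution pandas stores the timedeltas in.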
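To convert a DataFrame of datetime.time objects to a DataFrame of timedeltas (this needs from datetime import time; in pandas 2.1+ applymap is deprecated in favor of DataFrame.map):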
df.applymap(time.isoformat).apply(pd.to_timedelta)
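For a self-contained version of that conversion, the sketch below builds a small DataFrame of datetime.time objects (sample values taken from the question) and converts it column by column with Series.map, which sidesteps the applymap deprecation. The name td is only illustrative.

```python
from datetime import time

import pandas as pd

# A DataFrame holding datetime.time objects, as in the question
df = pd.DataFrame({
    "col1": [time(2, 10, 0, 8209), time(2, 10, 2, 394408)],
    "col2": [time(2, 8, 38, 53145), time(2, 8, 38, 53145)],
})

# time.isoformat gives "HH:MM:SS.ffffff" strings, which pd.to_timedelta can parse
td = df.apply(lambda col: pd.to_timedelta(col.map(time.isoformat)))

print(td.dtypes)
print(td["col1"] - td["col2"])
```

Once both columns are timedeltas, ordinary subtraction works on them.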
Are you sure you want a DataFrame of datetime.time objects? There is hardly an operation you can perform conveniently on these guys, especially when wrapped in a DataFrame.
It might be better to have each column store an int representing the total number of microseconds.
You can convert df to a DataFrame storing microseconds like this:
In [71]: df2 = df.applymap(lambda x: ((x.hour*60+x.minute)*60+x.second)*10**6+x.microsecond)
In [72]: df2
Out[72]:
col1 col2
0 7800008209 7718053145
1 7800567054 7718053145
And from there, it is easy to get the result you desire:
In [73]: df2['col1']-df2['col2']
Out[73]:
0 81955064
1 82513909
dtype: int64
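The arithmetic inside that lambda can also be written as a named function and checked without pandas. This is only a sketch: time_to_microseconds is a hypothetical helper name, and it assumes both times fall on the same day and carry no timezone information.

```python
from datetime import time


def time_to_microseconds(t: time) -> int:
    """Return the number of microseconds between midnight and t."""
    return ((t.hour * 60 + t.minute) * 60 + t.second) * 10**6 + t.microsecond


# First two rows from the question
col1 = [time(2, 10, 0, 8209), time(2, 10, 0, 567054)]
col2 = [time(2, 8, 38, 53145), time(2, 8, 38, 53145)]

diffs = [time_to_microseconds(a) - time_to_microseconds(b) for a, b in zip(col1, col2)]
print(diffs)  # [81955064, 82513909]
```

The same function can be passed to the element-wise mapping above in place of the lambda.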