I have a dataframe timings as follows:
start_ms end_ms
0 2020-09-01T08:11:19.336Z 2020-09-01T08:11:19.336Z
1 2020-09-01T08:11:20.652Z 2020-09-01T08:11:20.662Z
2 2020-09-01T08:11:20.670Z 2020-09-01T08:11:20.688Z
I'm trying to calculate the time difference between the start_ms and end_ms of each row in milliseconds, i.e. I wish to get the result
start_ms end_ms diff
0 2020-09-01T08:11:19.336Z 2020-09-01T08:11:19.336Z 0
1 2020-09-01T08:11:20.652Z 2020-09-01T08:11:20.662Z 10
2 2020-09-01T08:11:20.670Z 2020-09-01T08:11:20.688Z 18
I can convert the timestamp to datetime column by column, but I'm not sure if the order of the values are retained.
start_ms_time = pd.to_datetime(timings['start_ms'])
end_ms_time = pd.to_datetime(timings['end_ms'])
Is it possible to convert the timestamps to datetime inside timings, and add the time difference column? Do I even need to convert to get the difference? How do I calculate the time difference in milliseconds?
Subtract columns by Series.sub and then use Series.dt.components:
start_ms_time = pd.to_datetime(timings['start_ms'])
end_ms_time = pd.to_datetime(timings['end_ms'])
timings['diff'] = end_ms_time.sub(start_ms_time).dt.components.milliseconds
print (timings)
start_ms end_ms diff
0 2020-09-01T08:11:19.336Z 2020-09-01T08:11:19.336Z 0
1 2020-09-01T08:11:20.652Z 2020-09-01T08:11:20.662Z 10
2 2020-09-01T08:11:20.670Z 2020-09-01T08:11:20.688Z 18
Or Series.dt.total_seconds with multiple by 1000 and cast to integers:
timings['diff'] = end_ms_time.sub(start_ms_time).dt.total_seconds().mul(1000).astype(int)
print (timings)
start_ms end_ms diff
0 2020-09-01T08:11:19.336Z 2020-09-01T08:11:19.336Z 0
1 2020-09-01T08:11:20.652Z 2020-09-01T08:11:20.662Z 10
2 2020-09-01T08:11:20.670Z 2020-09-01T08:11:20.688Z 18
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With