I'm doing some work with logs and need to calculate the total time a process was running without long interruptions. The maximum allowed interruption is 30 seconds. Logs are emitted every 3 seconds.
So, for example, if it was running from 10:20:00 to 10:30:00 and was interrupted from 10:24:10 to 10:27:10, the desired result is the sum of 10:24:10 - 10:20:00 and 10:30:00 - 10:27:10 = 420 (in seconds). However, calculating the time difference using datetime types does not give a valid result. I suppose it calculates the difference without including the start/end seconds.
Here is the solution I came up with (['timestamps'] is a list of datetime timestamps, normally emitted every 3 seconds):
for k, v in proc_activity.items():
    proc_activity[k]['duration'] = 0
    start, next = v['timestamps'][0], ''
    for time in v['timestamps']:
        next = time
        diff = next - start
        if diff.seconds < 30:
            proc_activity[k]['duration'] += diff.seconds
        else:
            print("diff: %s" % diff.seconds)
        start = next
    print(f"added: {proc_activity[k]['duration']}")
    diff = v['timestamps'][-1] - v['timestamps'][0]
    print(f"real: {diff.seconds}")
Output:
added: 39
real: 45
added: 39
real: 45
diff: 36
added: 155
real: 218
Any suggestions on how to fix it?
Update: sample input data:
{'service_0': {'timestamps': [datetime.datetime(2018, 7, 1, 22, 33, 39, 86170),
datetime.datetime(2018, 7, 1, 22, 33, 42, 33213),
datetime.datetime(2018, 7, 1, 22, 33, 44, 898234),
datetime.datetime(2018, 7, 1, 22, 33, 47, 893731),
datetime.datetime(2018, 7, 1, 22, 33, 50, 928946),
datetime.datetime(2018, 7, 1, 22, 33, 53, 895617),
datetime.datetime(2018, 7, 1, 22, 35, 7, 116182),
datetime.datetime(2018, 7, 1, 22, 35, 10, 105035),
datetime.datetime(2018, 7, 1, 22, 35, 13, 193428),
datetime.datetime(2018, 7, 1, 22, 35, 16, 210135),
datetime.datetime(2018, 7, 1, 22, 35, 19, 168881),
datetime.datetime(2018, 7, 1, 22, 35, 22, 114653),
datetime.datetime(2018, 7, 1, 22, 35, 25, 102365),
datetime.datetime(2018, 7, 1, 22, 35, 43, 46950),
datetime.datetime(2018, 7, 1, 22, 35, 46, 15435),
datetime.datetime(2018, 7, 1, 22, 35, 49, 23333),
datetime.datetime(2018, 7, 1, 22, 35, 52, 22164),
datetime.datetime(2018, 7, 1, 22, 35, 55, 78615),
datetime.datetime(2018, 7, 1, 22, 35, 58, 78573)]}}
In short, I think the key thing you are missing is to use timedelta.total_seconds() rather than timedelta.seconds.
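To illustrate the difference, here is a small sketch using the first two timestamps from the sample data in the question. The `seconds` attribute is only the whole-seconds component of the timedelta (it drops microseconds and ignores days), while `total_seconds()` returns the full duration as a float. The variable names are just for illustration.

```python
import datetime

a = datetime.datetime(2018, 7, 1, 22, 33, 39, 86170)
b = datetime.datetime(2018, 7, 1, 22, 33, 42, 33213)
delta = b - a

# .seconds truncates the fractional part, so each ~3 s interval can lose up to 1 s
print(delta.seconds)          # 2
print(delta.total_seconds())  # 2.947043

# .seconds also ignores the days component entirely
long_delta = datetime.timedelta(days=1, seconds=5)
print(long_delta.seconds)          # 5
print(long_delta.total_seconds())  # 86405.0
```

Summing the truncated values over many short intervals is likely why the "added" totals in the question come out lower than the "real" ones.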
This seems to work fine for me:
import datetime
from pprint import pprint


def get_duration(timestamps):
    # Gaps of 30 seconds or more count as interruptions and are skipped
    max_interruption = 30
    # Pair each timestamp with the one that follows it
    starts = timestamps[:-1]
    ends = timestamps[1:]
    durations = zip(starts, ends)
    accumulated = 0
    for start, end in durations:
        # total_seconds() keeps the microseconds; .seconds would truncate them
        delta = (end - start).total_seconds()
        if delta < max_interruption:
            accumulated += delta
    return accumulated


proc_activity = {
    'service_0': {
        'timestamps': [
            datetime.datetime(2018, 7, 1, 22, 33, 39, 86170),
            datetime.datetime(2018, 7, 1, 22, 33, 42, 33213),
            datetime.datetime(2018, 7, 1, 22, 33, 44, 898234),
            datetime.datetime(2018, 7, 1, 22, 33, 47, 893731),
            datetime.datetime(2018, 7, 1, 22, 33, 50, 928946),
            datetime.datetime(2018, 7, 1, 22, 33, 53, 895617),
            datetime.datetime(2018, 7, 1, 22, 35, 7, 116182),
            datetime.datetime(2018, 7, 1, 22, 35, 10, 105035),
            datetime.datetime(2018, 7, 1, 22, 35, 13, 193428),
            datetime.datetime(2018, 7, 1, 22, 35, 16, 210135),
            datetime.datetime(2018, 7, 1, 22, 35, 19, 168881),
            datetime.datetime(2018, 7, 1, 22, 35, 22, 114653),
            datetime.datetime(2018, 7, 1, 22, 35, 25, 102365),
            datetime.datetime(2018, 7, 1, 22, 35, 43, 46950),
            datetime.datetime(2018, 7, 1, 22, 35, 46, 15435),
            datetime.datetime(2018, 7, 1, 22, 35, 49, 23333),
            datetime.datetime(2018, 7, 1, 22, 35, 52, 22164),
            datetime.datetime(2018, 7, 1, 22, 35, 55, 78615),
            datetime.datetime(2018, 7, 1, 22, 35, 58, 78573)
        ],
    }
}

# Store the computed duration next to each service's timestamps
for k, v in proc_activity.items():
    proc_activity[k]['duration'] = get_duration(v['timestamps'])

pprint(proc_activity)
With this input, service_0 has a duration of 65.77183800000002 seconds.
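As a sanity check against the example in the question (running from 10:20:00 to 10:30:00, interrupted from 10:24:10 to 10:27:10), here is a rough sketch of a variant that groups the timestamps into uninterrupted runs and sums the length of each run. The names `split_runs` and `ticks` are hypothetical helpers of mine, the date is arbitrary, and the 5-second spacing is an assumption chosen only so the endpoints land exactly on the times from the example. It also assumes the list is sorted and non-empty.

```python
import datetime


def split_runs(timestamps, max_gap=30):
    """Group sorted timestamps into (start, end) runs.

    A gap of max_gap seconds or more between neighbours starts a new run.
    """
    runs = []
    run_start = prev = timestamps[0]
    for ts in timestamps[1:]:
        if (ts - prev).total_seconds() >= max_gap:
            runs.append((run_start, prev))
            run_start = ts
        prev = ts
    runs.append((run_start, prev))
    return runs


def ticks(start, end, step=5):
    """Hypothetical helper: timestamps from start to end inclusive, step seconds apart."""
    out = []
    while start <= end:
        out.append(start)
        start += datetime.timedelta(seconds=step)
    return out


day = datetime.datetime(2018, 7, 1)
stamps = (
    ticks(day.replace(hour=10, minute=20), day.replace(hour=10, minute=24, second=10))
    + ticks(day.replace(hour=10, minute=27, second=10), day.replace(hour=10, minute=30))
)

runs = split_runs(stamps)
total = sum((end - start).total_seconds() for start, end in runs)
print(len(runs))  # 2
print(total)      # 420.0
```

Keeping the runs as (start, end) pairs also makes it easy to report when each interruption happened, not just the total.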