Python 3.6: sum of the short periods between timestamps

I'm working with logs and need to calculate the total time a process was running without long interruptions. The maximum allowed interruption is 30 seconds, and logs are emitted every 3 seconds.

So, for example, if it was running from 10:20:00 to 10:30:00 and was interrupted from 10:24:10 to 10:27:10, the desired result is the sum of (10:24:10 - 10:20:00) and (10:30:00 - 10:27:10), i.e. 250 + 170 = 420 seconds. However, calculating the time differences with datetime types does not give me the right result; I suppose the difference is calculated without including the start/end seconds.
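The arithmetic of that example can be checked with a short sketch. The date used here is arbitrary and only assumed for illustration, since the example gives times of day only:

```python
from datetime import datetime

day = (2018, 7, 1)  # arbitrary date, assumed only for illustration

# Two uninterrupted runs: 10:20:00-10:24:10 and 10:27:10-10:30:00
first_run = datetime(*day, 10, 24, 10) - datetime(*day, 10, 20, 0)
second_run = datetime(*day, 10, 30, 0) - datetime(*day, 10, 27, 10)

# Adding the two timedeltas and converting once gives the desired total
expected = (first_run + second_run).total_seconds()
print(expected)
```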

Here is the solution I came up with (['timestamps'] is a list of datetime timestamps, normally emitted every 3 seconds):

for k, v in proc_activity.items():
        proc_activity[k]['duration'] = 0

        start, next = v['timestamps'][0], ''
        for time in v['timestamps']:
            next = time
            diff = next - start

            if diff.seconds < 30:
                proc_activity[k]['duration'] += diff.seconds
            else:
                print("diff: %s" % diff.seconds)

            start = next

        print(f"added: {proc_activity[k]['duration']}")
        diff = v['timestamps'][-1] - v['timestamps'][0]
        print(f"real: {diff.seconds}")

Output:

added: 39
real: 45
added: 39
real: 45
diff: 36
added: 155
real: 218

Any suggestions on how to fix it?

Update, sample input data:

{'service_0': {'timestamps': [datetime.datetime(2018, 7, 1, 22, 33, 39, 86170),
                                     datetime.datetime(2018, 7, 1, 22, 33, 42, 33213),
                                     datetime.datetime(2018, 7, 1, 22, 33, 44, 898234),
                                     datetime.datetime(2018, 7, 1, 22, 33, 47, 893731),
                                     datetime.datetime(2018, 7, 1, 22, 33, 50, 928946),
                                     datetime.datetime(2018, 7, 1, 22, 33, 53, 895617),
                                     datetime.datetime(2018, 7, 1, 22, 35, 7, 116182),
                                     datetime.datetime(2018, 7, 1, 22, 35, 10, 105035),
                                     datetime.datetime(2018, 7, 1, 22, 35, 13, 193428),
                                     datetime.datetime(2018, 7, 1, 22, 35, 16, 210135),
                                     datetime.datetime(2018, 7, 1, 22, 35, 19, 168881),
                                     datetime.datetime(2018, 7, 1, 22, 35, 22, 114653),
                                     datetime.datetime(2018, 7, 1, 22, 35, 25, 102365),
                                     datetime.datetime(2018, 7, 1, 22, 35, 43, 46950),
                                     datetime.datetime(2018, 7, 1, 22, 35, 46, 15435),
                                     datetime.datetime(2018, 7, 1, 22, 35, 49, 23333),
                                     datetime.datetime(2018, 7, 1, 22, 35, 52, 22164),
                                     datetime.datetime(2018, 7, 1, 22, 35, 55, 78615),
                                     datetime.datetime(2018, 7, 1, 22, 35, 58, 78573)]}}

user1935987 asked Jun 28 '18 22:06


1 Answer

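In short, I think the key thing you are missing is to use timedelta.total_seconds() rather than timedelta.seconds. The seconds attribute is only the whole-seconds component of the difference, so the microseconds of every interval are dropped before you add them up.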
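As a small sketch of the difference, using the first two timestamps from the sample data in the question (the variable names are only illustrative):

```python
import datetime

start = datetime.datetime(2018, 7, 1, 22, 33, 39, 86170)
end = datetime.datetime(2018, 7, 1, 22, 33, 42, 33213)
delta = end - start

# .seconds is an integer component of the timedelta: microseconds are dropped
whole = delta.seconds
# .total_seconds() is a float covering the entire duration
exact = delta.total_seconds()
print(whole, exact)

# For spans longer than a day, .seconds also ignores the days component
long_delta = datetime.timedelta(days=1, seconds=5)
print(long_delta.seconds, long_delta.total_seconds())
```

Each interval of roughly 3 seconds loses up to a second this way, which is why the summed value in the question comes out lower than the real one.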

This seems to work fine for me:

import datetime
from pprint import pprint

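def get_duration(timestamps):
    # Gaps of max_interruption seconds or more count as interruptions
    max_interruption = 30
    # Pair each timestamp with its successor
    starts = timestamps[:-1]
    ends = timestamps[1:]
    durations = zip(starts, ends)
    accumulated = 0
    for start, end in durations:
        # total_seconds() keeps the fractional part, unlike .seconds
        delta = (end - start).total_seconds()
        if delta < max_interruption:
            accumulated += delta
    return accumulated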

proc_activity = {
    'service_0': {
        'timestamps': [
            datetime.datetime(2018, 7, 1, 22, 33, 39, 86170),
            datetime.datetime(2018, 7, 1, 22, 33, 42, 33213),
            datetime.datetime(2018, 7, 1, 22, 33, 44, 898234),
            datetime.datetime(2018, 7, 1, 22, 33, 47, 893731),
            datetime.datetime(2018, 7, 1, 22, 33, 50, 928946),
            datetime.datetime(2018, 7, 1, 22, 33, 53, 895617),
            datetime.datetime(2018, 7, 1, 22, 35, 7, 116182),
            datetime.datetime(2018, 7, 1, 22, 35, 10, 105035),
            datetime.datetime(2018, 7, 1, 22, 35, 13, 193428),
            datetime.datetime(2018, 7, 1, 22, 35, 16, 210135),
            datetime.datetime(2018, 7, 1, 22, 35, 19, 168881),
            datetime.datetime(2018, 7, 1, 22, 35, 22, 114653),
            datetime.datetime(2018, 7, 1, 22, 35, 25, 102365),
            datetime.datetime(2018, 7, 1, 22, 35, 43, 46950),
            datetime.datetime(2018, 7, 1, 22, 35, 46, 15435),
            datetime.datetime(2018, 7, 1, 22, 35, 49, 23333),
            datetime.datetime(2018, 7, 1, 22, 35, 52, 22164),
            datetime.datetime(2018, 7, 1, 22, 35, 55, 78615),
            datetime.datetime(2018, 7, 1, 22, 35, 58, 78573)
        ],
    }
}

for k, v in proc_activity.items():
    proc_activity[k]['duration'] = get_duration(v['timestamps'])

pprint(proc_activity)

For the sample data, service_0 ends up with a duration of 65.77183800000002 seconds.
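One way to see where that number comes from, as a sketch: group the timestamps into uninterrupted runs and add up last-minus-first for each run. Here split_into_runs is a hypothetical helper, not part of the code above, and the list is a reduced subset of the sample timestamps (the run boundaries plus two interior points), which is enough to reproduce the total:

```python
import datetime

def split_into_runs(timestamps, max_interruption=30):
    """Group consecutive timestamps into runs with gaps below max_interruption seconds."""
    if not timestamps:
        return []
    runs = [[timestamps[0]]]
    for previous, current in zip(timestamps, timestamps[1:]):
        if (current - previous).total_seconds() < max_interruption:
            runs[-1].append(current)  # same run continues
        else:
            runs.append([current])    # long gap: start a new run
    return runs

# Reduced subset of the sample data (assumption: interior points omitted)
sample = [
    datetime.datetime(2018, 7, 1, 22, 33, 39, 86170),
    datetime.datetime(2018, 7, 1, 22, 33, 53, 895617),
    datetime.datetime(2018, 7, 1, 22, 35, 7, 116182),
    datetime.datetime(2018, 7, 1, 22, 35, 25, 102365),
    datetime.datetime(2018, 7, 1, 22, 35, 43, 46950),
    datetime.datetime(2018, 7, 1, 22, 35, 58, 78573),
]

runs = split_into_runs(sample)
run_lengths = [(run[-1] - run[0]).total_seconds() for run in runs]
total = sum(run_lengths)
print(run_lengths, total)
```

This gives two runs of about 14.81 and 50.96 seconds, separated by a gap of roughly 73 seconds, and together they account for the 65.77 seconds.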

AnilRedshift answered Sep 23 '22 23:09