
Python - Calculate time difference between rows regarding their labels from a txt file

I am reading data from a txt file that contains timestamps. I need to process that data and write the result to a different txt file, so I need to sort the data.

For example, the total time for XXXXXX is the difference between 2020-08-28T11:46:24.8419656Z and 2020-08-28T11:48:11.8418281Z. To calculate the "Execution" time, I need to subtract 2020-08-28T11:46:39.9417366Z from 2020-08-28T11:48:11.8418281Z. These are just examples of the time differences I need. If there is an error, I need to print 1 under "Test Status". There is an error in YYYYYY, so any times that do not exist for it should be set to 0. The values in the output below are only placeholders to show the format.

How can I calculate the time difference when there is a T in the middle of the timestamp? Another challenge is that I need to calculate the difference between two rows based on the label in their second column. To find the name of each block (e.g. XXXXXX), I need to look for the "#########" line that precedes it, and then I can sort it; otherwise I don't know which name is coming next in the txt file.

from datetime import datetime

def time_diff(start, end):
    start_dt = datetime.strptime(start, '%H:%M:%S')
    end_dt = datetime.strptime(end, '%H:%M:%S')
    diff = (end_dt - start_dt)
    return diff.seconds

scores = {}
with open('input.txt') as fin:
    for line in fin.readlines():
        values = line.split(',')
        scores[values[0]] = time_diff(values[0],values[0])

with open('result.txt', 'w') as fout:
    for key, value in sorted(scores.iteritems(), key=lambda (k,v): (v,k)):
        fout.write('%s,%s\n' % (key, value))

INPUT:

2020-08-28T11:46:24.8419656Z ################################################################################
2020-08-28T11:46:24.8419656Z XXXXXX
2020-08-28T11:46:39.9397372Z Execution 0
2020-08-28T11:46:39.9417366Z Creation 0
2020-08-28T11:46:41.4877509Z Build 0
2020-08-28T11:48:02.6957708Z Level 0 
2020-08-28T11:48:02.7227683Z Converting file start
2020-08-28T11:48:11.7408315Z Converting done 0
2020-08-28T11:48:11.8148285Z Checking results
2020-08-28T11:48:11.8418281Z Test Status XXXXXX: Success
2020-08-28T11:48:11.8498273Z ################################################################################
2020-08-28T11:48:11.8498273Z YYYYYY
2020-08-28T11:48:27.1533026Z Execution 0
2020-08-28T11:48:27.1583035Z Creation 0
2020-08-28T11:48:28.6763028Z Build 0
2020-08-28T11:49:31.9180832Z Level 0 
2020-08-28T11:49:31.9440848Z ##[error]
2020-08-28T11:49:31.9530839Z ################################################################################
2020-08-28T11:50:24.8419656Z ZZZZZZ
2020-08-28T11:50:39.9397372Z Execution 0
2020-08-28T11:50:39.9417366Z Creation 0
2020-08-28T11:50:41.4877509Z Build 0
2020-08-28T11:51:02.6957708Z Level 0 
2020-08-28T11:51:02.7227683Z Converting file start
2020-08-28T11:51:11.7408315Z Converting done 0
2020-08-28T11:51:11.8148285Z Checking results
2020-08-28T11:51:11.8418281Z Test Status ZZZZZZ: Success
2020-08-28T11:51:31.9530839Z ################################################################################



OUTPUT:

Name       Total    Execution Creation Build Level Converting  Checking results   Test Status      
XXXXXX      10          2        2       2     2        2          2       2          0
YYYYYY      10          2        2       2     2        0          0       0          1
ZZZZZZ      10          2        2       2     2        2          2       2          0
asked Feb 13 '26 14:02 by nobody


1 Answer

import re
from dateutil import parser
import pandas as pd

with open('input.txt') as file:
    data = file.read()


def seconds_between(earlier, later):
    # Each argument is a full log line; its timestamp is the first space-separated token.
    delta = parser.isoparse(later.split(' ')[0]) - parser.isoparse(earlier.split(' ')[0])
    return delta.seconds  # whole seconds, as in the expected output


# Timestamps of the '####' separator lines; two consecutive ones delimit one test block.
timestamps = re.findall(r'(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}.+Z)\s#{3,}', data)
rows = []
for i in range(len(timestamps) - 1):
    # Slice the raw text from this separator up to (not including) the next one.
    block = data[data.index(timestamps[i]):data.index(timestamps[i + 1])]
    lines = block.split('\n')
    row = {}
    row['name'] = lines[1].split(' ')[1]
    # Total is measured from this separator line to the next separator line.
    row['total'] = (parser.isoparse(timestamps[i + 1]) - parser.isoparse(timestamps[i])).seconds
    row['execution'] = seconds_between(lines[2], lines[3])
    row['creation'] = seconds_between(lines[3], lines[4])
    row['build'] = seconds_between(lines[4], lines[5])
    row['level'] = seconds_between(lines[5], lines[6])
    # Stages that never ran (error case) stay at 0.
    row['converting'] = 0
    row['checking'] = 0
    # lines[-1] is the empty string after the trailing newline, so lines[-2] is the last log line.
    if "error" in lines[-2]:
        row['test_status'] = 1
    elif "Success" in lines[-2]:
        row['test_status'] = 0
        row['converting'] = seconds_between(lines[6], lines[7])
        row['checking'] = seconds_between(lines[7], lines[8])
    rows.append(row)


df = pd.DataFrame(rows)
df.to_csv('output.csv')

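The regex collects the timestamp of every "####" separator line, and slicing data between two consecutive separator timestamps gives you the lines of one test block. The T in the middle of the timestamp is not a problem: it is the standard ISO 8601 date/time separator, which parser.isoparse handles. Let me know if there's any issue.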
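
If you would rather not depend on fixed line positions or on dateutil, here is a minimal sketch using only the standard library. It is a variant under stated assumptions, not the code above: the names parse_ts, split_blocks, summarise and COLUMNS are made up for illustration, each stage is assumed to last from its own line until the next line in the block, the two "Converting" lines are merged into one column, and the total runs from the name line to the last line of the block.

```python
from datetime import datetime, timezone

# Hypothetical mapping from the start of a log label to an output column.
COLUMNS = {
    'Execution': 'execution',
    'Creation': 'creation',
    'Build': 'build',
    'Level': 'level',
    'Converting': 'converting',
    'Checking': 'checking',
}


def parse_ts(stamp):
    """Parse a stamp like '2020-08-28T11:46:24.8419656Z' into an aware UTC datetime."""
    # %f accepts at most 6 fractional digits, so pad or trim the fraction to 6.
    head, _, frac = stamp.rstrip('Z').partition('.')
    frac = (frac + '000000')[:6]
    parsed = datetime.strptime(f'{head}.{frac}', '%Y-%m-%dT%H:%M:%S.%f')
    return parsed.replace(tzinfo=timezone.utc)


def split_blocks(lines):
    """Yield one list of (datetime, text) pairs per '###'-delimited block."""
    block = None
    for line in lines:
        stamp, _, text = line.strip().partition(' ')
        if not stamp:
            continue  # skip blank lines
        if text.startswith('###'):
            # A separator closes the current block (if any) and opens a new one.
            if block:
                yield block
            block = []
        elif block is not None:
            block.append((parse_ts(stamp), text.strip()))
    if block:
        yield block  # file ended without a closing separator


def summarise(block):
    """Turn one block into a dict of durations in seconds; missing stages stay 0."""
    row = {'name': block[0][1]}
    row['total'] = (block[-1][0] - block[0][0]).total_seconds()
    row.update(dict.fromkeys(COLUMNS.values(), 0.0))
    entries = block[1:]  # everything after the name line
    for (start, text), (end, _) in zip(entries, entries[1:]):
        for prefix, column in COLUMNS.items():
            if text.startswith(prefix):
                row[column] += (end - start).total_seconds()
    # 1 if any line in the block carries the '##[error]' marker, else 0.
    row['test_status'] = 1 if any('[error]' in text for _, text in block) else 0
    return row
```

Calling rows = [summarise(b) for b in split_blocks(open('input.txt'))] gives one dict per block, which you can pass to pd.DataFrame as above or write out with csv.DictWriter. Because the lookup goes by label rather than by line index, a block that stops early at an error simply leaves the later columns at 0.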

answered Feb 16 '26 03:02 by r0ot293


