
Log analysis: finding lines by time difference

I have a long log file generated with log4j, with 10 threads writing to it. I am looking for a log analyzer tool that can find lines where the user waited for a long time (i.e. where the difference between log entries for the same thread is more than a minute).

P.S. I am trying to use OtrosLogViewer, but it only filters by certain values (for example, by thread ID) and does not compare lines.

P.P.S. The new version of OtrosLogViewer has a "Delta" column that calculates the difference between adjacent log lines (in ms).

Thank you.

lili asked Aug 30 '12 14:08



1 Answer

This simple Python script may be enough. For testing, I analyzed my local Apache log, which incidentally uses the Common Log Format, so you may even be able to reuse the script as-is. It computes the difference between two subsequent requests and prints the request line whenever the delta exceeds a certain threshold (1 second in my test). You may want to wrap the code in a function that also accepts the thread ID as a parameter, so you can filter further.

#!/usr/bin/env python3
import re
from datetime import datetime

THRESHOLD = 1  # seconds

last = None
with open("/var/log/apache2/access.log") as log:
    for line in log:
        # To restrict the check to one thread, you may insert something like:
        # if not re.match(THREAD_ID, line):
        #     continue
        # The [:-6] strips the UTC offset (e.g. " +0200"), so no %z is needed
        current = datetime.strptime(
            re.search(r"\[([^]]+)]", line).group(1)[:-6],
            "%d/%b/%Y:%H:%M:%S")
        # total_seconds() covers the whole gap; .seconds ignores full days
        if last is not None and (current - last).total_seconds() > THRESHOLD:
            print(re.search('"([^"]+)"', line).group(1))
        last = current
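
For the log4j case in the question, the same idea can be applied per thread by remembering the last timestamp seen for each thread ID. The following is only a minimal sketch: it assumes a log4j pattern along the lines of `%d{ISO8601} [%t] %-5p %c - %m%n` (timestamp first, then the thread name in square brackets), which the question does not specify, and the names `LINE_RE` and `find_gaps` are made up for illustration. Adjust the regular expression to whatever your appender actually writes.

```python
import re
from datetime import datetime, timedelta

# ASSUMPTION: lines look like
#   2012-08-30 14:08:01,123 [worker-3] INFO  com.example.Foo - message
# i.e. an ISO8601-style log4j date followed by the thread name in brackets.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[([^\]]+)\]")


def find_gaps(lines, threshold=timedelta(minutes=1)):
    """Yield (thread, gap, line) whenever a thread's consecutive
    entries are further apart than `threshold`."""
    last_seen = {}  # thread name -> timestamp of its previous entry
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # e.g. stack traces or wrapped messages
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        thread = match.group(2)
        previous = last_seen.get(thread)
        if previous is not None and stamp - previous > threshold:
            yield thread, stamp - previous, line.rstrip("\n")
        last_seen[thread] = stamp


# Hypothetical sample data, only to show the call.
sample = [
    "2012-08-30 14:08:01,000 [worker-1] INFO  app - start",
    "2012-08-30 14:08:02,000 [worker-2] INFO  app - start",
    "2012-08-30 14:08:30,000 [worker-2] INFO  app - step",
    "2012-08-30 14:09:45,500 [worker-1] INFO  app - done",
]
for thread, gap, line in find_gaps(sample):
    print(thread, gap.total_seconds(), line)
```

Because `find_gaps` accepts any iterable of lines, you can pass an open file object directly and the log is read one line at a time, which should be fine for a long file. Keeping one timestamp per thread in a dictionary avoids a separate pass over the file for each of the 10 threads.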
Raffaele answered Oct 23 '22 11:10
