How can I use awk or grep to capture an entire Python traceback in a log file?

Question

A log file contains a number of Python tracebacks. I only care about tracebacks raised because of a KevinCustomError. There may be more than one of this class of error in the file.

How can I use grep, another popular unix command, or a combination thereof to dump the entire traceback for my specific error?

Here's an example log file. I would like lines 1-3 from this file. In the real log file the tracebacks are much longer.

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KevinCustomError: integer division or modulo by zero
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ZeroDivisionError: integer division or modulo by zero

voithos · Accepted Answer

Here's an AWK script I tried whipping together.

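awk '{a[NR]=$0}; /KevinCustomError/ {for(i=0; a[NR-i] !~ /Traceback/; i++) {} i++; while(i-- > 0) {print a[NR-i]}}' logfile

Or, in file form:

# Remember every line read so far, indexed by line number.
{a[NR] = $0};

{
    if ($0 ~ /KevinCustomError/)
    {
        # Step back until a[NR-i] is the nearest preceding "Traceback" line.
        for (i = 0; a[NR-i] !~ /Traceback/; i++)
        {}
        i++
        # Print from the "Traceback" line down to the current line.
        # The post-decrement makes a[NR], the error line, the last one printed.
        while (i-- > 0)
        {
            print a[NR-i];
        }
    }
}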

Used like: awk -f logscript.awk logfile.

Not too familiar with AWK, so any criticism is welcome. Basically, it keeps track of all lines read so far. When it sees the error, it searches backwards for a "Traceback" token (which you can replace if you'd like) and then prints everything from that line through the error line, in the correct order.
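Because the script stores every line in a[], memory use grows with the size of the log. As a rough sketch of an alternative, the shell function below buffers only the traceback currently being read. The name extract_tracebacks is hypothetical and not part of the answer above. It assumes that each traceback starts with a line beginning with Traceback, that frame lines are indented, and that the first unindented line after them is the exception line; chained tracebacks are not handled.

```shell
# Hypothetical helper: print every traceback on stdin whose exception line
# contains the name given as $1.
extract_tracebacks() {
    awk -v err="$1" '
        # A new traceback begins: start a fresh buffer.
        /^Traceback/ { buf = $0; inside = 1; next }
        # Indented lines are stack frames: keep collecting them.
        inside && /^[ \t]/ { buf = buf "\n" $0; next }
        # The first unindented line is the exception line: print on a match.
        inside {
            if (index($0, err)) print buf "\n" $0
            inside = 0
        }
    '
}

# Sample input copied from the question.
sample_log='Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KevinCustomError: integer division or modulo by zero
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ZeroDivisionError: integer division or modulo by zero'

printf '%s\n' "$sample_log" | extract_tracebacks KevinCustomError
```

On a real file this would be called as extract_tracebacks KevinCustomError < logfile.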

Enlico · Answer

If I have correctly understood the structure of the Python log file, the following is a tiny sed solution, which is less cryptic than it seems:

#!/usr/bin/sed -f
/^Traceback/{
:here
N
/\nKevinCustomError/b
s/.*\n\(Traceback\)/\1/
b here
}

In summary, the script acts only on lines starting with Traceback. On such a line it keeps appending the next line (N) and checking whether the newly added line starts with KevinCustomError. If it does, the script branches to the end and prints the multiline pattern space. If not, it removes everything before the last Traceback-starting line from the pattern space (which only matters when a new traceback has begun), branches back to :here, appends another line (N), and so on.

In detail, it works as follows:

1. #!/usr/bin/sed -f: this is the shebang line, which tells the operating system to use /usr/bin/sed as the interpreter and to pass the script file to it through the -f option (this allows executing ./script file instead of sed -f script file);
2. /^Traceback/ matches only the lines that start with Traceback;
3. {…} groups the commands that are executed only for the lines matched by /^Traceback/:
  1. :here is not a command, but just a label marking the place we can jump back to with a test (t) or branch (b) command;
  2. N reads the next line of input and appends it to the current pattern space with a newline \n in between, which makes the pattern space multiline;
  3. /\nKevinCustomError/b branches to the end of the script if the pattern space contains KevinCustomError preceded by a \n;
    - This results in printing the multiline pattern space, which starts with Traceback, contains (at least) one \n, and contains KevinCustomError just after the (last) \n;
  4. s/.*\n\(Traceback\)/\1/ (reached only if the pattern /\nKevinCustomError/ did not match) deletes everything before the last line that starts with Traceback, provided such a line follows a \n in the pattern space; if the newly appended line does not start with Traceback, nothing changes;
  5. b here branches to :here, and no printing occurs at this point.
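One caveat with the script as written: sed prints the pattern space automatically at the end of each cycle, so any line outside a traceback is echoed as well, and GNU sed by default also prints a pending pattern space when N reaches the end of the input. The sketch below is a variant under the assumption that only the KevinCustomError tracebacks should appear in the output: it runs sed with -n and prints explicitly with p. The function name kevin_tracebacks is made up for illustration.

```shell
# Hypothetical wrapper around the same loop, with automatic printing disabled.
kevin_tracebacks() {
    sed -n '
/^Traceback/{
:here
N
# Error line found: print the collected traceback and start the next cycle.
/\nKevinCustomError/{
p
d
}
# A newer traceback started: drop everything before it.
s/.*\n\(Traceback\)/\1/
b here
}'
}

# Sample input copied from the question.
sample_log='Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KevinCustomError: integer division or modulo by zero
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ZeroDivisionError: integer division or modulo by zero'

printf '%s\n' "$sample_log" | kevin_tracebacks
```

On a real file this would be called as kevin_tracebacks < logfile.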

Tags: python, grep, logging, awk

Kevin Burke
