Shorten large stack traces when using libraries

I often work with large libraries like pandas or matplotlib.

This means that exceptions often produce long stack traces.

Since the error is almost never in the library and almost always in my own code, I rarely need to see the library detail.

A couple of common examples:

Pandas

>>> import pandas as pd
>>> df = pd.DataFrame(dict(a=[1,2,3]))
>>> df['b'] # Hint: there _is_ no 'b'

Here I've attempted to access an unknown key. This simple error produces a stack trace containing 28 lines:

Traceback (most recent call last):
  File "an_arbitrary_python\lib\site-packages\pandas\core\indexes\base.py", line 2393, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5239)
  File "pandas\_libs\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5085)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1207, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20405)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1215, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20359)
KeyError: 'b'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "an_arbitrary_python\lib\site-packages\pandas\core\frame.py", line 2062, in __getitem__
    return self._getitem_column(key)
  File "an_arbitrary_python\lib\site-packages\pandas\core\frame.py", line 2069, in _getitem_column
    return self._get_item_cache(key)
  File "an_arbitrary_python\lib\site-packages\pandas\core\generic.py", line 1534, in _get_item_cache
    values = self._data.get(item)
  File "an_arbitrary_python\lib\site-packages\pandas\core\internals.py", line 3590, in get
    loc = self.items.get_loc(item)
  File "an_arbitrary_python\lib\site-packages\pandas\core\indexes\base.py", line 2395, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5239)
  File "pandas\_libs\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5085)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1207, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20405)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1215, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20359)
KeyError: 'b'

Knowing that I ended up in hashtable_class_helper.pxi is almost never helpful for me. I need to know where in my code I've messed up.

Matplotlib

>>> import matplotlib.pyplot as plt
>>> import matplotlib.cm as cm
>>> def foo():
...     plt.plot([1,2,3], cbap=cm.Blues) # cbap is a typo for cmap
...
>>> def bar():
...     foo()
...
>>> bar()

This time, there's a typo in my keyword argument. But I still have to see 25 lines of stack trace:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in bar
  File "<stdin>", line 2, in foo
  File "an_arbitrary_python\lib\site-packages\matplotlib\pyplot.py", line 3317, in plot
    ret = ax.plot(*args, **kwargs)
  File "an_arbitrary_python\lib\site-packages\matplotlib\__init__.py", line 1897, in inner
    return func(ax, *args, **kwargs)
  File "an_arbitrary_python\lib\site-packages\matplotlib\axes\_axes.py", line 1406, in plot
    for line in self._get_lines(*args, **kwargs):
  File "an_arbitrary_python\lib\site-packages\matplotlib\axes\_base.py", line 407, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "an_arbitrary_python\lib\site-packages\matplotlib\axes\_base.py", line 395, in _plot_args
    seg = func(x[:, j % ncx], y[:, j % ncy], kw, kwargs)
  File "an_arbitrary_python\lib\site-packages\matplotlib\axes\_base.py", line 302, in _makeline
    seg = mlines.Line2D(x, y, **kw)
  File "an_arbitrary_python\lib\site-packages\matplotlib\lines.py", line 431, in __init__
    self.update(kwargs)
  File "an_arbitrary_python\lib\site-packages\matplotlib\artist.py", line 885, in update
    for k, v in props.items()]
  File "an_arbitrary_python\lib\site-packages\matplotlib\artist.py", line 885, in <listcomp>
    for k, v in props.items()]
  File "an_arbitrary_python\lib\site-packages\matplotlib\artist.py", line 878, in _update_property
    raise AttributeError('Unknown property %s' % k)
AttributeError: Unknown property cbap

Here I get to find out that I ended on a line in artist.py that raises an AttributeError, and then see directly underneath that the AttributeError was indeed raised. That adds very little information.

In these trivial, interactive examples, you might just say "Look at the top of the stack trace, not the bottom", but my typo has often occurred inside a function, so the line I'm interested in is somewhere in the middle of these library-cluttered stack traces.

Is there any way to make these stack traces less verbose, so that I can more easily find the source of the problem, which almost always lies in my own code and not in the libraries I happen to be using?

Asked by LondonRob on Aug 23 '17 at 15:08

1 Answer

You can use the traceback module to get finer control over exception printing. For example:

import pandas as pd
import traceback

try:
    df = pd.DataFrame(dict(a=[1,2,3]))
    df['b']

except Exception:
    # chain=False (Python 3) skips the chained "During handling of..." traceback
    traceback.print_exc(limit=1, chain=False)
    exit(1)

This triggers the standard exception printing mechanism, but only shows you the first (outermost) frame of the stack trace, which is the one you care about in your example. For me this produces:

Traceback (most recent call last):
  File "t.py", line 6, in <module>
    df['b']
KeyError: 'b'
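
As a side note on how `limit` behaves: under Python 3.5 or later, a positive `limit` keeps the outermost frames and a negative one keeps the innermost. The sketch below tries to show that without pandas; `library_inner`, `library_outer` and `demo` are made-up names standing in for library code and your own code.

```python
import traceback

def library_inner():
    # Stands in for the deepest library frame, where the error is raised
    raise KeyError("b")

def library_outer():
    library_inner()

def demo():
    # Stands in for your own code, which calls into the "library"
    try:
        library_outer()
    except KeyError:
        # Positive limit: keep only the outermost frame (this function)
        outermost = traceback.format_exc(limit=1)
        # Negative limit (Python 3.5+): keep only the innermost frame
        innermost = traceback.format_exc(limit=-1)
    return outermost, innermost

outermost, innermost = demo()
print(outermost)
print(innermost)
```

The first string should mention only `demo`, the second only `library_inner`, which is why a small positive `limit` tends to point at your own code when the library frames sit at the bottom.
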

Obviously you lose the context, which will be important when debugging your own code. If we want to get fancy, we can devise a test that decides how far down the traceback should go. For example:

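def find_depth(tb, continue_test):
    """Count the leading traceback frames whose filename passes continue_test."""
    depth = 0

    while tb is not None:
        filename = tb.tb_frame.f_code.co_filename

        # Run the test we're given against the filename
        if not continue_test(filename):
            return depth

        tb = tb.tb_next
        depth += 1

    # Every frame passed the test, so show them all
    return depth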

I don't know how you're organising and running your code, but perhaps you can then do something like:

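import sys
import traceback

import pandas as pd

def find_depth(tb, continue_test):
    # Same function as above, repeated so this script runs on its own
    depth = 0
    while tb is not None:
        if not continue_test(tb.tb_frame.f_code.co_filename):
            return depth
        tb = tb.tb_next
        depth += 1
    return depth

try:
    df = pd.DataFrame(dict(a=[1, 2, 3]))
    df['b']

except Exception:
    depth = find_depth(
        sys.exc_info()[2],
        # The test for which frames we should include; adjust the prefix
        # to match how your own files appear in co_filename
        lambda filename: filename.startswith('my_module')
    )
    # chain=False (Python 3) skips the chained "During handling of..." traceback
    traceback.print_exc(limit=depth, chain=False)
    sys.exit(1)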
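
If wrapping every call in try/except is awkward, one possible variation is to filter frames in a custom `sys.excepthook` instead. This is only a sketch: `format_user_frames`, `short_excepthook` and the `LIBRARY_MARKERS` heuristic are illustrative names, and treating any path that contains `site-packages` or `dist-packages` as library code is an assumption. Frames reported with relative paths, such as the `pandas\_libs\index.pyx` entries in the question, would not match that heuristic, so a test that keeps only your own paths may fit better.

```python
import sys
import traceback

# Assumption: library code lives under one of these directory names
LIBRARY_MARKERS = ("site-packages", "dist-packages")

def is_library_frame(filename):
    return any(marker in filename for marker in LIBRARY_MARKERS)

def format_user_frames(exc_type, exc_value, tb, is_library=is_library_frame):
    """Format a traceback, leaving out frames that is_library() flags."""
    frames = traceback.extract_tb(tb)
    kept = [frame for frame in frames if not is_library(frame.filename)]
    hidden = len(frames) - len(kept)

    lines = ["Traceback (most recent call last):\n"]
    lines += traceback.format_list(kept)
    if hidden:
        lines.append("  [%d library frame(s) hidden]\n" % hidden)
    lines += traceback.format_exception_only(exc_type, exc_value)
    return "".join(lines)

def short_excepthook(exc_type, exc_value, tb):
    # Same signature as sys.excepthook; prints the filtered traceback
    sys.stderr.write(format_user_frames(exc_type, exc_value, tb))

# To apply it to every uncaught exception in a script:
# sys.excepthook = short_excepthook
```

Unlike a fixed `limit`, this keeps every frame from your own code wherever it sits in the stack, and the "hidden" count is a reminder that detail was dropped. Chained exceptions are not handled here.
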
Answered by Jon Betts on Nov 08 '22 at 22:11