
Start python debugger in oldest stack frame after an exception occurs

I use the --pdb flag with ipython, so when I'm debugging code and an error occurs, it shows a stack trace and drops me into the debugger. A lot of these errors come from calling numpy or pandas functions with bad inputs. The debugger starts at the newest frame, in code from these libraries. Only after 5-10 repetitions of the up command can I actually see what I did wrong, which is immediately obvious 90% of the time (e.g., calling with a list instead of an array).

Is there any way to specify which stack frame the debugger initially starts in? Either the oldest stack frame, or the newest stack frame in the python file initially run, or similar. This would be much more productive for debugging.

Here's a simple example:

import pandas as pd

def test(df):
    df[:,0] = 4  # (B) bad indexing on a DataFrame, raises TypeError
    return df

df = test(pd.DataFrame(range(3)))  # (A)

Resulting traceback, with (A), (B), (C) added for clarity:

In [6]: ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-66730543fac0> in <module>()
----> 1 import codecs, os;__pyfile = codecs.open('''/tmp/py29142W1d''', encoding='''utf-8''');__code = __pyfile.read().encode('''utf-8''');__pyfile.close();os.remove('''/tmp/py29142W1d''');exec(compile(__code, '''/test/stack_frames.py''', 'exec'));

/test/stack_frames.py in <module>()
      6 
      7 if __name__ == '__main__':
(A)----> 8     df = test(pd.DataFrame(range(3)))

/test/stack_frames.py in test(df)
      2 
      3 def test(df):
(B)----> 4     df[:,0] = 4
      5     return df
      6 

/usr/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in __setitem__(self, key, value)
   2355         else:
   2356             # set column
-> 2357             self._set_item(key, value)
   2358 
   2359     def _setitem_slice(self, key, value):

/usr/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in _set_item(self, key, value)
   2421 
   2422         self._ensure_valid_index(value)
-> 2423         value = self._sanitize_column(key, value)
   2424         NDFrame._set_item(self, key, value)
   2425 

/usr/local/lib/python2.7/dist-packages/pandas/core/frame.pyc in _sanitize_column(self, key, value)
   2602 
   2603         # broadcast across multiple columns if necessary
-> 2604         if key in self.columns and value.ndim == 1:
   2605             if (not self.columns.is_unique or
   2606                     isinstance(self.columns, MultiIndex)):

/usr/local/lib/python2.7/dist-packages/pandas/indexes/base.pyc in __contains__(self, key)
   1232 
   1233     def __contains__(self, key):
-> 1234         hash(key)
   1235         # work around some kind of odd cython bug
   1236         try:

TypeError: unhashable type
> /usr/local/lib/python2.7/dist-packages/pandas/indexes/base.py(1234)__contains__()
   1232 
   1233     def __contains__(self, key):
(C)-> 1234         hash(key)
   1235         # work around some kind of odd cython bug
   1236         try:

ipdb> 

Ideally, I would like the debugger to start in the second oldest frame at (B), or even at (A), but definitely not at (C), where it goes by default.

user2699 asked Nov 04 '16 18:11


1 Answer

Long answer to document the process for myself. Semi-working solution at the bottom:

Failed attempt here:

import sys
import pdb
import pandas as pd

def test(df):  # (A)
    df[:,0] = 4 #Bad indexing on dataframe, will cause error
    return df

mypdb = pdb.Pdb(skip=['pandas.*'])
mypdb.reset()

df = test(pd.DataFrame(range(3))) # (B) # fails.

mypdb.interaction(None, sys.last_traceback)  # doesn't work.

Pdb skip documentation:

The skip argument, if given, must be an iterable of glob-style module name patterns. The debugger will not step into frames that originate in a module that matches one of these patterns.
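
A small sketch of how those glob patterns are matched may help here. It uses bdb.Bdb directly (the base class that Pdb hands skip to, as the source below shows) and only checks module names as strings, so pandas does not need to be installed. The variable names are made up for illustration.

```python
import bdb

# Bdb stores the skip patterns; Pdb passes its skip argument straight through.
debugger = bdb.Bdb(skip=['pandas.*'])

# A submodule name matches the glob 'pandas.*' ...
in_subpackage = debugger.is_skipped_module('pandas.core.frame')

# ... but the bare top-level name does not, because the pattern requires a dot.
top_level = debugger.is_skipped_module('pandas')

# User code is never matched by this pattern.
user_code = debugger.is_skipped_module('__main__')

print(in_subpackage, top_level, user_code)
```

If frames from the top-level package itself should also be skipped, passing both 'pandas' and 'pandas.*' would cover that case.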

Pdb source code:

class Pdb(bdb.Bdb, cmd.Cmd):

    _previous_sigint_handler = None

    def __init__(self, completekey='tab', stdin=None, stdout=None, skip=None,
                 nosigint=False, readrc=True):
        bdb.Bdb.__init__(self, skip=skip)
        [...]

# Post-Mortem interface

def post_mortem(t=None):
    # handling the default
    if t is None:
        # sys.exc_info() returns (type, value, traceback) if an exception is
        # being handled, otherwise it returns None
        t = sys.exc_info()[2]
    if t is None:
        raise ValueError("A valid traceback must be passed if no "
                         "exception is being handled")

    p = Pdb()
    p.reset()
    p.interaction(None, t)

def pm():
    post_mortem(sys.last_traceback)

Bdb source code:

class Bdb:
    """Generic Python debugger base class.
    This class takes care of details of the trace facility;
    a derived class should implement user interaction.
    The standard debugger class (pdb.Pdb) is an example.
    """

    def __init__(self, skip=None):
        self.skip = set(skip) if skip else None
    [...]
    def is_skipped_module(self, module_name):
        for pattern in self.skip:
            if fnmatch.fnmatch(module_name, pattern):
                return True
        return False

    def stop_here(self, frame):
        # (CT) stopframe may now also be None, see dispatch_call.
        # (CT) the former test for None is therefore removed from here.
        if self.skip and \
               self.is_skipped_module(frame.f_globals.get('__name__')):
            return False
        if frame is self.stopframe:
            if self.stoplineno == -1:
                return False
            return frame.f_lineno >= self.stoplineno
        if not self.stopframe:
            return True
        return False
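
To see which frame a post-mortem session would select, one can call get_stack by hand, which is what the setup method further down does. This is only a sketch: it uses the stdlib json module as a stand-in for pandas so it runs anywhere, and the variable names are hypothetical.

```python
import bdb
import json
import sys

# Skip patterns covering the json package, our stand-in for pandas.
debugger = bdb.Bdb(skip=['json', 'json.*'])

try:
    json.loads('{')  # malformed input, so the library code raises
except ValueError:
    tb = sys.exc_info()[2]
    # With no live frame (f=None), the stack is built from the traceback
    # alone and the returned index points at its newest entry.
    stack, index = debugger.get_stack(None, tb)
    selected_module = stack[index][0].f_globals['__name__']
    is_newest = index == len(stack) - 1

# The selected frame is the newest one, and it lives in a skipped module.
print(is_newest, debugger.is_skipped_module(selected_module))
```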

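It is clear that the skip list is not used for post-mortems: it is only consulted in stop_here, which decides where to stop while stepping, whereas post_mortem builds a plain Pdb() and calls interaction directly on the traceback. To fix this I created a custom class which overrides the setup method.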

import pdb
import sys  # needed by pm() for sys.last_traceback

class SkipPdb(pdb.Pdb):
    def setup(self, f, tb):
        # This is unchanged
        self.forget()
        self.stack, self.curindex = self.get_stack(f, tb)
        while tb:
            # when setting up post-mortem debugging with a traceback, save all
            # the original line numbers to be displayed along the current line
            # numbers (which can be different, e.g. due to finally clauses)
            lineno = pdb.lasti2lineno(tb.tb_frame.f_code, tb.tb_lasti)
            self.tb_lineno[tb.tb_frame] = lineno
            tb = tb.tb_next

        self.curframe = self.stack[self.curindex][0]
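        # This loop is new: while the current frame belongs to a skipped
        # module, drop it and move to the next older frame. The curindex
        # guard keeps the oldest frame so the stack is never emptied.
        while (self.skip and self.curindex > 0 and
               self.is_skipped_module(self.curframe.f_globals.get('__name__'))):
            self.curindex -= 1
            self.stack.pop()
            self.curframe = self.stack[self.curindex][0]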
        # The rest is unchanged.
        # The f_locals dictionary is updated from the actual frame
        # locals whenever the .f_locals accessor is called, so we
        # cache it here to ensure that modifications are not overwritten.
        self.curframe_locals = self.curframe.f_locals
        return self.execRcLines()

    def pm(self):
        self.reset()
        self.interaction(None, sys.last_traceback)

If you use this as:

x = 42
df = test(pd.DataFrame(range(3))) # (B) # fails.
# Then do:
mypdb = SkipPdb(skip=['pandas.*'])
mypdb.pm()
> <ipython-input-36-e420cf1b80b2>(2)<module>()
-> df = test(pd.DataFrame(range(3))) # (B) # fails.
(Pdb) l
  1    x = 42
  2  ->    df = test(pd.DataFrame(range(3))) # (B) # fails.
[EOF]

You are dropped into the right frame. Now you just need to figure out how IPython calls its pdb pm/post_mortem function and create a similar script. That appears to be hard, so I pretty much give up here.

Also, this is NOT a great implementation. It assumes that the frames you want to skip are at the top (the newest end) of your stack, and will produce weird results otherwise. E.g., an error in the function passed to df.apply will produce something super weird.
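
One way around that limitation might be to scan the whole traceback and pick the newest frame whose module is not skipped, instead of popping frames from the end. The following is a sketch under assumptions, not a tested IPython integration: newest_unskipped_frame and parse are hypothetical names, and json again stands in for pandas.

```python
import fnmatch
import json
import sys


def newest_unskipped_frame(tb, skip_patterns):
    """Return the newest frame in traceback `tb` whose module name matches
    none of `skip_patterns`, or None if every frame is skipped."""
    chosen = None
    while tb is not None:
        module_name = tb.tb_frame.f_globals.get('__name__') or ''
        if not any(fnmatch.fnmatch(module_name, p) for p in skip_patterns):
            chosen = tb.tb_frame  # newer matches overwrite older ones
        tb = tb.tb_next
    return chosen


def parse(text):
    # User-level function whose bad input makes library code raise.
    return json.loads(text)


try:
    parse('{')
except ValueError:
    frame = newest_unskipped_frame(sys.exc_info()[2], ['json', 'json.*'])
    found_name = frame.f_code.co_name

print(found_name)
```

Inside a setup override, the returned frame could be used to set curindex to that frame's position in self.stack without popping anything, so the up and down commands would still reach the library frames.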

TL;DR: Not supported by the stdlib. You can create your own debugger class, but it's nontrivial to get that working with IPython's debugger.

Aske Doerge answered Nov 09 '22 10:11