
How is exception handling implemented in Python?

This question asked for explanations of how exception handling is implemented under the hood in various languages, but it did not receive any responses for Python.

I'm especially interested in Python because Python somehow "encourages" exception throwing and catching via the EAFP principle.

I've learned from other SO answers that a try/except block is cheaper than an if/else statement if the exception is expected to be raised rarely, and that it's the call depth that matters because filling the stack trace is expensive. This is probably true in principle for all programming languages.

What's special about Python, though, is the high priority of the EAFP principle. How, therefore, are Python exceptions implemented internally in the reference implementation (CPython)?

Suzana asked Mar 08 '21 20:03


1 Answer

try ... except has some nice documentation in the compiler (Python/compile.c):

/*
   Code generated for "try: S except E1 as V1: S1 except E2 as V2: S2 ...":
   (The contents of the value stack is shown in [], with the top
   at the right; 'tb' is trace-back info, 'val' the exception's
   associated value, and 'exc' the exception.)
   Value stack          Label   Instruction     Argument
   []                           SETUP_FINALLY   L1
   []                           <code for S>
   []                           POP_BLOCK
   []                           JUMP_FORWARD    L0
   [tb, val, exc]       L1:     DUP                             )
   [tb, val, exc, exc]          <evaluate E1>                   )
   [tb, val, exc, exc, E1]      JUMP_IF_NOT_EXC_MATCH L2        ) only if E1
   [tb, val, exc]               POP
   [tb, val]                    <assign to V1>  (or POP if no V1)
   [tb]                         POP
   []                           <code for S1>
                                JUMP_FORWARD    L0
   [tb, val, exc]       L2:     DUP
   .............................etc.......................
   [tb, val, exc]       Ln+1:   RERAISE     # re-raise exception
   []                   L0:     <next statement>
   Of course, parts are not generated if Vi or Ei is not present.
*/
static int
compiler_try_except(struct compiler *c, stmt_ty s)
{

We have:

  • a SETUP_FINALLY instruction, which registers L1 as the location to jump to when an exception occurs (it pushes a block onto the frame's block stack, so the enclosing handler is restored when our block is done).
  • the code for S, that is, the code inside the try: block.
  • a POP_BLOCK instruction, which pops that block again (only reached in the OK case; if there's an exception, the VM's unwinding code pops it instead).
  • a JUMP_FORWARD to L0, which is the location of the next instruction (outside the try ... except blocks).

And that's all the bytecode we will run in the OK case. Note that the bytecode doesn't need to actively check for exceptions. Instead, the virtual machine will just automatically jump to L1 in the case of an exception. This is done in ceval.c: any instruction that fails, RAISE_VARARGS included, ends up at the exception_unwind label, which looks up the handler.

So what happens at L1? Simply put, we check each except clause in order: does it match the currently raised exception? If it does, we run the code in that except block and jump to L0 (the first instruction outside the try ... except blocks). If not, we check the next except clause, or re-raise the exception if no clause matched.
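The clause-by-clause matching at L1 can be modelled in plain Python. This is only a sketch of the control flow, not how CPython implements it; the names dispatch and clauses are made up for illustration.

```python
def dispatch(exc, clauses):
    """Model of the code at L1: try each except clause in order.

    `clauses` is a list of (exception_type, handler) pairs. Both names are
    hypothetical and exist only for this sketch.
    """
    for exc_type, handler in clauses:
        # Corresponds to JUMP_IF_NOT_EXC_MATCH: does the raised exception
        # match this clause's type (subclasses included)?
        if isinstance(exc, exc_type):
            return handler(exc)      # run S_i, then continue at L0
    raise exc                        # Ln+1: nothing matched, so re-raise


clauses = [
    (TypeError, lambda e: "type error handled"),
    (LookupError, lambda e: "lookup error handled"),
]

# KeyError is a subclass of LookupError, so the second clause matches.
print(dispatch(KeyError("k"), clauses))
```

Because the clauses are tested in order, a broad clause placed first would shadow every narrower clause after it.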

But let's be more concrete about it. The dis module lets us dump bytecode. So let's create two tiny Python files.

One that checks:

tmp$ cat if.py
if type(x) is int:
    x += 1
else:
    print('uh-oh')

...and one that catches:

tmp$ cat try.py
try:
    x += 1
except TypeError as e:
    print('uh-oh')

Now, let's dump their bytecode:

tmp$ python3 -m dis if.py
  1           0 LOAD_NAME                0 (type)
              2 LOAD_NAME                1 (x)
              4 CALL_FUNCTION            1
              6 LOAD_NAME                2 (int)
              8 COMPARE_OP               8 (is)
             10 POP_JUMP_IF_FALSE       22

  2          12 LOAD_NAME                1 (x)
             14 LOAD_CONST               0 (1)
             16 INPLACE_ADD
             18 STORE_NAME               1 (x)
             20 JUMP_FORWARD             8 (to 30)

  4     >>   22 LOAD_NAME                3 (print)
             24 LOAD_CONST               1 ('uh-oh')
             26 CALL_FUNCTION            1
             28 POP_TOP
        >>   30 LOAD_CONST               2 (None)
             32 RETURN_VALUE

For the successful case, this will run 13 instructions (from 0-20 inclusive, then 30 and 32).

tmp$ python3 -m dis try.py 
  1           0 SETUP_EXCEPT            12 (to 14)

  2           2 LOAD_NAME                0 (x)
              4 LOAD_CONST               0 (1)
              6 INPLACE_ADD
              8 STORE_NAME               0 (x)
             10 POP_BLOCK
             12 JUMP_FORWARD            42 (to 56)

  3     >>   14 DUP_TOP
             16 LOAD_NAME                1 (TypeError)
             18 COMPARE_OP              10 (exception match)
             20 POP_JUMP_IF_FALSE       54
             22 POP_TOP
             24 STORE_NAME               2 (e)
             26 POP_TOP
             28 SETUP_FINALLY           14 (to 44)

  4          30 LOAD_NAME                3 (print)
             32 LOAD_CONST               1 ('uh-oh')
             34 CALL_FUNCTION            1
             36 POP_TOP
             38 POP_BLOCK
             40 POP_EXCEPT
             42 LOAD_CONST               2 (None)
        >>   44 LOAD_CONST               2 (None)
             46 STORE_NAME               2 (e)
             48 DELETE_NAME              2 (e)
             50 END_FINALLY
             52 JUMP_FORWARD             2 (to 56)
        >>   54 END_FINALLY
        >>   56 LOAD_CONST               2 (None)
             58 RETURN_VALUE

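For the successful case, this will run 9 instructions (0-12 inclusive, then 56 and 58). Note that this listing comes from an older CPython that still had a separate SETUP_EXCEPT opcode; in the newer version the compiler comment above is taken from, SETUP_FINALLY covers both cases.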
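To reproduce listings like these without creating files, the same two snippets can be compiled from strings and inspected with dis.get_instructions. This is a minimal sketch; the helper name opnames is made up. Opcode names and counts differ between CPython versions, so the output will probably not match the listings above exactly.

```python
import dis

IF_SRC = """\
if type(x) is int:
    x += 1
else:
    print('uh-oh')
"""

TRY_SRC = """\
try:
    x += 1
except TypeError as e:
    print('uh-oh')
"""


def opnames(source):
    """Compile `source` as a module and return its opcode names in order."""
    code = compile(source, "<sketch>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


for label, src in (("if.py", IF_SRC), ("try.py", TRY_SRC)):
    names = opnames(src)
    # This is the static instruction count, not the number executed
    # on the successful path.
    print(label, len(names), "instructions in total")
    print("   ", " ".join(names))
```

The code is only compiled, never run, so it does not matter that x is undefined.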

Now, instruction count is far from a perfect measure of time taken (especially in a bytecode VM, where instructions can vary wildly in cost), but there it is.
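Since instruction counts only say so much, timing both variants directly is more telling. Here is a rough sketch using timeit; the function names with_check and with_try are made up. The snippets are wrapped in functions, so they use local-variable opcodes rather than the LOAD_NAME seen above, and the numbers depend heavily on the machine and the Python version.

```python
import timeit


def with_check(x):
    # Look before you leap: test the type first.
    if type(x) is int:
        x += 1
    else:
        x = None
    return x


def with_try(x):
    # EAFP: just do it and handle the failure.
    try:
        x += 1
    except TypeError:
        x = None
    return x


for fn in (with_check, with_try):
    ok = timeit.timeit(lambda: fn(1), number=100_000)     # no exception raised
    bad = timeit.timeit(lambda: fn("a"), number=100_000)  # TypeError every time
    print(f"{fn.__name__}: success {ok:.3f}s, failure {bad:.3f}s")
```

The comparison of interest is the success column against the failure column for with_try, since that shows what a raised exception costs relative to the untaken handler.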

Finally, let's look at how CPython does that "automatic" jump to L1. For an explicit raise statement, it starts in the execution of RAISE_VARARGS, which jumps to the exception_unwind label:

    case TARGET(RAISE_VARARGS): {
        PyObject *cause = NULL, *exc = NULL;
        switch (oparg) {
        case 2:
            cause = POP(); /* cause */
            /* fall through */
        case 1:
            exc = POP(); /* exc */
            /* fall through */
        case 0:
            if (do_raise(tstate, exc, cause)) {
                goto exception_unwind;
            }
            break;
        default:
            _PyErr_SetString(tstate, PyExc_SystemError,
                             "bad RAISE_VARARGS oparg");
            break;
        }
        goto error;
    }

[...]

exception_unwind:
    f->f_state = FRAME_UNWINDING;
    /* Unwind stacks if an exception occurred */
    while (f->f_iblock > 0) {
        /* Pop the current block. */
        PyTryBlock *b = &f->f_blockstack[--f->f_iblock];

        if (b->b_type == EXCEPT_HANDLER) {
            UNWIND_EXCEPT_HANDLER(b);
            continue;
        }
        UNWIND_BLOCK(b);
        if (b->b_type == SETUP_FINALLY) {
            PyObject *exc, *val, *tb;
            int handler = b->b_handler;
            _PyErr_StackItem *exc_info = tstate->exc_info;
            /* Beware, this invalidates all b->b_* fields */
            PyFrame_BlockSetup(f, EXCEPT_HANDLER, f->f_lasti, STACK_LEVEL());
            PUSH(exc_info->exc_traceback);
            PUSH(exc_info->exc_value);
            if (exc_info->exc_type != NULL) {
                PUSH(exc_info->exc_type);
            }
            else {
                Py_INCREF(Py_None);
                PUSH(Py_None);
            }
            _PyErr_Fetch(tstate, &exc, &val, &tb);
            /* Make the raw exception data
               available to the handler,
               so a program can emulate the
               Python main loop. */
            _PyErr_NormalizeException(tstate, &exc, &val, &tb);
            if (tb != NULL)
                PyException_SetTraceback(val, tb);
            else
                PyException_SetTraceback(val, Py_None);
            Py_INCREF(exc);
            exc_info->exc_type = exc;
            Py_INCREF(val);
            exc_info->exc_value = val;
            exc_info->exc_traceback = tb;
            if (tb == NULL)
                tb = Py_None;
            Py_INCREF(tb);
            PUSH(tb);
            PUSH(val);
            PUSH(exc);
            JUMPTO(handler);
            if (_Py_TracingPossible(ceval2)) {
                trace_info.instr_prev = INT_MAX;
            }
            /* Resume normal execution */
            f->f_state = FRAME_EXECUTING;
            goto main_loop;
        }
    } /* unwind stack */

The interesting part is the JUMPTO(handler) line. The handler value comes from b->b_handler, which in turn was set by the SETUP_FINALLY instruction. And with that, I think we've come full circle! Whew!
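The whole mechanism can be condensed into a toy interpreter. This is a sketch with a made-up instruction set and a hypothetical run function, not CPython's real machinery. It keeps only the essentials: SETUP_FINALLY pushes a handler address, POP_BLOCK removes it, and a failing instruction pops the innermost handler and jumps to it.

```python
def run(program):
    """Tiny interpreter for a made-up instruction set with a block stack."""
    blocks = []   # handler addresses, innermost last (stands in for f_blockstack)
    trace = []    # results of executed CALL instructions
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "SETUP_FINALLY":
            blocks.append(arg)            # remember where the handler lives
        elif op == "POP_BLOCK":
            blocks.pop()                  # try body finished without error
        elif op == "JUMP":
            pc = arg
        elif op == "CALL":
            try:
                trace.append(arg())
            except Exception as exc:
                if not blocks:
                    raise                 # no handler here: propagate
                pc = blocks.pop()         # the JUMPTO(handler) step
                trace.append(f"caught {type(exc).__name__}")
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return trace


def fail():
    raise TypeError("boom")


def make_program(body):
    return [
        ("SETUP_FINALLY", 4),                 # 0: handler is at index 4
        ("CALL", body),                       # 1: the try body
        ("POP_BLOCK", None),                  # 2: only reached on success
        ("JUMP", 5),                          # 3: skip the handler
        ("CALL", lambda: "handler ran"),      # 4: L1
        ("CALL", lambda: "next statement"),   # 5: L0
    ]


print(run(make_program(fail)))
print(run(make_program(lambda: "body ran")))
```

In the successful run the only overhead is the push and pop on the block stack, which mirrors why the try version above executes so few extra instructions.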

Snild Dolkow answered Sep 30 '22 04:09