Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient way to determine whether a particular function is on the stack in Python

For debugging, it is often useful to tell if a particular function is higher up on the call stack. For example, we often only want to run debugging code when a certain function called us.

One solution is to examine all of the stack entries higher up, but it this is in a function that is deep in the stack and repeatedly called, this leads to excessive overhead. The question is to find a method that allows us to determine if a particular function is higher up on the call stack in a way that is reasonably efficient.

Similar

  • Obtaining references to function objects on the execution stack from the frame object? - This question focuses on obtaining the function objects, rather than determining if we are in a particular function. Although the same techniques could be applied, they may end up being extremely inefficient.
like image 370
Casebash Avatar asked Sep 10 '09 05:09

Casebash


2 Answers

Unless the function you're aiming for does something very special to mark "one instance of me is active on the stack" (IOW: if the function is pristine and untouchable and can't possibly be made aware of this peculiar need of yours), there is no conceivable alternative to walking frame by frame up the stack until you hit either the top (and the function is not there) or a stack frame for your function of interest. As several comments to the question indicate, it's extremely doubtful whether it's worth striving to optimize this. But, assuming for the sake of argument that it was worthwhile...:

Edit: the original answer (by the OP) had many defects, but some have since been fixed, so I'm editing to reflect the current situation and why certain aspects are important.

First of all, it's crucial to use try/except, or with, in the decorator, so that ANY exit from a function being monitored is properly accounted for, not just normal ones (as the original version of the OP's own answer did).

Second, every decorator should ensure it keeps the decorated function's __name__ and __doc__ intact -- that's what functools.wraps is for (there are other ways, but wraps makes it simplest).

Third, just as crucial as the first point, a set, which was the data structure originally chosen by the OP, is the wrong choice: a function can be on the stack several times (direct or indirect recursion). We clearly need a "multi-set" (also known as "bag"), a set-like structure which keeps track of "how many times" each item is present. In Python, the natural implementation of a multiset is as a dict mapping keys to counts, which in turn is most handily implemented as a collections.defaultdict(int).

Fourth, a general approach should be threadsafe (when that can be accomplished easily, at least;-). Fortunately, threading.local makes it trivial, when applicable -- and here, it should surely be (each stack having its own separate thread of calls).

Fifth, an interesting issue that has been broached in some comments (noticing how badly the offered decorators in some answers play with other decorators: the monitoring decorator appears to have to be the LAST (outermost) one, otherwise the checking breaks. This comes from the natural but unfortunate choice of using the function object itself as the key into the monitoring dict.

I propose to solve this by a different choice of key: make the decorator take a (string, say) identifier argument that must be unique (in each given thread) and use the identifier as the key into the monitoring dict. The code checking the stack must of course be aware of the identifier and use it as well.

At decorating time, the decorator can check for the uniqueness property (by using a separate set). The identifier may be left to default to the function name (so it's only explicitly required to keep the flexibility of monitoring homonymous functions in the same namespace); the uniqueness property may be explicitly renounced when several monitored functions are to be considered "the same" for monitoring purposes (this may be the case if a given def statement is meant to be executed multiple times in slightly different contexts to make several function objects that the programmers wants to consider "the same function" for monitoring purposes). Finally, it should be possible to optionally revert to the "function object as identifier" for those rare cases in which further decoration is KNOWN to be impossible (since in those cases it may be the handiest way to guarantee uniqueness).

So, putting these many considerations together, we could have (including a threadlocal_var utility function that will probably already be in a toolbox module of course;-) something like the following...:

import collections
import functools
import threading

threadlocal = threading.local()

def threadlocal_var(varname, factory, *a, **k):
  v = getattr(threadlocal, varname, None)
  if v is None:
    v = factory(*a, **k)
    setattr(threadlocal, varname, v)
  return v

def monitoring(identifier=None, unique=True, use_function=False):
  def inner(f):
    assert (not use_function) or (identifier is None)
    if identifier is None:
      if use_function:
        identifier = f
      else:
        identifier = f.__name__
    if unique:
      monitored = threadlocal_var('uniques', set)
      if identifier in monitored:
        raise ValueError('Duplicate monitoring identifier %r' % identifier)
      monitored.add(identifier)
    counts = threadlocal_var('counts', collections.defaultdict, int)
    @functools.wraps(f)
    def wrapper(*a, **k):
      counts[identifier] += 1
      try:
        return f(*a, **k)
      finally:
        counts[identifier] -= 1
    return wrapper
  return inner

I have not tested this code, so it might contain some typo or the like, but I'm offering it because I hope it does cover all the important technical points I explained above.

Is it all worth it? Probably not, as previously explained. However, I think along the lines of "if it's worth doing at all, then it's worth doing right";-).

like image 153
Alex Martelli Avatar answered Oct 14 '22 21:10

Alex Martelli


I don't really like this approach, but here's a fixed-up version of what you were doing:

from collections import defaultdict
import threading
functions_on_stack = threading.local()

def record_function_on_stack(f):
    def wrapped(*args, **kwargs):
        if not getattr(functions_on_stack, "stacks", None):
            functions_on_stack.stacks = defaultdict(int)
        functions_on_stack.stacks[wrapped] += 1

        try:
            result = f(*args, **kwargs)
        finally:
            functions_on_stack.stacks[wrapped] -= 1
            if functions_on_stack.stacks[wrapped] == 0:
                del functions_on_stack.stacks[wrapped]
        return result

    wrapped.orig_func = f
    return wrapped

def function_is_on_stack(f):
    return f in functions_on_stack.stacks

def nested():
    if function_is_on_stack(test):
        print "nested"

@record_function_on_stack
def test():
    nested()

test()

This handles recursion, threading and exceptions.

I don't like this approach for two reasons:

  • It doesn't work if the function is decorated further: this must be the final decorator.
  • If you're using this for debugging, it means you have to edit code in two places to use it; one to add the decorator, and one to use it. It's much more convenient to just examine the stack, so you only have to edit code in the code you're debugging.

A better approach would be to examine the stack directly (possibly as a native extension for speed), and if possible, find a way to cache the results for the lifetime of the stack frame. (I'm not sure if that's possible without modifying the Python core, though.)

like image 28
Glenn Maynard Avatar answered Oct 14 '22 21:10

Glenn Maynard