Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Inlining Python Function

In a C program, inlining a function is a fairly intuitive optimization. If the inlined function's body is sufficiently small, you end up saving the jump to the function and creation of the stack frame, and you store the return value wherever the function's result would have been stored, jumping to the end of the inlined function's "body" rather than long-jumping to the return pointer.

I'm interested in doing the same thing in Python, converting two python functions into another valid python function where the first got "inlined" into the second. An ideal solution to this might look something like the following:

def g(x):
    return x ** 2

def f(y):
    return g(y + 3)

# ... Becomes ...

def inlined_f(y):
    return (y + 3) ** 2

Clearly, in a language as dynamic as Python, this isn't trivial to do automatically. The best generic solution I have come up with is to use dict to capture the arguments passed to the function, wrap the function body in a one-iteration for loop, use break to jump to the end of the function, and replace uses of arguments with indexes into the argument dictionary. The result looks something like the following:

def inlined_f(y):
    _g = dict(x=y + 3)
    for ____ in [None]:
        _g['return'] = _g['x'] ** 2
        break
    _g_return = _g.get('return', None)
    del _g
    return _g_return

I don't care that it's ugly, but I do care that it doesn't support returns from within loops. E.g.:

def g(x):
    for i in range(x + 1):
        if i == x:
            return i ** 2
    print("Woops, you shouldn't get here")

def inlined_f(y):
    _g = dict(x=y + 3)
    for ____ in [None]:
        for _g['i'] in range(_g['x'] + 1):
            if _g['i'] == _g['x']:
                _g['return'] _g['i'] ** 2
                break  # <-- Doesn't exit function, just innermost loop
        print("Woops, you shouldn't get here")
    _g_return = _g.get('return', None)
    del _g
    return _g_return

What approach could I take to this problem that avoids needing to use break to "jump" out of the inlined function's body? I'd also be open to an overall better, generic approach could I take to inline one Python function into another.

For reference, I'm working at the AST (abstract syntax tree) level, so using parsed Python code; clearly, outside of literal values, I don't know what value or type anything will have while performing this transformation. The resulting inlined function must behave identically to the original functions, and must support all features typically available when calling a function. Is this even possible in Python?


EDIT: I should clarify since I used the tag "optimization", that I'm not actually interested in a performance boost. The resulting code does not need to be faster, it just must not call the inlined function while still behaving identically. You can assume that both functions' source code is available as valid Python.

like image 834
scnerd Avatar asked May 02 '18 19:05

scnerd


People also ask

What does inlining a function do?

An inline function is one for which the compiler copies the code from the function definition directly into the code of the calling function rather than creating a separate set of instructions in memory. This eliminates call-linkage overhead and can expose significant optimization opportunities.

What is code inlining?

Inlining is the process of replacing a subroutine or function call at the call site with the body of the subroutine or function being called. This eliminates call-linkage overhead and can expose significant optimization opportunities.

Does inlining improve performance?

inline functions might make it faster: As shown above, procedural integration might remove a bunch of unnecessary instructions, which might make things run faster. inline functions might make it slower: Too much inlining might cause code bloat, which might cause “thrashing” on demand-paged virtual-memory systems.

Are there inline functions in Python?

Python lambda functions, also known as anonymous functions, are inline functions that do not have a name. They are created with the lambda keyword. This is part of the functional paradigm built-in Python. Python lambda functions are restricted to a single expression.


2 Answers

The only reasonable way on source level I see, simplified:

  • Parse the source into some AST (or just use the built-in AST).
  • Copy a subtree representing the function's body.
  • Rename the variables in the subtree, e.g. by adding an unique prefix.
  • At the call site, replace all passed arguments with assignments using the function's new variable names.
  • Remove the call and replace it with the function body you've prepared.
  • Serialize the AST back to source.

What poses real problems:

  • Generator functions; just don't inline them.
  • Returns from under try/finally that need to run the finally part. Might be pretty hard to rewrite correctly; imho, best left in-unlined.
  • Returns from under context managers that need to run the __exit__ parts. While not impossible, it's also tricky to rewrite preserving the semantics; likely also best left un-inlined.
  • Mid-function returns, especially from within multiple loop constructs. You might need to replace them with an extra variable and thread it into every condition of every while statement, and likely to add a conditional break to for statements. Again, not impossible but likely best left un-inlined.
like image 81
9000 Avatar answered Sep 25 '22 14:09

9000


Probably the closest analog to a return would be raising an Exception, which would work to pop out of nested loops to the top of the "inlined function".

class ReturnException(Exception):
    pass


g = dict(x=y + 3)
try:
    for j in some_loop:
        for _g['i'] in range(_g['x'] + 1):
            if _g['i'] == _g['x']:
                raise ReturnException(_g['i'] ** 2)
except ReturnException as e:
    _g['return'] = e.message
else:
    _g['return'] = None

I don't know how much overhead is associated with exceptions though or if that would be faster than simply calling the function.

like image 20
Brendan Abel Avatar answered Sep 22 '22 14:09

Brendan Abel