Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can I restore a function whose closure contains cycles in Python?

I'm trying to serialise Python functions (code + closures), and reinstate them later on. I'm using the code at the bottom of this post.

This is very flexible code. It allows the serialisation and deserialisation of inner functions, and functions that are closures, such as those that need their context to be reinstated:

def f1(arg):
    def f2():
        print arg

    def f3():
        print arg
        f2()

    return f3

x = SerialiseFunction(f1(stuff)) # a string
save(x) # save it somewhere

# later, possibly in a different process

x = load() # get it from somewhere 
newf2 = DeserialiseFunction(x)
newf2() # prints value of "stuff" twice

These calls will work even if there are functions in the closure of your function, functions in their closures, and so on (we have a graph of closures, where closures contain functions that have closures that contain more functions and so on).

However, it turns out that these graphs can contain cycles:

def g1():
    def g2():
        g2()
    return g2()

g = g1()

If I look at g2's closure (via g), I can see g2 in it:

>>> g
<function g2 at 0x952033c>
>>> g.func_closure[0].cell_contents
<function g2 at 0x952033c>

This causes a serious problem when I try to deserialise the function, because everything is immutable. What I need to do is to make the function newg2:

newg2 = types.FunctionType(g2code, globals, closure=newg2closure)

where newg2closure is created as follows:

newg2closure = (make_cell(newg2),)

which of course can't be done; each line of code relies on the other. Cells are immutable, tuples are immutable, function types are immutable.

So what I'm trying to find out is, is there a way to create newg2 above? Is there some way I can create a function type object where that object is mentioned in its own closure graph?

I'm using python 2.7 (I'm on App Engine, so I can't go to Python 3).


For reference, my serialisation functions:

def SerialiseFunction(aFunction):
    if not aFunction or not isinstance(c, types.FunctionType):
        raise Exception ("First argument required, must be a function")

    def MarshalClosureValues(aClosure):
        logging.debug(repr(aClosure))
        lmarshalledClosureValues = []
        if aClosure:
            lclosureValues = [lcell.cell_contents for lcell in aClosure]
            lmarshalledClosureValues = [
                [marshal.dumps(litem.func_code), MarshalClosureValues(litem.func_closure)] if hasattr(litem, "func_code")
                else [marshal.dumps(litem)] 
                for litem in lclosureValues
            ]
        return lmarshalledClosureValues

    lmarshalledFunc = marshal.dumps(aFunction.func_code)
    lmarshalledClosureValues = MarshalClosureValues(aFunction.func_closure)
    lmoduleName = aFunction.__module__

    lcombined = (lmarshalledFunc, lmarshalledClosureValues, lmoduleName)

    retval = marshal.dumps(lcombined)

    return retval


def DeserialiseFunction(aSerialisedFunction):
    lmarshalledFunc, lmarshalledClosureValues, lmoduleName = marshal.loads(aSerialisedFunction)

    lglobals = sys.modules[lmoduleName].__dict__

    def make_cell(value):
        return (lambda x: lambda: x)(value).func_closure[0]

    def UnmarshalClosureValues(aMarshalledClosureValues):
        lclosure = None
        if aMarshalledClosureValues:
            lclosureValues = [
                    marshal.loads(item[0]) if len(item) == 1 
                    else types.FunctionType(marshal.loads(item[0]), lglobals, closure=UnmarshalClosureValues(item[1])) 
                    for item in aMarshalledClosureValues if len(item) >= 1 and len(item) <= 2
                ]
            lclosure = tuple([make_cell(lvalue) for lvalue in lclosureValues])
        return lclosure

    lfunctionCode = marshal.loads(lmarshalledFunc)
    lclosure = UnmarshalClosureValues(lmarshalledClosureValues)
    lfunction = types.FunctionType(lfunctionCode, lglobals, closure=lclosure)
    return lfunction
like image 714
Emlyn O'Regan Avatar asked Oct 08 '14 07:10

Emlyn O'Regan


People also ask

Are Closures Pythonic?

Understanding what, when & why to use closures!Closures are elegant Python constructs. In this article, we'll learn about them, how to define a closure, why and when to use them. But before getting into what a closure is, we have to first understand what a nested function is and how scoping rules work for them.

Are Closures supported in Python?

Python offers support for closures too. In this article, we'll learn what closures in Python are, how to define them and lastly, when, and why you should use them.

How do Python Closures work?

A closure is a nested function which has access to a free variable from an enclosing function that has finished its execution. Three characteristics of a Python closure are: it is a nested function. it has access to a free variable in outer scope.

How can I see the details of a function in Python?

Python help() function is used to get the documentation of specified module, class, function, variables etc. This method is generally used with python interpreter console to get details about python objects.


1 Answers

Here's a method that works.

You can't fix these immutable objects, but what you can do is stick proxy functions in place of circular references, and have them look up the real function in a global dictionary.

1: When serialising, keep track of all the functions you've seen. If you see the same one again, don't reserialise, instead serialise a sentinel value.

I've used a set:

lfunctionHashes = set()

and for each serialised item, check if it's in the set, go with a sentinel if so, otherwise add it to the set and marshal properly:

lhash = hash(litem)
if lhash in lfunctionHashes:
    lmarshalledClosureValues.append([lhash, None])
else:
    lfunctionHashes.add(lhash)
    lmarshalledClosureValues.append([lhash, marshal.dumps(litem.func_code), MarshalClosureValues(litem.func_closure, lfullIndex), litem.__module__])

2: When deserialising, keep a global dict of functionhash: function

gfunctions = {}

During deserialisation, any time you deserialise a function, add it to gfunctions. Here, item is (hash, code, closurevalues, modulename):

lfunction = types.FunctionType(marshal.loads(item[1]), globals, closure=UnmarshalClosureValues(item[2]))
gfunctions[item[0]] = lfunction

And when you come across the sentinel value for a function, use the proxy, passing in the hash of the function:

lfunction = make_proxy(item[0])

Here's the proxy. It looks up the real function based on the hash:

def make_proxy(f_hash):
    def f_proxy(*args, **kwargs):
        global gfunctions
        f = lfunctions[f_hash]
        f(*args, **kwargs)

    return f_proxy

I've also had to make a few other changes:

  • I've used pickle instead of marshal in some places, might examine this further
  • I'm including the module name of the function in serialisation as well as code and closure, so I can look up the correct globals for the function when deserialising.
  • In the deserialisation, the length of the tuple tells you what you're deserialising: 1 for a simple value, 2 for a function which needs proxying, 4 for a fully serialised function

Here's the full new code.

lfunctions = {}

def DeserialiseFunction(aSerialisedFunction):
    lmarshalledFunc, lmarshalledClosureValues, lmoduleName = pickle.loads(aSerialisedFunction)

    lglobals = sys.modules[lmoduleName].__dict__
    lglobals["lfunctions"] = lfunctions

    def make_proxy(f_hash):
        def f_proxy(*args, **kwargs):
            global lfunctions
            f = lfunctions[f_hash]
            f(*args, **kwargs)

        return f_proxy

    def make_cell(value):
        return (lambda x: lambda: x)(value).func_closure[0]

    def UnmarshalClosureValues(aMarshalledClosureValues):
        global lfunctions

        lclosure = None
        if aMarshalledClosureValues:
            lclosureValues = []
            for item in aMarshalledClosureValues:
                ltype = len(item)
                if ltype == 1:
                    lclosureValues.append(pickle.loads(item[0]))
                elif ltype == 2:
                    lfunction = make_proxy(item[0])
                    lclosureValues.append(lfunction)
                elif ltype == 4:
                    lfuncglobals = sys.modules[item[3]].__dict__
                    lfuncglobals["lfunctions"] = lfunctions
                    lfunction = types.FunctionType(marshal.loads(item[1]), lfuncglobals, closure=UnmarshalClosureValues(item[2]))
                    lfunctions[item[0]] = lfunction
                    lclosureValues.append(lfunction)
            lclosure = tuple([make_cell(lvalue) for lvalue in lclosureValues])
        return lclosure

    lfunctionCode = marshal.loads(lmarshalledFunc)
    lclosure = UnmarshalClosureValues(lmarshalledClosureValues)
    lfunction = types.FunctionType(lfunctionCode, lglobals, closure=lclosure)
    return lfunction

def SerialiseFunction(aFunction):
    if not aFunction or not hasattr(aFunction, "func_code"):
        raise Exception ("First argument required, must be a function")

    lfunctionHashes = set()

    def MarshalClosureValues(aClosure, aParentIndices = []):
        lmarshalledClosureValues = []
        if aClosure:
            lclosureValues = [lcell.cell_contents for lcell in aClosure]

            lmarshalledClosureValues = []
            for index, litem in enumerate(lclosureValues):
                lfullIndex = list(aParentIndices)
                lfullIndex.append(index)

                if isinstance(litem, types.FunctionType):
                    lhash = hash(litem)
                    if lhash in lfunctionHashes:
                        lmarshalledClosureValues.append([lhash, None])
                    else:
                        lfunctionHashes.add(lhash)
                        lmarshalledClosureValues.append([lhash, marshal.dumps(litem.func_code), MarshalClosureValues(litem.func_closure, lfullIndex), litem.__module__])
                else:
                    lmarshalledClosureValues.append([pickle.dumps(litem)])

    lmarshalledFunc = marshal.dumps(aFunction.func_code)
    lmarshalledClosureValues = MarshalClosureValues(aFunction.func_closure)
    lmoduleName = aFunction.__module__

    lcombined = (lmarshalledFunc, lmarshalledClosureValues, lmoduleName)

    retval = pickle.dumps(lcombined)

    return retval
like image 87
Emlyn O'Regan Avatar answered Oct 06 '22 13:10

Emlyn O'Regan