Idioms in python: closure vs functor vs object

Tags:

python

I am curious about the views of more experienced Python programmers on the following style question. Suppose I am building a function that will iterate row by row through a pandas DataFrame, or any similar use case where a function needs access to its previous state. There seem to be at least five ways to implement this in Python:

1. Closures:

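def outer():
    previous_state = None
    def inner(current_state):
        nonlocal previous_state
        # do something with previous_state and current_state
        previous_state = current_state
        return something
    # hand the stateful function back to the caller
    return inner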

If you come from a JavaScript background this will doubtless seem natural to you. It feels pretty natural in Python too, right up until you need to read the enclosed state from outside. Then you end up looking at inner.__code__.co_freevars, which gives you the names of the free variables as a tuple, finding the index of the one you want, and reading inner.__closure__[index].cell_contents to get its value. Not exactly elegant, but I suppose the point is often to hide scope, so it makes sense that it should be hard to reach. On the other hand, it feels a bit weird that Python makes the enclosing scope effectively private when it has done away with almost every other way of having a private variable that OOP languages offer.
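To make the lookup described above concrete, here is a minimal sketch. The names make_tracker and tracker are made up for illustration, and returning a (previous, current) tuple is just a stand-in for "do something".

```python
def make_tracker():
    previous_state = None

    def inner(current_state):
        nonlocal previous_state
        # Stand-in for "do something": pair the old state with the new one.
        result = (previous_state, current_state)
        previous_state = current_state
        return result

    return inner


tracker = make_tracker()
tracker(1)
tracker(2)

# Names of the variables that inner captures from the enclosing scope.
names = tracker.__code__.co_freevars
# The cells in __closure__ line up with co_freevars by position.
index = names.index("previous_state")
value = tracker.__closure__[index].cell_contents  # 2, the last state seen
```

It works, but reading state this way depends on function internals, which is probably a hint that the state was not meant to be read from outside.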

2. Functor:

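def outer():
    def inner(current_state):
        # do something with inner.previous_state and current_state
        inner.previous_state = current_state
        return something
    # store the state as an attribute on the function object itself
    inner.previous_state = None
    return inner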

This "opens the closure" in that the enclosed state is now fully visible as an attribute of the function. This works because functions are really just objects in disguise. I am leaning towards this as the most Pythonic. It's clear, concise, and readable.
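As a quick sketch of what that visibility looks like in practice (make_tracker and the tuple result are illustrative names and placeholders, not part of the snippet above):

```python
def make_tracker():
    def inner(current_state):
        # Stand-in for "do something": pair the old state with the new one.
        result = (inner.previous_state, current_state)
        inner.previous_state = current_state
        return result

    # The state lives on the function object, so callers can read it.
    inner.previous_state = None
    return inner


tracker = make_tracker()
first = tracker("a")    # (None, "a")
second = tracker("b")   # ("a", "b")

# No digging through __closure__ needed this time.
last_seen = tracker.previous_state  # "b"

# Each call to make_tracker builds a fresh function with its own state.
other = make_tracker()
```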

3. Objects: this is probably the most familiar to OOP programmers.

class Calculator:
    def __init__(self):
        self.previous_state = None

    def do_something(self, current_state):
        # do something with self.previous_state and current_state
        self.previous_state = current_state
        return something

The biggest con here is that you tend to end up with a lot of class definitions. That is fine in a fully OOP language like Java, where you have interfaces and the like to manage this, but it seems a bit odd in a duck-typed language to have many simple classes just to carry around a function that needs a bit of state.

4. Globals: I won't demonstrate this, as I specifically want to avoid polluting the global namespace.
5. Decorators: this is a bit of a curveball, but you can use decorators to store partial state information.

def outer(inner):
    def wrapper(current_state):
        result = inner(wrapper.previous_state, current_state)
        wrapper.previous_state = current_state
        return result
    wrapper.previous_state = None
    # return the wrapper itself, not a result
    return wrapper

# the decorator must be defined before it is applied
@outer
def inner(previous_state, current_state):
    # do something
    return something

This kind of syntax is the least familiar to me, but if I now write

func = inner

I actually get

func = outer(inner)

and then repeatedly calling func(current_state) acts just like the functor example. I actually hate this way the most. The syntax seems really non-transparent to me: it isn't clear whether calling inner(current_state) lots of times reuses the same decorated function or gives you a newly decorated function every time, so it seems like bad practice to write decorators that add state to a function in this way.
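For what it's worth, a decorator is applied once, when the def statement runs, and the name is then bound to the wrapper, so every later call goes through the same wrapper and the same state. Here is a minimal sketch of that behaviour; remember_previous and describe are hypothetical names, and functools.wraps is only there to keep the wrapped function's name and docstring.

```python
import functools


def remember_previous(func):
    @functools.wraps(func)
    def wrapper(current_state):
        result = func(wrapper.previous_state, current_state)
        wrapper.previous_state = current_state
        return result

    wrapper.previous_state = None
    return wrapper


# Decoration happens exactly once, here, at definition time.
@remember_previous
def describe(previous_state, current_state):
    return f"{previous_state} -> {current_state}"


first = describe(1)    # "None -> 1"
second = describe(2)   # "1 -> 2"
```

The flip side is that the state is shared by every caller of describe, since there is only one wrapper per decorated function.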

So which is the correct way? What pros and cons have I missed here?


asked Dec 16 '15 16:12

phil_20686

2 Answers

So the correct answer to this is the callable object, which essentially replaces the closure idiom in Python.

So, working from option 3 above, change:

class Calculator:
    def __init__(self):
        self.previous_state = None

    def do_something(self, current_state):
        # do something
        self.previous_state = current_state
        return something

to

class Calculator:
    def __init__(self):
        self.previous_state = None

    def __call__(self, current_state):
        # do something
        self.previous_state = current_state
        return something

and now you can call it like a function:

func = Calculator()
for x in rows:
    func(x)
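A runnable sketch of the same idea, assuming the "do something" step is taking the difference from the previous value (DeltaCalculator and the sample numbers are made up for illustration):

```python
class DeltaCalculator:
    """Callable object that remembers the last value it was given."""

    def __init__(self):
        self.previous_state = None

    def __call__(self, current_state):
        # There is nothing to compare against on the first call.
        if self.previous_state is None:
            delta = None
        else:
            delta = current_state - self.previous_state
        self.previous_state = current_state
        return delta


calc = DeltaCalculator()
deltas = [calc(x) for x in [10, 13, 11]]  # [None, 3, -2]
```

The instance can be passed anywhere a plain function is expected, and the state stays readable as calc.previous_state.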

answered Sep 24 '22 13:09

phil_20686

You can define a generator, which is a restricted form of a coroutine.

def make_gen(rows):
    previous_state = None
    for current_state in rows:
        # do something with previous_state and current_state
        previous_state = current_state
        yield something

thing = make_gen(rows)
for item in thing:
    # Each iteration, item is a different value
    # "returned" by the yield statement in the generator
    ...

Instead of calling thing (which replaces your inner function) repeatedly, you iterate over it (which is basically the same as calling next(thing) repeatedly).

The state is entirely contained within the body of the generator.

If you don't want to actually iterate over it, you can still selectively "re-enter" the coroutine by calling next explicitly.

thing = make_gen(rows)
first_item = next(thing)
# do some stuff
second_item = next(thing)
# do more stuff
third_item = next(thing)
fourth_item = next(thing)
# etc
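Here is a small self-contained sketch of the pattern, assuming the work done per row is simply pairing each row with the one before it (with_previous and the sample rows are hypothetical):

```python
def with_previous(rows):
    previous_state = None
    for current_state in rows:
        # Hand back the pair, then remember the current row for next time.
        yield previous_state, current_state
        previous_state = current_state


# Driving it with a loop (here via list) ...
pairs = list(with_previous(["a", "b", "c"]))
# [(None, "a"), ("a", "b"), ("b", "c")]

# ... or stepping through it by hand with next().
gen = with_previous([1, 2])
first_item = next(gen)   # (None, 1)
second_item = next(gen)  # (1, 2)
```

One more call to next(gen) would raise StopIteration, because the input is exhausted.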

answered Sep 22 '22 13:09

chepner
