I have a class that solves an exact cover problem using a recursive, backtracking algorithm. Originally, I implemented the class with a callback function passed to the object during initialization; this callback is invoked whenever a solution is found. Looking at someone else's implementation of the same problem, I saw that they used yield statements to pass solutions out; in other words, their code was a Python generator. I thought this was an interesting idea, so I made a new version of my class that uses yield. I then ran comparison tests between the two versions and, to my surprise, found that the generator version ran five times slower than the callback version. Note that, except for swapping in a yield for a callback, the code is identical.
What is going on here? I'm speculating that, because a generator needs to save its state before yielding and then restore that state when it is resumed, it is this save/restore overhead that makes the generator version run so much slower. If that is the case, how much state information is the generator having to save and restore?
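To put a number on that overhead, here is a minimal microbenchmark of my own (the names are illustrative, not my solver code) comparing delivery of one value through a callback against delivery through a generator:

import timeit

def produce_with_callback(callback):
    # Deliver one value by calling the callback directly.
    callback(1)

def produce_with_yield():
    # Deliver one value by suspending and resuming a generator frame.
    yield 1

def noop(value):
    pass

print(timeit.timeit(lambda: produce_with_callback(noop), number=1000000))
print(timeit.timeit(lambda: list(produce_with_yield()), number=1000000))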
Any ideas from the Python experts?
--Edited 7:40 PDT
Here is the solver code, which uses yield. To recover the original callback version, replace the first yield below with a call to the callback function, and replace the loop containing the second yield with a plain recursive call to solve (a sketch of that version follows the code).
def solve(self):
    for tp in self.pieces:
        if self.inuse[tp.name]: continue
        self.inuse[tp.name] = True
        # Try every remaining orientation of this piece at the current base square.
        while tp.next_orientation() is not None:
            if tp.insert_piece():
                self.n_trials += 1
                self.pieces_in += 1
                self.free_cells -= tp.size
                if self.pieces_in == len(self.pieces) or self.free_cells == 0:
                    # Every piece is placed (or the board is full): report a solution.
                    self.solutions += 1
                    self.haveSolution = True
                    yield True
                    self.haveSolution = False
                else:
                    # Recurse and forward any solutions found deeper in the search.
                    self.table.next_base_square()
                    for tf in self.solve():
                        yield tf
                # Backtrack: undo the placement and restore the board state.
                tp.remove_piece()
                self.pieces_in -= 1
                self.table.set_base_square(tp.base_square)
                self.free_cells += tp.size
        self.inuse[tp.name] = False
        tp.reset_orientation()
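For comparison, here is roughly how the inner if/else reads in the original callback version, per the description above; self.callback is an assumed name for the function passed at initialization, since that code isn't shown:

if self.pieces_in == len(self.pieces) or self.free_cells == 0:
    self.solutions += 1
    self.haveSolution = True
    self.callback(self)  # assumed name; invoked once per solution found
    self.haveSolution = False
else:
    self.table.next_base_square()
    self.solve()  # plain recursive call, no yield-forwarding loop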
The main loop which invokes the solver (after initialization, of course) is:
start_time = time.time()
for tf in s.solve():
    printit(s)
end_time = time.time()
delta_time = end_time - start_time
In the callback version, the loop is gone; there is just a single call to solve.
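That is, something like this (my sketch, with the timing bookkeeping unchanged):

start_time = time.time()
s.solve()  # the callback fires once per solution instead of a yield
end_time = time.time()
delta_time = end_time - start_time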
As background: generators let you create iterators in a very Pythonic way. Iterators allow lazy evaluation, producing the next element of an iterable only when it is requested, and they can only be iterated over once. This makes generators memory-efficient for processing huge datasets: they handle the data incrementally instead of allocating memory for all the results at once, which is especially handy for data-science pipelines in RAM-constrained environments, and in theory even for infinite streams of data that could never be stored in memory at once.
Key takeaways: generators are memory-efficient, since they only need memory for the one value they yield; they are lazy, yielding values only when explicitly asked; and you can feed the output of one generator into the input of another to form data pipelines, as the sketch below shows.
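A minimal sketch of such a pipeline (the names are illustrative and unrelated to the solver above):

def read_numbers(lines):
    # First stage: lazily parse one line at a time.
    for line in lines:
        yield int(line)

def squares(numbers):
    # Second stage: pull from the first stage one item at a time.
    for x in numbers:
        yield x * x

for value in squares(read_numbers(["1", "2", "3"])):
    print(value)  # prints 1, 4, 9 without building any intermediate list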
What I meant in my comment ("yielding from a recursive function sounds like it requires extra for loops to pass the results down to the caller") are these lines:
for tf in self.solve():
    yield tf
These lines recursively loop over the results from the deeper recursion stages, which means each individual result is looped over once at every level of the recursion above it, resulting in a lot of unnecessary looping.
Let me illustrate with this example:
n = 0

def rekurse(z):
    global n
    if z:
        yield z
        for x in rekurse(z - 1):
            n += 1  # counts every value forwarded through this level
            yield x

print(list(rekurse(10)))
print(n)
As you can see, this simply counts down from 10, so you'd expect a linear number of iterations. What you can see, though, is that n grows quadratically: rekurse(10) loops over 9 items, rekurse(9) over 8 items, and so on, so n ends up at 9 + 8 + ... + 1 = 45.
The more items you have, the more time Python spends in these simple forwarding lines. Callbacks avoid that entirely, so I'd suspect this is what makes your generator version so much slower.
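To show the contrast, here is a callback version of the same toy example (my own sketch); the counter stays linear because each value is delivered exactly once, straight to the callback, no matter how deep the recursion is:

n = 0

def rekurse_cb(z, callback):
    global n
    if z:
        n += 1  # one delivery per value, regardless of recursion depth
        callback(z)
        rekurse_cb(z - 1, callback)

results = []
rekurse_cb(10, results.append)
print(results)  # [10, 9, ..., 1]
print(n)        # 10: linear, not quadratic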
An optimized implementation of PEP 380 could fix this (see the "Optimisations" section of the PEP). In the meantime, I don't think it's a good idea to yield from recursive functions (at least if they recurse deeply); they just don't work well together.
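For what it's worth, PEP 380 was later accepted and landed in Python 3.3 as the yield from syntax, which replaces the explicit forwarding loop; whether that actually removes the per-level cost depends on the interpreter, so treat this as a sketch rather than a guaranteed fix:

def rekurse(z):
    if z:
        yield z
        # Python 3.3+: delegate to the sub-generator without an explicit loop.
        yield from rekurse(z - 1)

print(list(rekurse(10)))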