Why does removing the else slow down my code?

Tags: performance, python, recursion

Consider the following functions:

def fact1(n):
    if n < 2:
        return 1
    else:
        return n * fact1(n-1)

def fact2(n):
    if n < 2:
        return 1
    return n * fact2(n-1)

They should be equivalent. But there's a performance difference:

>>> from timeit import Timer as T  # assumed: T is timeit.Timer
>>> T(lambda : fact1(1)).repeat(number=10000000)
[2.5754408836364746, 2.5710129737854004, 2.5678811073303223]
>>> T(lambda : fact2(1)).repeat(number=10000000)
[2.8432059288024902, 2.834425926208496, 2.8364310264587402]

The version without the else is about 10% slower. This is pretty significant. Why?

asked Nov 20 '11 18:11

Aillyn

3 Answers

For me, they run at virtually the same speed (Python 2.6.6 on Debian):

In [4]: %timeit fact1(1)
10000000 loops, best of 3: 151 ns per loop

In [5]: %timeit fact2(1)
10000000 loops, best of 3: 154 ns per loop

The byte code is also very similar:

In [6]: dis.dis(fact1)
  2           0 LOAD_FAST                0 (n)
              3 LOAD_CONST               1 (2)
              6 COMPARE_OP               0 (<)
              9 JUMP_IF_FALSE            5 (to 17)
             12 POP_TOP             

  3          13 LOAD_CONST               2 (1)
             16 RETURN_VALUE        
        >>   17 POP_TOP             

  5          18 LOAD_FAST                0 (n)
             21 LOAD_GLOBAL              0 (fact)
             24 LOAD_FAST                0 (n)
             27 LOAD_CONST               2 (1)
             30 BINARY_SUBTRACT     
             31 CALL_FUNCTION            1
             34 BINARY_MULTIPLY     
             35 RETURN_VALUE        
             36 LOAD_CONST               0 (None)
             39 RETURN_VALUE        

In [7]: dis.dis(fact2)
  2           0 LOAD_FAST                0 (n)
              3 LOAD_CONST               1 (2)
              6 COMPARE_OP               0 (<)
              9 JUMP_IF_FALSE            5 (to 17)
             12 POP_TOP             

  3          13 LOAD_CONST               2 (1)
             16 RETURN_VALUE        
        >>   17 POP_TOP             

  4          18 LOAD_FAST                0 (n)
             21 LOAD_GLOBAL              0 (fact)
             24 LOAD_FAST                0 (n)
             27 LOAD_CONST               2 (1)
             30 BINARY_SUBTRACT     
             31 CALL_FUNCTION            1
             34 BINARY_MULTIPLY     
             35 RETURN_VALUE

The only difference is that the version with the else includes code to return None in case control reaches the end of the function body. Both branches already return, so that code is never executed.
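To repeat the timing comparison on another interpreter, something like the following could be used. It is only a sketch: the helper name best_time and the default number and repeat values are made up for illustration, and the absolute figures will differ by machine and Python version.

```python
import timeit


def fact1(n):
    if n < 2:
        return 1
    else:
        return n * fact1(n - 1)


def fact2(n):
    if n < 2:
        return 1
    return n * fact2(n - 1)


def best_time(func, number=100000, repeat=5):
    """Return the fastest of `repeat` runs, each calling func(1) `number` times.

    best_time is a hypothetical helper, not part of timeit.
    """
    return min(timeit.repeat(lambda: func(1), number=number, repeat=repeat))


if __name__ == "__main__":
    for func in (fact1, fact2):
        print(func.__name__, best_time(func))
```

Taking the minimum of several runs reduces noise from other processes, which matters when the difference being measured is only a few nanoseconds per call.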

answered Oct 13 '22 15:10

Sven Marnach

What is happening here is that fact2 has a hash conflict with __name__ in your module globals. That makes the lookup of the global fact2 ever so slightly slower.

>>> [(k, hash(k) % 32) for k in globals().keys() ]
[('__builtins__', 8), ('__package__', 15), ('fact2', 25), ('__name__', 25), ('fact1', 26), ('__doc__', 29)]

In other words, this is the same answer as for "Why is early return slower than else?", except that there the hash conflict was with __builtins__.
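The check above can be wrapped in a small helper that reports only the slots shared by more than one key. This is a rough sketch: slot_collisions is a made-up name, and the table size of 32 is simply the value used in the listing above. On Python 3, string hashes are randomized per process unless PYTHONHASHSEED is set, and newer CPython dicts are laid out differently, so the names that collide (if any) will vary from run to run.

```python
from collections import defaultdict


def slot_collisions(keys, table_size=32):
    """Group keys by hash(key) % table_size and keep only shared slots.

    slot_collisions is a hypothetical helper; table_size=32 mirrors the
    modulus used in the answer, not a guaranteed dict size.
    """
    slots = defaultdict(list)
    for key in keys:
        slots[hash(key) % table_size].append(key)
    return {slot: group for slot, group in slots.items() if len(group) > 1}


if __name__ == "__main__":
    # Report which module-level names would start probing at the same slot.
    print(slot_collisions(list(globals())))
```

An empty result means no two names start at the same slot for that table size; it says nothing about timing by itself.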

answered Oct 13 '22 14:10

Duncan

I question the timings. The two functions aren't recursing into themselves: fact1 and fact2 both call fact, which isn't shown.

Once that is fixed, the disassembly (in both Py2.6 and Py2.7) shows that both are running the same opcodes except for the name of the recursed-into function. The choice of name triggers a small difference in timings because fact1 may insert in the module dictionary with no name collisions while fact2 may have a hash value that collides with another name in the module.

In other words, any differences you see in timings are not due to the choice of whether the else-clause is present :-)
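One way to check both points (which function each body actually calls, and whether the opcode sequences match) is sketched below. The helper names called_globals and opnames are invented for this example, and dis.get_instructions needs Python 3.4 or later, so the opcodes will not be the Py2.6 ones shown earlier.

```python
import dis


def fact1(n):
    if n < 2:
        return 1
    else:
        return n * fact1(n - 1)


def fact2(n):
    if n < 2:
        return 1
    return n * fact2(n - 1)


def called_globals(func):
    """Return the non-local names referenced by func's bytecode."""
    return func.__code__.co_names


def opnames(func):
    """Return func's opcode names, ignoring operands such as global names."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    # Each function should list only itself as the global it looks up.
    print(called_globals(fact1), called_globals(fact2))
    # Whether the opcode sequences match depends on the interpreter version.
    print(opnames(fact1) == opnames(fact2))
```

If called_globals shows a third name such as fact, the functions are not recursing into themselves and the timings are measuring something else.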

answered Oct 13 '22 15:10

Raymond Hettinger
