Python: why are * and ** faster than / and sqrt()?

While optimising my code I noticed the following:

    >>> from timeit import Timer as T
    >>> T(lambda : 1234567890 / 4.0).repeat()
    [0.22256922721862793, 0.20560789108276367, 0.20530295372009277]
    >>> from __future__ import division
    >>> T(lambda : 1234567890 / 4).repeat()
    [0.14969301223754883, 0.14155197143554688, 0.14141488075256348]
    >>> T(lambda : 1234567890 * 0.25).repeat()
    [0.13619112968444824, 0.1281130313873291, 0.12830305099487305]

and also:

    >>> from math import sqrt
    >>> T(lambda : sqrt(1234567890)).repeat()
    [0.2597470283508301, 0.2498021125793457, 0.24994492530822754]
    >>> T(lambda : 1234567890 ** 0.5).repeat()
    [0.15409398078918457, 0.14059877395629883, 0.14049601554870605]

I assume it has to do with the way Python is implemented in C, but I wonder if anybody would care to explain why this is so?

353

asked Nov 09 '11 16:11

mac

1 Answer

The (somewhat unexpected) reason for your results is that Python seems to fold constant expressions involving floating-point multiplication and exponentiation, but not division. math.sqrt() is a different beast altogether since there's no bytecode for it and it involves a function call.

On Python 2.6.5, the following code:

    x1 = 1234567890.0 / 4.0
    x2 = 1234567890.0 * 0.25
    x3 = 1234567890.0 ** 0.5
    x4 = math.sqrt(1234567890.0)

compiles to the following bytecodes:

      # x1 = 1234567890.0 / 4.0
      4           0 LOAD_CONST               1 (1234567890.0)
                  3 LOAD_CONST               2 (4.0)
                  6 BINARY_DIVIDE
                  7 STORE_FAST               0 (x1)

      # x2 = 1234567890.0 * 0.25
      5          10 LOAD_CONST               5 (308641972.5)
                 13 STORE_FAST               1 (x2)

      # x3 = 1234567890.0 ** 0.5
      6          16 LOAD_CONST               6 (35136.418286444619)
                 19 STORE_FAST               2 (x3)

      # x4 = math.sqrt(1234567890.0)
      7          22 LOAD_GLOBAL              0 (math)
                 25 LOAD_ATTR                1 (sqrt)
                 28 LOAD_CONST               1 (1234567890.0)
                 31 CALL_FUNCTION            1
                 34 STORE_FAST               3 (x4)

As you can see, multiplication and exponentiation take no time at all since they're done when the code is compiled. Division takes longer since it happens at runtime. Square root is not only the most computationally expensive operation of the four, it also incurs various overheads that the others do not (attribute lookup, function call etc).
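To check for constant folding on your own interpreter, something like the following sketch may help. It assumes a Python 3 interpreter, where opcode names and folding rules differ from the 2.6.5 output above, and the helper names `numeric_constants` and `has_runtime_binary_op` are made up for this illustration. It compiles an expression and looks at what ended up in the code object.

```python
import dis


def numeric_constants(source):
    """Return the int/float constants stored in the code object for `source`."""
    code = compile(source, "<example>", "eval")
    return [c for c in code.co_consts if isinstance(c, (int, float))]


def has_runtime_binary_op(source):
    """Return True if the compiled expression still performs a BINARY_* operation at run time."""
    code = compile(source, "<example>", "eval")
    return any(ins.opname.startswith("BINARY_") for ins in dis.get_instructions(code))


if __name__ == "__main__":
    for expr in ("1234567890.0 * 0.25", "x * 0.25"):
        print(expr)
        print("  numeric constants:", numeric_constants(expr))
        print("  runtime binary op:", has_runtime_binary_op(expr))
```

When the expression consists only of literals, the product should show up as a single precomputed constant with no arithmetic instruction left. With a variable operand the arithmetic instruction stays in the bytecode.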

If you eliminate the effect of constant folding, there's little to separate multiplication and division:

    In [16]: x = 1234567890.0

    In [17]: %timeit x / 4.0
    10000000 loops, best of 3: 87.8 ns per loop

    In [18]: %timeit x * 0.25
    10000000 loops, best of 3: 91.6 ns per loop

math.sqrt(x) is actually a little bit faster than x ** 0.5, presumably because it's a special case of the latter and can therefore be done more efficiently, in spite of the overheads:

    In [19]: %timeit x ** 0.5
    1000000 loops, best of 3: 211 ns per loop

    In [20]: %timeit math.sqrt(x)
    10000000 loops, best of 3: 181 ns per loop
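Outside IPython, roughly the same comparison can be made with the standard `timeit` module. This is only a sketch, assuming Python 3.5 or later (for the `globals` argument), and the helper name `best_time` is invented here. The key point is that the operand is a variable, so the compiler has nothing to fold.

```python
import math
import timeit


def best_time(stmt, number=100000, repeat=3):
    """Return the best total time in seconds for `number` runs of `stmt`."""
    # x is passed in as a global variable, so the expression cannot be
    # reduced to a constant at compile time.
    namespace = {"x": 1234567890.0, "math": math}
    return min(timeit.repeat(stmt, globals=namespace, number=number, repeat=repeat))


if __name__ == "__main__":
    for stmt in ("x / 4.0", "x * 0.25", "x ** 0.5", "math.sqrt(x)"):
        print("%-14s %.4f s" % (stmt, best_time(stmt)))
```

Absolute numbers will vary by machine and Python version, so only the relative ordering on a single setup is meaningful.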

edit 2011-11-16: Constant expression folding is done by Python's peephole optimizer. The source code (peephole.c) contains the following comment that explains why constant division isn't folded:

    case BINARY_DIVIDE:
        /* Cannot fold this operation statically since
           the result can depend on the run-time presence
           of the -Qnew flag */
        return 0;

The -Qnew flag enables "true division" defined in PEP 238.
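To see why the result could not be fixed at compile time, here is a small illustration of the two meanings `/` could have for integer operands in Python 2. It is written for Python 3, where `/` always means true division, so `operator.floordiv` is used as a stand-in for what classic division did with two integers. That substitution is an assumption of this sketch, not what the interpreter ran internally.

```python
import operator

a, b = 1234567890, 4

# Classic (pre-PEP 238) division of two ints truncated toward the floor,
# which matches floor division for integer operands.
classic_like = operator.floordiv(a, b)

# True division, as enabled by -Qnew or `from __future__ import division`.
true_div = operator.truediv(a, b)

print(classic_like, true_div)
```

The same source expression gives an integer under one mode and a float under the other, so the optimizer could not know which value to store.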

100

answered Oct 13 '22 05:10

12 revs



Tags: performance, python, c, python-internals, python-2.7
