I can't believe what I just measured:
python3 -m timeit -s "from math import sqrt" "sqrt(2)"
5000000 loops, best of 5: 42.8 nsec per loop
python3 -m timeit "2 ** 0.5"
50000000 loops, best of 5: 4.93 nsec per loop
This goes against any intuition... it should be exactly the opposite!
Python 3.8.3 on macOS Catalina
Python 3 is precomputing the value of `2 ** 0.5` at compile time, since both operands are known at that time. The value of `sqrt`, however, is not known at compile time, so the computation necessarily occurs at run time. You aren't timing how long it takes to compute `2 ** 0.5`, but just the time it takes to load a constant.
A fairer comparison would be
$ python3 -m timeit -s "from math import sqrt" "sqrt(2)"
5000000 loops, best of 5: 50.7 nsec per loop
$ python3 -m timeit -s "x = 2" "x**0.5"
5000000 loops, best of 5: 56.7 nsec per loop
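The same comparison can be reproduced from a script rather than the shell using the `timeit` module; this is just a sketch, and the iteration count is an arbitrary choice:

```python
import timeit

# Same comparison as the CLI runs above, driven from Python.
n = 1_000_000  # arbitrary number of iterations
t_sqrt = timeit.timeit("sqrt(2)", setup="from math import sqrt", number=n)
t_pow = timeit.timeit("x ** 0.5", setup="x = 2", number=n)
print(f"sqrt(2):  {t_sqrt / n * 1e9:.1f} ns per loop")
print(f"x ** 0.5: {t_pow / n * 1e9:.1f} ns per loop")
```

Absolute numbers will differ between machines and Python versions, but with the constant folding defeated by the variable `x`, the two timings land in the same ballpark.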
I'm not sure if there is a way to show unoptimized byte code. Python starts by parsing source code into an abstract syntax tree (AST):
>>> ast.dump(ast.parse("2**0.5"))
'Module(body=[Expr(value=BinOp(left=Num(n=2), op=Pow(), right=Num(n=0.5)))])'
Update: This particular optimization is now applied directly to the abstract syntax tree, so the byte code is generated directly from something like
Module(body=Num(n=1.4142135623730951))
The `ast` module doesn't appear to apply the optimization.
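Even without `dis`, you can see where the folded value ends up: it is stored among the constants of the compiled code object. A minimal sketch:

```python
# ast.parse() still shows the original BinOp, but after compilation
# the folded result appears in the code object's constants.
code = compile("2 ** 0.5", "<string>", "eval")
print(code.co_consts)
```

Exactly which constants appear varies between Python versions (older peephole-based versions also keep the original `2` and `0.5`), but the folded value `1.4142135623730951` is always there.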
The compiler takes the AST and generates unoptimized byte code; in this case, I believe it would look (based on the output of `dis.dis("2**x")` and `dis.dis("x**0.5")`) like
LOAD_CONST 0 (2)
LOAD_CONST 1 (0.5)
BINARY_POWER
RETURN_VALUE
The raw byte code is then subject to modification by the peephole optimizer, which can reduce these 4 instructions to 2, as shown by the `dis` module.
The compiler then generates byte code from the AST.
>>> dis.dis("2**0.5")
1 0 LOAD_CONST 0 (1.4142135623730951)
2 RETURN_VALUE
[While the following paragraph was originally written with the idea of optimizing byte code in mind, the reasoning applies to optimizing the AST as well.]
Since nothing at runtime affects how the two `LOAD_CONST` and the following `BINARY_POWER` instructions are evaluated (for example, there are no name lookups), the peephole optimizer can take this sequence of byte codes, perform the computation of `2**0.5` itself, and replace the first three instructions with a single `LOAD_CONST` instruction that loads the result immediately.
To enhance chepner's answer, here's a proof:
Python 3.5.3 (default, Sep 27 2018, 17:25:39)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> dis.dis('2 ** 0.5')
1 0 LOAD_CONST 2 (1.4142135623730951)
3 RETURN_VALUE
vs.
>>> dis.dis('sqrt(2)')
1 0 LOAD_NAME 0 (sqrt)
3 LOAD_CONST 0 (2)
6 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
9 RETURN_VALUE
>>> dis.dis('44442.3123 ** 0.5')
0 LOAD_CONST 0 (210.81345379268373)
2 RETURN_VALUE
I do not believe that `44442.3123 ** 0.5` is precomputed at compile time. We had better check the AST of the code.
>>> import ast
>>> import math
>>> code = ast.parse("2**2")
>>> ast.dump(code)
'Module(body=[Expr(value=BinOp(left=Num(n=2), op=Pow(), right=Num(n=2)))])'
>>> code = ast.parse("math.sqrt(3)")
>>> ast.dump(code)
"Module(body=[Expr(value=Call(func=Attribute(value=Name(id='math', ctx=Load()), attr='sqrt', ctx=Load()), args=[Num(n=3)], keywords=[]))])"
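To make the contrast with the byte code explicit, one can walk the parsed tree and collect its constants; a sketch, assuming CPython 3.8+ where literals are represented as `ast.Constant` nodes:

```python
import ast

tree = ast.parse("2 ** 0.5", mode="eval")
# Only the original literals appear in the tree; the folded result
# (1.4142135623730951) is produced later, during byte-code compilation.
consts = [node.value for node in ast.walk(tree) if isinstance(node, ast.Constant)]
print(consts)
```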