Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is math.sqrt massively slower than exponentiation?

I can't believe what I just measured:

python3 -m timeit -s "from math import sqrt" "sqrt(2)"
5000000 loops, best of 5: 42.8 nsec per loop

python3 -m timeit "2 ** 0.5"
50000000 loops, best of 5: 4.93 nsec per loop

This goes against any intuition... it shoud be exactly the opposite!

Python 3.8.3 on macOS Catalina

like image 503
Peter Leikauf Avatar asked Jun 17 '20 14:06

Peter Leikauf


People also ask

Is NP sqrt faster than math sqrt?

For scalar math, Python functions are faster than numpy functions. It turns out that the sqrt() function from the standard Python math module is about seven times faster than the corresponding sqrt() function from numpy.

How do you do square root in Python?

sqrt() function is an inbuilt function in Python programming language that returns the square root of any number. Syntax: math. sqrt(x) Parameter: x is any number such that x>=0 Returns: It returns the square root of the number passed in the parameter.


Video Answer


3 Answers

Python 3 is precomputing the value of 2 ** 0.5 at compile time, since both operands are known at that time. The value of sqrt, however, is not known at compile time, so the computation necessarily occurs at run time.

You aren't timing how long it takes to compute 2 ** 0.5, but just the time it takes to load a constant.

A fairer comparison would be

$ python3 -m timeit -s "from math import sqrt" "sqrt(2)"
5000000 loops, best of 5: 50.7 nsec per loop
$ python3 -m timeit -s "x = 2" "x**0.5"
5000000 loops, best of 5: 56.7 nsec per loop

I'm not sure if there is a way to show unoptimized byte code. Python starts by parsing source code into an abstract syntax tree (AST):

>>> ast.dump(ast.parse("2**0.5"))
'Module(body=[Expr(value=BinOp(left=Num(n=2), op=Pow(), right=Num(n=0.5)))])'

Update: This particular optimization is now applied directly to the abstract syntax tree, so the byte code is generated directly from something like

Module(body=Num(n= 1.4142135623730951))

The ast module doesn't appear to apply the optimization.

The compiler takes the AST and generates unoptimized byte code; in this case, I believe it would look (based on the output of dis.dis("2**x") and dis.dis("x**0.5")) like

LOAD_CONST       0  (2)
LOAD_CONST       1  (0.5)
BINARY_POWER
RETURN_VALUE

The raw byte code is then subject to modification by the peephole optimzizer, which can reduce these 4 instructions to 2, as shown by the dis module.

The compiler then generates byte code from the AST.

>>> dis.dis("2**0.5")
  1           0 LOAD_CONST               0 (1.4142135623730951)
              2 RETURN_VALUE

[While the following paragraph was originally written with the idea of optimizing byte code in mind, the reasoning applies to optimizing the AST as well.]

Since nothing at runtime affects how the two LOAD_CONST and following BINARY_POWER instruction are evaluated (for example, there are no name lookups), the peephole optimizer can take this sequence of byte codes, perform the computation of 2**0.5 itself, and replace the first three instructions with a single LOAD_CONST instruction that loads the result immediately.

like image 160
chepner Avatar answered Oct 11 '22 11:10

chepner


To enhance chepner's answer, here's a proof:

Python 3.5.3 (default, Sep 27 2018, 17:25:39) 
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> dis.dis('2 ** 0.5')
  1           0 LOAD_CONST               2 (1.4142135623730951)
              3 RETURN_VALUE

vs.

>>> dis.dis('sqrt(2)')
  1           0 LOAD_NAME                0 (sqrt)
              3 LOAD_CONST               0 (2)
              6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
              9 RETURN_VALUE
like image 25
mkrieger1 Avatar answered Oct 11 '22 09:10

mkrieger1


>>> dis.dis('44442.3123 ** 0.5')
          0 LOAD_CONST               0 (210.81345379268373)
          2 RETURN_VALUE

I do not believe, that 44442.3123 ** 0.5 is precomputed at compile time. We should better check the AST of code.

>>> import ast
>>> import math
>>> code = ast.parse("2**2")
>>> ast.dump(code)
'Module(body=[Expr(value=BinOp(left=Num(n=2), op=Pow(), right=Num(n=2)))])'
>>> code = ast.parse("math.sqrt(3)")
>>> ast.dump(code)
"Module(body=[Expr(value=Call(func=Attribute(value=Name(id='math', ctx=Load()), attr='sqrt', ctx=Load()), args=[Num(n=3)], keywords=[]))])"
like image 2
akostrikov Avatar answered Oct 11 '22 11:10

akostrikov