Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimization of arithmetic expressions - what is this technique called?

A discussion with a friend led to the following realization:

>>> import dis
>>> i = lambda n: n*24*60*60
>>> dis.dis(i)
  1           0 LOAD_FAST                0 (n)
              3 LOAD_CONST               1 (24)
              6 BINARY_MULTIPLY
              7 LOAD_CONST               2 (60)
             10 BINARY_MULTIPLY
             11 LOAD_CONST               2 (60)
             14 BINARY_MULTIPLY
             15 RETURN_VALUE
>>> k = lambda n: 24*60*60*n
>>> dis.dis(k)
  1           0 LOAD_CONST               4 (86400)
              3 LOAD_FAST                0 (n)
              6 BINARY_MULTIPLY
              7 RETURN_VALUE

The second example is clearly more efficient simply by reducing the number of instructions.

My question is, is there a name for this optimization, and why doesn't it happen in the first example?

Also, I'm not sure if this is a duplicate of Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)? ; if it is please explain a bit further as it applies to Python.

like image 475
Burhan Khalid Avatar asked May 12 '16 06:05

Burhan Khalid


People also ask

What is arithmetic expression in programming?

What Does Arithmetic Expression Mean? An arithmetic expression is an expression in code that consists of a numeric value.

What is evaluation of arithmetic expression?

Parentheses may be used in expressions to specify the order of evaluation. Expressions within parentheses are evaluated first. When parentheses are nested, the innermost set of parentheses is evaluated first, and then successively more inclusive parentheses are evaluated.

What are the three ways to represent arithmetic expression?

Infix,prefix and postfix notations are different ways of writing expression. In the 3 ways, the operands occur in the same order but the operators have to be moved.

What is arithmetic expression in data structure?

An arithmetic expression can be written in three different but equivalent notations, i.e., without changing the essence or output of an expression. These notations are − Infix Notation. Prefix (Polish) Notation. Postfix (Reverse-Polish) Notation.


1 Answers

This optimization technique is called constant folding.

The reason for constant folding occurring in the latter code but not in the former is that Python has dynamic typing, and while in mathematics a product of real numbers is commutative and freely associative, it is not so in Python in the general case, because neither do all variables contain real numbers, nor can one know the types beforehand.


Multiplication in Python is left-associative - 24 * 60 * 60 * n behaves like (((24 * 60) * 60) * n), which in turn implicitly executes like

(24).__mul__(60).__mul__(60).__mul__(n)

or

(n).__rmul_((24).__mul__(60).__mul__(60))

whereas n * 24 * 60 * 60 which is (((n * 24) * 60) * 60) can behave like

n.__mul__(24).__mul__(60).__mul__(60)

or

(24).__rmul__(n).__mul__(60).__mul__(60)

Since we cannot know the behaviour of n.__mul__ beforehand, we cannot fold a constant in the latter case. Consider this example of a funny class that returns a subclass of int that defines __mul__/__rmul__ as returning the sum of the operands instead of product:

class MultiplyAsAdd(int):
    def __mul__(self, other):
        return MultiplyAsAdd(self + other)
    def __rmul__(self, other):
        return MultiplyAsAdd(other + self)

Then

>>> (lambda n: 24*60*60*n)(MultiplyAsAdd(5))
86405
>>> (lambda n: n*24*60*60)(MultiplyAsAdd(5))
149

Clearly it'd be wrong for Python to parenthesize the product as n*(24*60*60) in the latter case.