In the following code, why doesn't Python compile f2 to the same bytecode as f1? Is there a reason not to?
>>> import dis
>>> def f1(x):
...     x*100
...
>>> dis.dis(f1)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (100)
              6 BINARY_MULTIPLY
              7 POP_TOP
              8 LOAD_CONST               0 (None)
             11 RETURN_VALUE
>>> def f2(x):
...     x*10*10
...
>>> dis.dis(f2)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (10)
              6 BINARY_MULTIPLY
              7 LOAD_CONST               1 (10)
             10 BINARY_MULTIPLY
             11 POP_TOP
             12 LOAD_CONST               0 (None)
             15 RETURN_VALUE
This is because x could have a __mul__ method with side effects. x * 10 * 10 calls __mul__ twice, while x * 100 only calls it once:
>>> class Foo(object):
...     def __init__(self):
...         self.val = 5
...     def __mul__(self, other):
...         print "Called __mul__: %s" % (other)
...         self.val = self.val * other
...         return self
...
>>> a = Foo()
>>> a * 10 * 10
Called __mul__: 10
Called __mul__: 10
<__main__.Foo object at 0x1017c4990>
Automatically folding the constants and only calling __mul__ once could change behavior.
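To make the difference observable side by side, here is a minimal sketch in Python 3 syntax. CountingMul is a hypothetical class invented for this illustration; it only records the arguments its __mul__ receives.

```python
class CountingMul:
    """Hypothetical operand that records every __mul__ call it receives."""

    def __init__(self):
        self.calls = []

    def __mul__(self, other):
        self.calls.append(other)
        return self  # return self so a chained multiplication hits the same object


chained = CountingMul()
chained * 10 * 10   # evaluated as (chained * 10) * 10

single = CountingMul()
single * 100

print(chained.calls)  # [10, 10]
print(single.calls)   # [100]
```

The two expressions leave the object in different states, so a compiler that rewrote one into the other would change what the program does.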
You can get the optimization you want by reordering the operation so that the constants are multiplied first (or, as mentioned in the comments, by using parentheses to group the constants, wherever they appear in the expression). Either way, you make explicit that you want the folding to happen:
>>> def f1(x):
...     return 10 * 10 * x
...
>>> dis.dis(f1)
  2           0 LOAD_CONST               2 (100)
              3 LOAD_FAST                0 (x)
              6 BINARY_MULTIPLY
              7 RETURN_VALUE
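If you would rather check for folding without reading disassembly, one rough approach is to look at the constants stored in the compiled code object. The sketch below assumes CPython 3; constants_of is a helper name made up for this example, and the exact contents of co_consts are an implementation detail that can vary between versions.

```python
def constants_of(expr):
    """Return the constants stored in the code object compiled from expr."""
    return compile(expr, "<example>", "eval").co_consts


folded_left = constants_of("10 * 10 * x")     # constants come first
folded_paren = constants_of("x * (10 * 10)")  # constants grouped by parentheses
unfolded = constants_of("x * 10 * 10")        # parsed as (x * 10) * 10

# On CPython, a folded expression carries the precomputed 100.
print(100 in folded_left)
print(100 in folded_paren)
print(100 in unfolded)
```

On CPython the first two checks should print True and the last one False, which matches the disassembly above: 100 only appears when the two constants form a subexpression of their own.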
Python evaluates expressions from left to right. For f2(), this means it will first evaluate x*10 and then multiply the result by 10. Try:
def f2(x):
    10*10*x
This should be optimized, because the two constants are now multiplied together before x is involved.
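The grouping can be confirmed by inspecting the parse tree with the standard ast module. This is only an illustrative sketch in Python 3; the variable names outer and inner are chosen for the example.

```python
import ast

tree = ast.parse("x * 10 * 10", mode="eval")
outer = tree.body   # the outermost multiplication: (...) * 10
inner = outer.left  # its left operand

print(type(inner).__name__)       # BinOp: x * 10 is its own subexpression
print(type(inner.left).__name__)  # Name: the variable x
```

Because the left operand of the outer multiplication is itself the multiplication x * 10, the two 10s never form a constant-only subexpression that could be folded.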