Why is 2 * x * x faster than 2 * ( x * x ) in Python 3.x, for integers?

Tags: performance, python, python-3.x, benchmarking, integer-arithmetic

The following Python 3.x integer multiplication takes on average between 1.66s and 1.77s:

    import time

    start_time = time.time()
    num = 0
    for x in range(0, 10000000):
        # num += 2 * (x * x)
        num += 2 * x * x
    print("--- %s seconds ---" % (time.time() - start_time))

If I replace 2 * x * x with 2 * (x * x), it takes between 2.04s and 2.25s. How come?

On the other hand, it is the opposite in Java: 2 * (x * x) is faster there. See the related Java question: Why is 2 * (i * i) faster than 2 * i * i in Java?

I ran each version of the program 10 times, here are the results.

       2 * x * x        |   2 * (x * x)
    ---------------------------------------
    1.7717654705047607  | 2.0789272785186768
    1.735931396484375   | 2.1166207790374756
    1.7093875408172607  | 2.024367570877075
    1.7004504203796387  | 2.047525405883789
    1.6676218509674072  | 2.254328966140747
    1.699510097503662   | 2.0949244499206543
    1.6889283657073975  | 2.0841963291168213
    1.7243537902832031  | 2.1290600299835205
    1.712965488433838   | 2.1942825317382812
    1.7622807025909424  | 2.1200053691864014
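For anyone who wants to reproduce the comparison with less measurement noise, here is a minimal sketch using the standard timeit module instead of time.time(). The helper name time_expression and the smaller default loop size are my own choices for illustration, not part of the original test, and absolute timings will differ by machine and Python version.

```python
import timeit


def time_expression(expr, n=1_000_000, repeats=3):
    """Return the best time in seconds for accumulating `expr` over range(n).

    `expr` is a string such as "2 * x * x"; it is pasted into the loop body.
    Taking the minimum over several repeats reduces noise from other processes.
    """
    stmt = f"num = 0\nfor x in range({n}):\n    num += {expr}"
    return min(timeit.repeat(stmt, number=1, repeat=repeats))


if __name__ == "__main__":
    for expr in ("2 * x * x", "2 * (x * x)"):
        print(f"{expr:12s} {time_expression(expr):.3f} s")
```

Taking the minimum rather than the mean is a common choice for micro-benchmarks, since slower runs are usually caused by interference rather than by the code under test.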


asked Dec 01 '18 12:12

Waqas Gondal

1 Answer

First of all, note that we don't see the same thing in Python 2.x:

    >>> from timeit import timeit
    >>> timeit("for i in range(1000): 2*i*i")
    51.00784397125244
    >>> timeit("for i in range(1000): 2*(i*i)")
    50.48330092430115

So this leads us to believe that this is due to how integers changed in Python 3: specifically, Python 3 uses long (arbitrarily large integers) everywhere.

For small enough integers (including the ones we're considering here), CPython actually just uses the O(MN) grade-school digit by digit multiplication algorithm (for larger integers it switches to the Karatsuba algorithm). You can see this yourself in the source.
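To make the O(MN) cost concrete, here is a simplified pure-Python model of grade-school multiplication on base-2**30 digits. It is only a sketch for counting digit-by-digit multiplications: the function names are hypothetical, the 30-bit digit size is assumed, and CPython's real C implementation has fast paths and other details that this ignores.

```python
BASE_BITS = 30                  # assumed digit size, as on typical 64-bit builds
MASK = (1 << BASE_BITS) - 1


def to_digits(n):
    """Split a non-negative int into little-endian base-2**30 digits."""
    digits = []
    while True:
        digits.append(n & MASK)
        n >>= BASE_BITS
        if n == 0:
            return digits


def from_digits(digits):
    """Inverse of to_digits."""
    n = 0
    for d in reversed(digits):
        n = (n << BASE_BITS) | d
    return n


def schoolbook_multiply(a, b):
    """Multiply two digit lists.

    Returns (product digits, number of digit-by-digit multiplications).
    The step count is always len(a) * len(b), which is the O(MN) cost.
    """
    result = [0] * (len(a) + len(b))
    steps = 0
    for i, da in enumerate(a):
        carry = 0
        for j, db in enumerate(b):
            total = result[i + j] + da * db + carry
            result[i + j] = total & MASK
            carry = total >> BASE_BITS
            steps += 1
        result[i + len(b)] += carry
    # Drop leading zero digits so the length reflects the actual size.
    while len(result) > 1 and result[-1] == 0:
        result.pop()
    return result, steps


x = 123456
left, s1 = schoolbook_multiply(to_digits(2), to_digits(x))   # 2 * x
left, s2 = schoolbook_multiply(left, to_digits(x))           # (2 * x) * x
square, s3 = schoolbook_multiply(to_digits(x), to_digits(x)) # x * x
right, s4 = schoolbook_multiply(to_digits(2), square)        # 2 * (x * x)

assert from_digits(left) == from_digits(right) == 2 * x * x
print("2*x*x  :", s1 + s2, "digit multiplications")   # 2
print("2*(x*x):", s3 + s4, "digit multiplications")   # 3
```

In this model the parenthesized form needs one extra digit multiplication because x*x no longer fits in a single 30-bit digit.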

The number of digits in x*x is roughly twice that of 2*x or x (since log(x²) = 2 log(x)). Note that a "digit" in this context is not a base-10 digit, but a 30-bit value (which are treated as single digits in CPython's implementation). Hence, 2 is a single-digit value, and x and 2*x are single-digit values for all iterations of the loop, but x*x is two-digit for x >= 2**15. Hence, for x >= 2**15, 2*x*x only requires single-by-single-digit multiplications whereas 2*(x*x) requires a single-by-single and a single-by-double-digit multiplication (since x*x has 2 30-bit digits).
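The digit counts can be checked from Python itself. The sketch below derives the number of internal digits from int.bit_length() and sys.int_info.bits_per_digit; digit_count is a hypothetical helper of mine, and it estimates the count from the bit length rather than reading CPython's internal representation.

```python
import sys

# 30 on typical 64-bit CPython builds; some builds use 15 instead.
BITS_PER_DIGIT = sys.int_info.bits_per_digit


def digit_count(n):
    """Number of internal digits needed to hold abs(n) (at least 1)."""
    bits = abs(n).bit_length()
    return max(1, (bits + BITS_PER_DIGIT - 1) // BITS_PER_DIGIT)


print("bits per digit:", BITS_PER_DIGIT)
print("x, digits(x), digits(2*x), digits(x*x)")
for x in (1000, 2**15 - 1, 2**15, 9_999_999):
    print(x, digit_count(x), digit_count(2 * x), digit_count(x * x))
```

On a build with 30-bit digits, the last column should change from 1 to 2 exactly at x = 2**15, while the other two columns stay at 1 for the whole loop range.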

Here's a direct way to see this (Python 3):

    >>> from timeit import timeit
    >>> timeit("a*b", "a,b = 2, 123456**2", number=100000000)
    5.796971936999967
    >>> timeit("a*b", "a,b = 2*123456, 123456", number=100000000)
    4.3559221399999615

Again, compare this to Python 2, which doesn't use arbitrary-length integers everywhere:

    >>> timeit("a*b", "a,b = 2, 123456**2", number=100000000)
    3.0912468433380127
    >>> timeit("a*b", "a,b = 2*123456, 123456", number=100000000)
    3.1120400428771973

(One interesting note: If you look at the source, you'll see that the algorithm actually has a special case for squaring numbers (which we're doing here), but even still this is not enough to overcome the fact that 2*(x*x) just requires processing more digits.)

answered Sep 28 '22 04:09

arshajii
