Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is divmod() faster than using the % and // operators?

I remember from assembly that integer division instructions yield both the quotient and remainder. So, in python will the built-in divmod() function be better performance-wise than using the % and // operators (suppose of course one needs both the quotient and the remainder)?

q, r = divmod(n, d) q, r = (n // d, n % d) 
like image 657
smichak Avatar asked May 06 '15 14:05

smichak


People also ask

Is divmod faster?

when dividing a 22-million-digit number, divmod is almost exactly twice as fast as doing the division and modulus separately, as you might expect.

What does divmod () do in Python?

Python divmod() Function The divmod() function returns a tuple containing the quotient and the remainder when argument1 (dividend) is divided by argument2 (divisor).

What is divmod in Python example?

The divmod() is part of python's standard library which takes two numbers as parameters and gives the quotient and remainder of their division as a tuple. It is useful in many mathematical applications like checking for divisibility of numbers and establishing if a number is prime or not.


2 Answers

To measure is to know (all timings on a Macbook Pro 2.8Ghz i7):

>>> import sys, timeit >>> sys.version_info sys.version_info(major=2, minor=7, micro=12, releaselevel='final', serial=0) >>> timeit.timeit('divmod(n, d)', 'n, d = 42, 7') 0.1473848819732666 >>> timeit.timeit('n // d, n % d', 'n, d = 42, 7') 0.10324406623840332 

The divmod() function is at a disadvantage here because you need to look up the global each time. Binding it to a local (all variables in a timeit time trial are local) improves performance a little:

>>> timeit.timeit('dm(n, d)', 'n, d = 42, 7; dm = divmod') 0.13460898399353027 

but the operators still win because they don't have to preserve the current frame while a function call to divmod() is executed:

>>> import dis >>> dis.dis(compile('divmod(n, d)', '', 'exec'))   1           0 LOAD_NAME                0 (divmod)               3 LOAD_NAME                1 (n)               6 LOAD_NAME                2 (d)               9 CALL_FUNCTION            2              12 POP_TOP                           13 LOAD_CONST               0 (None)              16 RETURN_VALUE         >>> dis.dis(compile('(n // d, n % d)', '', 'exec'))   1           0 LOAD_NAME                0 (n)               3 LOAD_NAME                1 (d)               6 BINARY_FLOOR_DIVIDE                7 LOAD_NAME                0 (n)              10 LOAD_NAME                1 (d)              13 BINARY_MODULO                     14 BUILD_TUPLE              2              17 POP_TOP                           18 LOAD_CONST               0 (None)              21 RETURN_VALUE         

The // and % variant uses more opcodes, but the CALL_FUNCTION bytecode is a bear, performance wise.


In PyPy, for small integers there isn't really much of a difference; the small speed advantage the opcodes have melts away under the sheer speed of C integer arithmetic:

>>>> import platform, sys, timeit >>>> platform.python_implementation(), sys.version_info ('PyPy', (major=2, minor=7, micro=10, releaselevel='final', serial=42)) >>>> timeit.timeit('divmod(n, d)', 'n, d = 42, 7', number=10**9) 0.5659301280975342 >>>> timeit.timeit('n // d, n % d', 'n, d = 42, 7', number=10**9) 0.5471200942993164 

(I had to crank the number of repetitions up to 1 billion to show how small the difference really is, PyPy is blazingly fast here).

However, when the numbers get large, divmod() wins by a country mile:

>>>> timeit.timeit('divmod(n, d)', 'n, d = 2**74207281 - 1, 26', number=100) 17.620037078857422 >>>> timeit.timeit('n // d, n % d', 'n, d = 2**74207281 - 1, 26', number=100) 34.44323515892029 

(I now had to tune down the number of repetitions by a factor of 10 compared to hobbs' numbers, just to get a result in a reasonable amount of time).

This is because PyPy no longer can unbox those integers as C integers; you can see the striking difference in timings between using sys.maxint and sys.maxint + 1:

>>>> timeit.timeit('divmod(n, d)', 'import sys; n, d = sys.maxint, 26', number=10**7) 0.008622884750366211 >>>> timeit.timeit('n // d, n % d', 'import sys; n, d = sys.maxint, 26', number=10**7) 0.007693052291870117 >>>> timeit.timeit('divmod(n, d)', 'import sys; n, d = sys.maxint + 1, 26', number=10**7) 0.8396248817443848 >>>> timeit.timeit('n // d, n % d', 'import sys; n, d = sys.maxint + 1, 26', number=10**7) 1.0117690563201904 
like image 64
Martijn Pieters Avatar answered Sep 20 '22 03:09

Martijn Pieters


Martijn's answer is correct if you're using "small" native integers, where arithmetic operations are very fast compared to function calls. However, with bigints, it's a whole different story:

>>> import timeit >>> timeit.timeit('divmod(n, d)', 'n, d = 2**74207281 - 1, 26', number=1000) 24.22666597366333 >>> timeit.timeit('n // d, n % d', 'n, d = 2**74207281 - 1, 26', number=1000) 49.517399072647095 

when dividing a 22-million-digit number, divmod is almost exactly twice as fast as doing the division and modulus separately, as you might expect.

On my machine, the crossover occurs somewhere around 2^63, but don't take my word for it. As Martijn says, measure! When performance really matters, don't assume that what held true in one place will still be true in another.

like image 27
hobbs Avatar answered Sep 20 '22 03:09

hobbs