I wanted to see how much faster reduce was than using a for loop for simple numerical operations. Here's what I found (using the standard timeit library):
In [54]: print(setup)
from operator import add, iadd
r = range(100)
In [55]: print(stmt1)
c = 0
for i in r:
c+=i
In [56]: timeit(stmt1, setup)
Out[56]: 8.948904991149902
In [58]: print(stmt3)
reduce(add, r)
In [59]: timeit(stmt3, setup)
Out[59]: 13.316915035247803
Looking a little more:
In [68]: timeit("1+2", setup)
Out[68]: 0.04145693778991699
In [69]: timeit("add(1,2)", setup)
Out[69]: 0.22807812690734863
What's going on here? Obviously, reduce does loop faster than for, but the function call seems to dominate. Shouldn't the reduce version run almost entirely in C? Using iadd(c,i) in the for loop version makes it run in ~24 seconds. Why would using operator.add be so much slower than +? I was under the impression + and operator.add run the same C code (I checked to make sure operator.add wasn't just calling + in python or anything).
BTW, just using sum runs in ~2.3 seconds.
In [70]: print(sys.version)
2.7.1 (r271:86882M, Nov 30 2010, 09:39:13)
[GCC 4.0.1 (Apple Inc. build 5494)]
The reduce(add, r)
must invoke the add()
function 100 times, so the overhead of the function calls adds up -- reduce uses PyEval_CallObject
to invoke add
on each iteration:
for (;;) {
...
if (result == NULL)
result = op2;
else {
# here it is creating a tuple to pass the previous result and the next
# value from range(100) into func add():
PyTuple_SetItem(args, 0, result);
PyTuple_SetItem(args, 1, op2);
if ((result = PyEval_CallObject(func, args)) == NULL)
goto Fail;
}
Updated: Response to question in comments.
When you type 1 + 2
in Python source code, the bytecode compiler performs the addition in place and replaces that expression with 3
:
f1 = lambda: 1 + 2
c1 = byteplay.Code.from_code(f1.func_code)
print c1.code
1 1 LOAD_CONST 3
2 RETURN_VALUE
If you add two variables a + b
the compiler will generate bytecode which loads the two variables and performs a BINARY_ADD, which is far faster than calling a function to perform the addition:
f2 = lambda a, b: a + b
c2 = byteplay.Code.from_code(f2.func_code)
print c2.code
1 1 LOAD_FAST a
2 LOAD_FAST b
3 BINARY_ADD
4 RETURN_VALUE
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With