I have found an interesting performance optimization. Instead of using all()
:
def matches(self, item):
return all(c.applies(item) for c in self.conditions)
I have profiled that it's faster when a loop is used instead:
def matches(self, item):
for condition in self.conditions:
if not condition.applies(item):
return False
return True
With all()
the profiler shows 1160 additional <genexpr>
calls:
4608 function calls (4600 primitive calls) in 0.015 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
580 0.002 0.000 0.008 0.000 rule.py:23(matches)
1160 0.002 0.000 0.005 0.000 rule.py:28(<genexpr>)
With the for
loop, there are no <genexpr>
calls:
2867 function calls (2859 primitive calls) in 0.012 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
580 0.002 0.000 0.006 0.000 rule.py:23(matches)
My question is where does the difference come from? My first though was that all
evaluates all conditions, but that's not the case:
def foo():
print('foo')
return False
all(foo() for _ in range(1000))
foo
To get the next sequence in a generator, next
is called on the generator object, so each iteration will have this call. The for loop does not incur this cost, so may be slightly faster in this case. Here, the generator does not do much of anything for you in the way of performance.
My first though was that all evaluates all conditions
all
will stop as soon as a falsy value is encountered. In contrast any
stops on the first truthy value; that may help frame the meaning of all
better.
Consider the following functions
def genfunc():
return all(i for i in range(1, 100))
def forfunc():
for i in range(1, 100):
if not i:
return False
return True
If we use dis.dis
to see what's going on...
dis.dis(genfunc)
0 LOAD_GLOBAL 0 (all)
2 LOAD_CONST 1 (<code object <genexpr> at 0x04D4E5A0, file "<ipython-input-2-60c0c9eff4e2>", line 2>)
4 LOAD_CONST 2 ('genfunc.<locals>.<genexpr>')
6 MAKE_FUNCTION 0
8 LOAD_GLOBAL 1 (range)
10 LOAD_CONST 3 (1)
12 LOAD_CONST 4 (100)
14 CALL_FUNCTION 2
16 GET_ITER
18 CALL_FUNCTION 1
20 CALL_FUNCTION 1
22 RETURN_VALUE
and in the for loop version..
dis.dis(forfunc)
0 SETUP_LOOP 26 (to 28)
2 LOAD_GLOBAL 0 (range)
4 LOAD_CONST 1 (1)
6 LOAD_CONST 2 (100)
8 CALL_FUNCTION 2
10 GET_ITER
>> 12 FOR_ITER 12 (to 26)
14 STORE_FAST 0 (i)
16 LOAD_FAST 0 (i)
18 POP_JUMP_IF_TRUE 12
20 LOAD_CONST 3 (False)
22 RETURN_VALUE
24 JUMP_ABSOLUTE 12
>> 26 POP_BLOCK
>> 28 LOAD_CONST 4 (True)
30 RETURN_VALUE
You'll note that in the generator expression version there are 2 additional function calls (CALL_FUNCTION
). This accounts for the additional 1160 calls (2 calls for each of the 580 loops) you see in the generator expression.version.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With