I'm not sure if this should go in cs or programmers stack exchange instead, so if it should please let me know. I know that in other languages, continue
can become problematic because it essentially performs an unconditional branch, which causes performance issues on pipelined machines because in order to pipeline effectively, we need to be able to know what instruction comes next, which unconditional branching can prevent. Which could be prevented by using more solid logic. That being said, I know that in Python, continue
can improve performance within a loop, however, I was hoping for more than anecdotal evidence to support that. Can anyone elaborate on why that is the case in Python? Also, is the use of continue
considered "proper" logic (i.e. Pythonic) within Python? Thanks!
I'm also curious about the performance impact. As mentioned above, in other languages, continue
is basically an unconditional branch. Obviously this can cause performance issues on pipelined machines. So I'm curious, is there a an inherent performance overhead when using continue
in Python as well, even if there is a performance gain when used within a loop?
Here in the example code (an attempted rework of insertion sort, ignore the overall logic, this stemmed from some experimentation to attempt get better than O(n2) performance), if we have a case where we can determine where the next element should go (at the head or tail), we can easily put it there and ignore the rest of the loop using the continue
statement
def insertion_sort(array):
sublist = [array[0]]
for i in range(1, len(array)):
if array[i] >= sublist[i-1]:
sublist.append(array[i])
continue #possible performance loss here
if array[i] <= sublist[0]:
sublist.insert(0, array[i])
continue #possible performance loss here
for j in reversed(range(0, i)):
if array[i] >= sublist[j-1] and array [i] <=sublist[j]:
sublist.insert(j, array[i])
break #possible performance loss here
return sublist
In this case you're using continue
as a means of "selection" in the case of mutually-exclusive conditions. Consider the following simpler examples...
def f1():
for i in range(10):
if i < 5:
print 'i is less than 5'
if i > 5:
print 'i is greater than 5'
def f2():
for i in range(10):
if i < 5:
print 'i is less than 5'
continue
if i > 5:
print 'i is greater than 5'
The two functions will produce identical outputs, but in the first case, given that we know that i
cannot be both greater than and less than 5 at the same time, the first function performs an unnecessary comparison, and will likely take longer to execute.
The second example is more commonly expressed as...
def f3():
for i in range(10):
if i < 5:
print 'i is less than 5'
elif i > 5:
print 'i is greater than 5'
As for which is more performant, you can take a look at the Python dis
module, which for f2()
yields...
0 SETUP_LOOP 63 (to 66)
3 LOAD_GLOBAL 0 (range)
6 LOAD_CONST 1 (10)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 49 (to 65)
16 STORE_FAST 0 (i)
19 LOAD_FAST 0 (i)
22 LOAD_CONST 2 (5)
25 COMPARE_OP 0 (<)
28 POP_JUMP_IF_FALSE 42
31 LOAD_CONST 3 ('i is less than 5')
34 PRINT_ITEM
35 PRINT_NEWLINE
36 JUMP_ABSOLUTE 13
39 JUMP_FORWARD 0 (to 42)
>> 42 LOAD_FAST 0 (i)
45 LOAD_CONST 2 (5)
48 COMPARE_OP 4 (>)
51 POP_JUMP_IF_FALSE 13
54 LOAD_CONST 4 ('i is greater than 5')
57 PRINT_ITEM
58 PRINT_NEWLINE
59 JUMP_ABSOLUTE 13
62 JUMP_ABSOLUTE 13
>> 65 POP_BLOCK
>> 66 LOAD_CONST 0 (None)
69 RETURN_VALUE
...and for f3()
...
0 SETUP_LOOP 60 (to 63)
3 LOAD_GLOBAL 0 (range)
6 LOAD_CONST 1 (10)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 46 (to 62)
16 STORE_FAST 0 (i)
19 LOAD_FAST 0 (i)
22 LOAD_CONST 2 (5)
25 COMPARE_OP 0 (<)
28 POP_JUMP_IF_FALSE 39
31 LOAD_CONST 3 ('i is less than 5')
34 PRINT_ITEM
35 PRINT_NEWLINE
36 JUMP_ABSOLUTE 13
>> 39 LOAD_FAST 0 (i)
42 LOAD_CONST 2 (5)
45 COMPARE_OP 4 (>)
48 POP_JUMP_IF_FALSE 13
51 LOAD_CONST 4 ('i is greater than 5')
54 PRINT_ITEM
55 PRINT_NEWLINE
56 JUMP_ABSOLUTE 13
59 JUMP_ABSOLUTE 13
>> 62 POP_BLOCK
>> 63 LOAD_CONST 0 (None)
66 RETURN_VALUE
...the only significant difference being 39 JUMP_FORWARD
in the first example which can never actually be executed, due to a preceding unconditional jump, but using this to decide which is 'better' is taking optimization to a ridiculous level.
The two examples f2()
and f3()
only really differ stylistically. Consider...
def f4():
while 1:
if expr1:
do_something()
else:
if expr2:
do_something_else()
if expr3:
do_a_different_thing()
def f5():
while 1:
if expr1:
do_something()
continue
if expr2:
do_something_else()
if expr3:
do_a_different_thing()
...which are functionally identical, but f5()
minimizes the indentation level of nested blocks, which can be more desirable for narrower fixed-width displays. As for which is more 'readable', that's purely subjective.
Back to your code, you could have equally expressed it as...
def insertion_sort(array):
sublist = [array[0]]
for i in range(1, len(array)):
if array[i] >= sublist[i-1]:
sublist.append(array[i])
elif array[i] <= sublist[0]:
sublist.insert(0, array[i])
else:
for j in reversed(range(0, i)):
if array[i] >= sublist[j-1] and array [i] <=sublist[j]:
sublist.insert(j, array[i])
break #possible performance loss here
return sublist
...which would eliminate the continue
s, but as the dis
module demonstrates, there is no performance advantage in doing so. As for the break
, as far as I can tell, that's required for the algorithm to function correctly.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With