Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pythonic/Performance of continue in Python [closed]

I'm not sure if this should go in cs or programmers stack exchange instead, so if it should please let me know. I know that in other languages, continue can become problematic because it essentially performs an unconditional branch, which causes performance issues on pipelined machines because in order to pipeline effectively, we need to be able to know what instruction comes next, which unconditional branching can prevent. Which could be prevented by using more solid logic. That being said, I know that in Python, continue can improve performance within a loop, however, I was hoping for more than anecdotal evidence to support that. Can anyone elaborate on why that is the case in Python? Also, is the use of continue considered "proper" logic (i.e. Pythonic) within Python? Thanks!

Update

I'm also curious about the performance impact. As mentioned above, in other languages, continue is basically an unconditional branch. Obviously this can cause performance issues on pipelined machines. So I'm curious, is there a an inherent performance overhead when using continue in Python as well, even if there is a performance gain when used within a loop?

Here in the example code (an attempted rework of insertion sort, ignore the overall logic, this stemmed from some experimentation to attempt get better than O(n2) performance), if we have a case where we can determine where the next element should go (at the head or tail), we can easily put it there and ignore the rest of the loop using the continue statement

def insertion_sort(array):
    sublist = [array[0]]
    for i in range(1, len(array)):
        if array[i] >= sublist[i-1]:
            sublist.append(array[i])
            continue    #possible performance loss here
        if array[i] <= sublist[0]:
            sublist.insert(0, array[i])
            continue #possible performance loss here
        for j in reversed(range(0, i)):
            if array[i] >= sublist[j-1] and array [i] <=sublist[j]:
                sublist.insert(j, array[i])
                break #possible performance loss here
    return sublist
like image 695
AndrewSmiley Avatar asked Mar 12 '23 22:03

AndrewSmiley


1 Answers

In this case you're using continue as a means of "selection" in the case of mutually-exclusive conditions. Consider the following simpler examples...

def f1():
    for i in range(10):
        if i < 5:
            print 'i is less than 5'
        if i > 5:
            print 'i is greater than 5'

def f2():
    for i in range(10):
        if i < 5:
            print 'i is less than 5'
            continue
        if i > 5:
            print 'i is greater than 5'

The two functions will produce identical outputs, but in the first case, given that we know that i cannot be both greater than and less than 5 at the same time, the first function performs an unnecessary comparison, and will likely take longer to execute.

The second example is more commonly expressed as...

def f3():
    for i in range(10):
        if i < 5:
            print 'i is less than 5'
        elif i > 5:
            print 'i is greater than 5'

As for which is more performant, you can take a look at the Python dis module, which for f2() yields...

       0 SETUP_LOOP              63 (to 66)
       3 LOAD_GLOBAL              0 (range)
       6 LOAD_CONST               1 (10)
       9 CALL_FUNCTION            1
      12 GET_ITER
 >>   13 FOR_ITER                49 (to 65)
      16 STORE_FAST               0 (i)

      19 LOAD_FAST                0 (i)
      22 LOAD_CONST               2 (5)
      25 COMPARE_OP               0 (<)
      28 POP_JUMP_IF_FALSE       42

      31 LOAD_CONST               3 ('i is less than 5')
      34 PRINT_ITEM
      35 PRINT_NEWLINE

      36 JUMP_ABSOLUTE           13
      39 JUMP_FORWARD             0 (to 42)

 >>   42 LOAD_FAST                0 (i)
      45 LOAD_CONST               2 (5)
      48 COMPARE_OP               4 (>)
      51 POP_JUMP_IF_FALSE       13

      54 LOAD_CONST               4 ('i is greater than 5')
      57 PRINT_ITEM
      58 PRINT_NEWLINE
      59 JUMP_ABSOLUTE           13
      62 JUMP_ABSOLUTE           13
 >>   65 POP_BLOCK
 >>   66 LOAD_CONST               0 (None)
      69 RETURN_VALUE

...and for f3()...

       0 SETUP_LOOP              60 (to 63)
       3 LOAD_GLOBAL              0 (range)
       6 LOAD_CONST               1 (10)
       9 CALL_FUNCTION            1
      12 GET_ITER
 >>   13 FOR_ITER                46 (to 62)
      16 STORE_FAST               0 (i)

      19 LOAD_FAST                0 (i)
      22 LOAD_CONST               2 (5)
      25 COMPARE_OP               0 (<)
      28 POP_JUMP_IF_FALSE       39

      31 LOAD_CONST               3 ('i is less than 5')
      34 PRINT_ITEM
      35 PRINT_NEWLINE
      36 JUMP_ABSOLUTE           13

 >>   39 LOAD_FAST                0 (i)
      42 LOAD_CONST               2 (5)
      45 COMPARE_OP               4 (>)
      48 POP_JUMP_IF_FALSE       13

      51 LOAD_CONST               4 ('i is greater than 5')
      54 PRINT_ITEM
      55 PRINT_NEWLINE
      56 JUMP_ABSOLUTE           13
      59 JUMP_ABSOLUTE           13
 >>   62 POP_BLOCK
 >>   63 LOAD_CONST               0 (None)
      66 RETURN_VALUE

...the only significant difference being 39 JUMP_FORWARD in the first example which can never actually be executed, due to a preceding unconditional jump, but using this to decide which is 'better' is taking optimization to a ridiculous level.

The two examples f2() and f3() only really differ stylistically. Consider...

def f4():
    while 1:
        if expr1:
           do_something()

        else:
            if expr2:
                do_something_else()

            if expr3:
                do_a_different_thing()

def f5():
    while 1:
        if expr1:
           do_something()
           continue

        if expr2:
            do_something_else()

        if expr3:
            do_a_different_thing()

...which are functionally identical, but f5() minimizes the indentation level of nested blocks, which can be more desirable for narrower fixed-width displays. As for which is more 'readable', that's purely subjective.


Back to your code, you could have equally expressed it as...

def insertion_sort(array):
    sublist = [array[0]]
    for i in range(1, len(array)):
        if array[i] >= sublist[i-1]:
            sublist.append(array[i])
        elif array[i] <= sublist[0]:
            sublist.insert(0, array[i])
        else:
            for j in reversed(range(0, i)):
                if array[i] >= sublist[j-1] and array [i] <=sublist[j]:
                    sublist.insert(j, array[i])
                    break #possible performance loss here
    return sublist

...which would eliminate the continues, but as the dis module demonstrates, there is no performance advantage in doing so. As for the break, as far as I can tell, that's required for the algorithm to function correctly.

like image 111
Aya Avatar answered Mar 21 '23 03:03

Aya