Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Comparing list comprehensions and explicit loops (3 array generators faster than 1 for loop)

I did homework and I accidentally found a strange inconsistency in the speed of the algorithm. Here is 2 versions of code of same function bur with 1 difference: in first version i use 3 times array generator to filter some array and in second version i use 1 for loop with 3 if statements to do same filter work.

So, here is code of 1st version:

def kth_order_statistic(array, k):
    pivot = (array[0] + array[len(array) - 1]) // 2
    l = [x for x in array if x < pivot]
    m = [x for x in array if x == pivot]
    r = [x for x in array if x > pivot]
    if k <= len(l):
            return kth_order_statistic(l, k)
    elif k > len(l) + len(m):
            return kth_order_statistic(r, k - len(l) - len(m))
    else:
            return m[0]

And here code of 2nd version:

def kth_order_statistic2(array, k):
    pivot = (array[0] + array[len(array) - 1]) // 2
    l = []
    m = []
    r = []
    for x in array:
        if x < pivot:
            l.append(x)
        elif x > pivot:
            r.append(x)
        else:
            m.append(x)

    if k <= len(l):
        return kth_order_statistic2(l, k)
    elif k > len(l) + len(m):
        return kth_order_statistic2(r, k - len(l) - len(m))
    else:
        return m[0]

IPython output for 1st version:

In [4]: %%timeit
   ...: A = range(100000)
   ...: shuffle(A)
   ...: k = randint(1, len(A)-1)
   ...: order_statisctic(A, k)
   ...:
10 loops, best of 3: 120 ms per loop

And for 2nd version:

In [5]: %%timeit
   ...: A = range(100000)
   ...: shuffle(A)
   ...: k = randint(1, len(A)-1)
   ...: kth_order_statistic2(A, k)
   ...:
10 loops, best of 3: 169 ms per loop

So why first version is faster than second? I also made third version wich using filter() function instead of array generator and it was slower than second version (it got 218 ms per loop)

like image 326
KgOfHedgehogs Avatar asked Sep 15 '16 19:09

KgOfHedgehogs


1 Answers

Using simple for is faster than list comprehesion. It is almost 2 times faster. Check below results:

Using list comprehension: 58 usec

moin@moin-pc:~$ python -m timeit "[i for i in range(1000)]"
10000 loops, best of 3: 58 usec per loop

Using for loop: 37.1 usec

moin@moin-pc:~$ python -m timeit "for i in range(1000): i"
10000 loops, best of 3: 37.1 usec per loop

But in your case, for is taking more time than list comprehension not because YOUR for loop is slow. But because of .append() you are using within the code.

With append() in for loop`: 114 usec

moin@moin-pc:~$ python -m timeit "my_list = []" "for i in range(1000): my_list.append(i)"
10000 loops, best of 3: 114 usec per loop

Which clearly shows that it is .append() which is taking twice the time taken by for loop.

However, on storing the "list.append" in different variable: 69.3 usec

moin@moin-pc:~$ python -m timeit "my_list = []; append = my_list.append" "for i in range(1000): append(i)"
10000 loops, best of 3: 69.3 usec per loop

There is a huge improvement in the performance as compared to the last case in above comparisons, and result is quite comparable to that of list comprehension. That means, instead of calling my_list.append() each time, the performance can be improved by storing the reference of function in another variable i.e append_func = my_list.append and making a call using that variable append_func(i).

Which also proves, it is faster to call class's function stored in the variable as compared to directly making the function call using object of the class.

Thank You Stefan for bringing the last case in notice.

like image 88
Moinuddin Quadri Avatar answered Oct 08 '22 13:10

Moinuddin Quadri