I did homework and I accidentally found a strange inconsistency in the speed of the algorithm. Here is 2 versions of code of same function bur with 1 difference: in first version i use 3 times array generator to filter some array and in second version i use 1 for loop with 3 if statements to do same filter work.
So, here is code of 1st version:
def kth_order_statistic(array, k):
pivot = (array[0] + array[len(array) - 1]) // 2
l = [x for x in array if x < pivot]
m = [x for x in array if x == pivot]
r = [x for x in array if x > pivot]
if k <= len(l):
return kth_order_statistic(l, k)
elif k > len(l) + len(m):
return kth_order_statistic(r, k - len(l) - len(m))
else:
return m[0]
And here code of 2nd version:
def kth_order_statistic2(array, k):
pivot = (array[0] + array[len(array) - 1]) // 2
l = []
m = []
r = []
for x in array:
if x < pivot:
l.append(x)
elif x > pivot:
r.append(x)
else:
m.append(x)
if k <= len(l):
return kth_order_statistic2(l, k)
elif k > len(l) + len(m):
return kth_order_statistic2(r, k - len(l) - len(m))
else:
return m[0]
IPython output for 1st version:
In [4]: %%timeit
...: A = range(100000)
...: shuffle(A)
...: k = randint(1, len(A)-1)
...: order_statisctic(A, k)
...:
10 loops, best of 3: 120 ms per loop
And for 2nd version:
In [5]: %%timeit
...: A = range(100000)
...: shuffle(A)
...: k = randint(1, len(A)-1)
...: kth_order_statistic2(A, k)
...:
10 loops, best of 3: 169 ms per loop
So why first version is faster than second? I also made third version wich using filter() function instead of array generator and it was slower than second version (it got 218 ms per loop)
Using simple for
is faster than list comprehesion
. It is almost 2 times faster. Check below results:
Using list comprehension
: 58 usec
moin@moin-pc:~$ python -m timeit "[i for i in range(1000)]"
10000 loops, best of 3: 58 usec per loop
Using for
loop: 37.1 usec
moin@moin-pc:~$ python -m timeit "for i in range(1000): i"
10000 loops, best of 3: 37.1 usec per loop
But in your case, for
is taking more time than list comprehension not because YOUR for loop is slow. But because of .append()
you are using within the code.
With append()
in for
loop`: 114 usec
moin@moin-pc:~$ python -m timeit "my_list = []" "for i in range(1000): my_list.append(i)"
10000 loops, best of 3: 114 usec per loop
Which clearly shows that it is .append()
which is taking twice the time taken by for
loop.
However, on storing the "list.append" in different variable
: 69.3 usec
moin@moin-pc:~$ python -m timeit "my_list = []; append = my_list.append" "for i in range(1000): append(i)"
10000 loops, best of 3: 69.3 usec per loop
There is a huge improvement in the performance as compared to the last case in above comparisons, and result is quite comparable to that of list comprehension
. That means, instead of calling my_list.append()
each time, the performance can be improved by storing the reference of function in another variable i.e append_func = my_list.append
and making a call using that variable append_func(i)
.
Which also proves, it is faster to call class's function stored in the variable as compared to directly making the function call using object of the class.
Thank You Stefan for bringing the last case in notice.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With