Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does computational time decrease when removing unnecessary items from a list in Python

The past days I've been trying get a better understanding of computational complexity and how to improve Python code. For this I have tried out different functions for calculating Fibonacci numbers, comparing how long the script runs if I make small changes.

I'm calculating Fibonacci numbers using a list, adding the sum of element -2 and -1 from the list.

I was puzzled to find that if I add a .pop() in the loop, deleting not needed elements of my list, my script runs significantly faster. I don't see why this is. Each step in the loop the computer does one more thing. So my untrained intuition suggests that this should increase computational time. Is 'looking up' the last element of the list so much slower when the list is very long?

Here is my code:

import time
import numpy as np

def fib_stack1(n):
    """ Original function """
    assert type(n) is int, 'Expected an integer as input.'
    if n < 2:
        return n
    else:
        stack = [0, 1]
        for i in range(n-1):
            stack.append(stack[-1] + stack[-2])
        return stack[-1]

def fib_stack2(n):
    """ Modified function """
    assert type(n) is int, 'Expected an integer as input.'
    if n < 2:
        return n
    else:
        stack = [0, 1]
        for i in range(n-1):
            stack.append(stack[-1] + stack[-2])
            ### CHANGE ###
            stack.pop(-3)
            ##############
        return stack[-1] 


rec1 = []
rec2 = []
for _ in range(10):
    t1 = time.time()
    fib_stack1(99999)  
    t2 = time.time()
    rec1.append(t2-t1)
    t1 = time.time()
    fib_stack2(99999)  
    t2 = time.time()
    rec2.append(t2-t1)
print(np.array(rec1).mean())
print(np.array(rec2).mean())

The output is the following:

# Original 
0.26878631115
# Modified
0.145034956932
like image 544
Jérôme Bau Avatar asked Aug 11 '17 20:08

Jérôme Bau


People also ask

How do I remove a specific element from a list?

There are three ways in which you can Remove elements from List: Using the remove() method. Using the list object's pop() method. Using the del operator.

How do I remove multiple strings from a list in Python?

Remove Multiple elements from list by index range using del. Suppose we want to remove multiple elements from a list by index range, then we can use del keyword i.e. It will delete the elements in list from index1 to index2 – 1.

How do you remove part of a list in Python?

The pop() method removes an element at a given index, and will also return the removed item. You can also use the del keyword in Python to remove an element or slice from a list.


2 Answers

Is 'looking up' the last element of the list so much slower when the list is very long?

No, list length has no effect on lookup speed. These are arraylists, not linked lists. It's more likely that this is related to memory allocation or cache performance. The garbage collector is also involved.

When you delete unneeded list elements, Python never has to allocate a bigger buffer for the list. It may also be able to reuse the memory allocated for the int objects instead of requesting more memory from the OS. Considering how huge your integers get, reusing their memory is a big deal. (The details of memory allocation depend on the Python version and the underlying standard library allocator. Python 2 has a free list for ints, but not longs; Python 3 has no free list for ints. Python itself makes no effort to reuse allocations for large objects, but the underlying allocator might be doing something.)

Additionally, when you have to keep allocating new integers, especially ones as huge as the 99999th Fibonacci number, you're not going to get much benefit out of your CPU's cache. Main memory access is much slower than cache.

Finally, the allocation pattern of your fib_stack1 (lots of allocations, not so many object refcounts falling to 0) triggers Python's cycle-detector system, a.k.a. the garbage collector, which takes time to run and touches a lot of memory that didn't need touching, hurting cache performance. Temporarily disabling the collector produces a notable speedup for fib_stack1 in my own tests, especially on Python 3.

like image 56
user2357112 supports Monica Avatar answered Oct 23 '22 03:10

user2357112 supports Monica


A list stores its elements in memory in a contiguous way.

So the append method of the list object needs to resize the allocated memory block from time to time (not every time append is called, fortunately)

Sometimes, the system is able to resize "in-place" (allocates further memory just after the current memory block), and sometimes not: it has to find a contiguous block of memory big enough to store the new list.

When the resize is not "in-place", existing data needs to be copied. (Note that doesn't happen when the size of the list decreases)

So if there are less elements in the list when it's copied, operations are faster.

Note that list.append remains extremely fast. Adding at the end of a list is the fastest way (compared to insert which has to move elements each time to free its "slot")

like image 31
Jean-François Fabre Avatar answered Oct 23 '22 03:10

Jean-François Fabre