
Is removing an element from the front of a list cheap in Python?

Tags: python, list, optimization, python-3.x

I am writing a program that does a lot of deletions at either the front or back of a list of data, never the middle.

I understand that deletion of the last element is cheap, but how about deletion of the first element? For example, let's say list A's address is at 4000, so element 0 is at 4000 and element 1 is at 4001.

Would deleting element 0 then just make the compiler put list A's address at 4001, or would it shift element 1 at 4001 to the location at 4000, and shift all other elements down by 1?

432

asked Feb 02 '17 22:02

Abid Rizvi

2 Answers

No, it isn't cheap. Removing an element from the front of the list (using list.pop(0), for example) is an O(N) operation and should be avoided. Similarly, inserting elements at the beginning (using list.insert(0, <value>)) is equally inefficient.

This is because, after the list is resized, its elements must be shifted. In CPython, the l.pop(0) case does this with memmove, while for l.insert(0, <value>) the shifting is implemented with a loop over the stored items.

Lists are built for fast random access and O(1) operations at their end.
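To see the difference described above on your own machine, here is a rough timing sketch. The helper names `drain_front` and `drain_back` are made up for this illustration, and the absolute numbers will vary with hardware and Python version; only the trend as `n` grows is of interest.

```python
from timeit import timeit


def drain_front(items):
    """Empty the list from the front; each pop(0) shifts the remaining items."""
    popped = 0
    while items:
        items.pop(0)
        popped += 1
    return popped


def drain_back(items):
    """Empty the list from the back; pop() needs no shifting."""
    popped = 0
    while items:
        items.pop()
        popped += 1
    return popped


if __name__ == "__main__":
    for n in (1_000, 10_000, 20_000):
        # Each timing also includes building the list, which is the same for both.
        front = timeit(lambda: drain_front(list(range(n))), number=1)
        back = timeit(lambda: drain_back(list(range(n))), number=1)
        print(f"n={n:>6}: pop(0) {front:.4f}s   pop() {back:.4f}s")
```

If the explanation above holds, the pop(0) column should grow much faster than the pop() column as n increases.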

Since you're doing this operation commonly, though, you should consider using a deque from the collections module (as @ayhan suggested in a comment). The docs on deque also highlight how list objects aren't suitable for these operations:

Though list objects support similar operations, they are optimized for fast fixed-length operations and incur O(n) memory movement costs for pop(0) and insert(0, v) operations which change both the size and position of the underlying data representation.

(Emphasis mine)

The deque data structure offers O(1) complexity at both ends: appendleft/popleft work on the beginning, and append/pop work on the end.
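As a minimal sketch of the four methods just mentioned (the variable names are only for illustration):

```python
from collections import deque

d = deque([1, 2, 3])

d.append(4)          # add at the end        -> deque([1, 2, 3, 4])
d.appendleft(0)      # add at the beginning  -> deque([0, 1, 2, 3, 4])

first = d.popleft()  # remove from the beginning -> 0
last = d.pop()       # remove from the end       -> 4

print(first, last, d)  # 0 4 deque([1, 2, 3])
```

For a program that only ever deletes at the front or back, swapping `list` for `deque` and `pop(0)` for `popleft()` is usually the only change needed.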

Of course, a deque incurs some extra space overhead at small sizes (due to its structure); this should generally be of no concern as the sizes grow (and, as @juanpa noted in a comment, the overhead doesn't always hold). Finally, as @ShadowRanger's insightful comment notes, with really small sequences the cost of popping or inserting at the front becomes so trivial that it is of no real concern.

So, in short: for lists with many items, use a deque if you need fast appends/pops at both ends; if you're randomly accessing items and appending to the end, use a list.

160

answered Sep 17 '22 00:09

Dimitris Fasarakis Hilliard

Removing elements from the front of a list in Python is O(n), while removing elements from the ends of a collections.deque is only O(1). A deque would therefore be great for your purpose. Note, however, that accessing or adding/removing from the middle of a deque is more costly than for a list.

The O(n) cost for removal arises because a list in CPython is simply implemented as an array of pointers, so your intuition about the shifting cost for each element is correct.
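The shifting can be pictured with a toy model in pure Python. This is only an illustration, not CPython's actual code (which does the work in C): `pop_front_by_shifting`, `slots`, and `length` are hypothetical names, with `slots` standing in for the contiguous pointer array.

```python
def pop_front_by_shifting(slots, length):
    """Remove slot 0 from a contiguous array by moving every later slot down.

    Returns (removed item, new length, number of slots moved).
    """
    if length == 0:
        raise IndexError("pop from empty array")
    removed = slots[0]
    moves = 0
    for i in range(1, length):
        slots[i - 1] = slots[i]  # slide each remaining element one place down
        moves += 1
    slots[length - 1] = None     # the last used slot is now free
    return removed, length - 1, moves


slots = ["a", "b", "c", "d"]
removed, length, moves = pop_front_by_shifting(slots, 4)
print(removed, length, moves, slots)  # a 3 3 ['b', 'c', 'd', None]
```

The number of moves is always one less than the number of stored elements, which is where the O(n) cost comes from.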

This can be seen on the TimeComplexity page of the Python Wiki.

answered Sep 18 '22 00:09

miradulo
