Why Python doesn't have a native Linked List implementation?

Tags:

I have tried some quick experiment comparing the performance of native Python lists with linked lists implementations such as this.

The native python lists are always faster than the non native linked list in the cases where they should not be (according the theory).

from linkedlist import *
import time
a = LinkedList()
b = []
for i in range(1000000):
    b.append(i)
    a.add_first(i)
t0 = time.clock()
a.remove(10)
t1 = time.clock()
b.remove(10)
t2 = time.clock()
print t1-t0
print t2-t1

The results I have on the test above are:

native linked list = 2.00000000001e-05
python list = 0.005576
non native linked list = 3.90000000001e-05

So, I was wondering why Python don't have a native Linked List data structure. In the case of Python, it looks to me that it could be useful algorithmically speaking to have Linked List instead of the standard Lists to speed up some aspects of the standard library.

My understanding is that the List data structure is a key building block of the language and that it makes the code more maintainable and easily optimizable to focus on that very data structure.

Is there any other reason?

406

asked Dec 12 '12 07:12

lc2817

1 Answers

Indeed, Python doesn't have a native linked list implementation, and I would also love to know why.

One alternative you may want to consider is collections.deque (= "double-ended queue") which provides very fast constant-time O(1) insertion at both ends. In fact, for your specific example, a deque is a better choice than a linked list, since you don't need insertion in the middle.

However, in general, it is important to keep in mind that a deque is a different data structure than a linked list. A linked list also provides constant-time O(1) insertion in the middle, while a deque only provides linear-time O(n) insertion in the middle. In other words, the time it takes to insert an element in the middle of a deque is proportional to the current length n of the deque, and for a sufficiently large n, it will be slower than with a linked list.

Comparison between collections.deque and a true linked list

Internally, collections.deque is in fact implemented using a linked list. However, this is an implementation detail, and more importantly, it is implemented as a linked list of fixed-size blocks of elements, not as a linked list of individual elements.

This is why insertion in the middle of a collections.deque is O(n) rather than O(1): you still need to modify around half the memory of the whole deque to accommodate a new element N:

before inserting N: [ABCDE]⇄[FGHIJ]⇄[KLM__]
after inserting N:  [ABCDE]⇄[FGNHI]⇄[JKLM_]
changed memory:                ^^^   ^^^^

Instead, with a true linked list (= of individual elements), inserting a new element N in the middle simply consists of allocating a new node and updating the value of four pointers, which is an operation whose performance is independent of the current size of the linked list:

before inserting N: [A]⇄[B]⇄[C]⇄[D]⇄[E]⇄[F]⇄[G]⇄    [H]⇄[I]⇄[J]⇄[K]⇄[L]⇄[M]
after inserting N:  [A]⇄[B]⇄[C]⇄[D]⇄[E]⇄[F]⇄[G]⇄[N]⇄[H]⇄[I]⇄[J]⇄[K]⇄[L]⇄[M]
changed memory:                                ^ ^ ^

The trade-off is that the deque has better memory locality and requires less independent memory allocations. For example, inserting the new element N in the deque above didn't require any new memory allocation at all. This is why in practice, and especially if you're often inserting at the ends rather than in the middle, a deque is in fact a better choice than a linked list.

Note that while inserting elements in the middle of a deque is O(n), inserting new elements at the beginning or the end is O(1):

before:                [ABCDE]⇄[FGNHI]⇄[JKLM_]

prepending P:  [____P]⇄[ABCDE]⇄[FGNHI]⇄[JKLM_]
                    ^ ^

prepending Q:  [___QP]⇄[ABCDE]⇄[FGNHI]⇄[JKLM_]
                   ^

appending R:   [___QP]⇄[ABCDE]⇄[FGNHI]⇄[JKLMR]
                                            ^

appending S:   [___QP]⇄[ABCDE]⇄[FGNHI]⇄[JKLMR]⇄[S____]
                                              ^ ^

Caveats

Of course, for the linked list insertion to actually be O(1), this assumes that you already have a handle h to the node before or after which you want to insert your new node n. On a hypothetical implementation of LinkedList, this could look like:

n = linkedlist.insertbefore(h, "some value")

where:

type(h)     # => <class 'Node'>
type(n)     # => <class 'Node'>
n.value     # => "some value"
n.next == h # => True

If you do not have such handle, then a function like insert(i, x) will still be O(n), because finding the i-th element is O(n), even though the insert operation itself is O(1). Here is some hypothetical implementation of insert(i, x) on our hypothetical LinkedList:

def insert(self, i, x):
    node = self.node_from_index(i)       # Find the i-th node: O(n)
    return self.insertbefore(node, x)    # Insert and return new node: O(1)

This means that linked lists are only worth it for certain problems when you do keep those node handles around. It also makes the API a bit less convenient, and despite each operation being O(1) if you're careful, the constant is generally much bigger. So in practice, they are not that often useful, which may be why they're aren't built-in linked lists in Python.

answered Oct 12 '22 22:10

Boris Dalstein

Related questions
                            
                                Triple nested lists Python
                            
                                Adding a background color to Cell OpenPyXL
                            
                                django_rest_swagger - 'staticfiles' is not a registered tag library. Must be one of:
                            
                                How can I remove the fragment identifier from a URL?
                            
                                Convert spaces to %20 in list
                            
                                Missing Python.h while trying to compile a C extension module
                            
                                Python assigning multiple variables to same list value?
                            
                                Python reversing a string using recursion
                            
                                What are the limits of Python? [closed]
                            
                                Tensorflow error : DLL load failed: The specified procedure could not be found
                            
                                python remove duplicates from 2 lists
                            
                                python opencv TypeError: Layout of the output array incompatible with cv::Mat
                            
                                Pythonic way to implement a tokenizer
                            
                                python rtype docstring/restructured text for class factories/selectors
                            
                                Application-scope variables in Flask?
                            
                                Python setup.py to run shell script
                            
                                Multiple instances of celerybeat for autoscaled django app on elasticbeanstalk
                            
                                How to print a unicode string in python in Windows console [duplicate]
                            
                                detect pip in setup.py
                            
                                django/celery - celery status: Error: No nodes replied within time constraint

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why Python doesn't have a native Linked List implementation?

Tags:

python

arrays

algorithm

linked-list