Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why Python doesn't have a native Linked List implementation?

I have tried some quick experiment comparing the performance of native Python lists with linked lists implementations such as this.

The native python lists are always faster than the non native linked list in the cases where they should not be (according the theory).

from linkedlist import *
import time
a = LinkedList()
b = []
for i in range(1000000):
    b.append(i)
    a.add_first(i)
t0 = time.clock()
a.remove(10)
t1 = time.clock()
b.remove(10)
t2 = time.clock()
print t1-t0
print t2-t1

The results I have on the test above are:

  • native linked list = 2.00000000001e-05

  • python list = 0.005576

  • non native linked list = 3.90000000001e-05

So, I was wondering why Python don't have a native Linked List data structure. In the case of Python, it looks to me that it could be useful algorithmically speaking to have Linked List instead of the standard Lists to speed up some aspects of the standard library.

My understanding is that the List data structure is a key building block of the language and that it makes the code more maintainable and easily optimizable to focus on that very data structure.

Is there any other reason?

like image 406
lc2817 Avatar asked Dec 12 '12 07:12

lc2817


People also ask

Does Python have a linked list implementation?

To start with Python, it does not have a linked list library built into it like the classical programming languages. Python does have an inbuilt type list that works as a dynamic array but its operation shouldn't be confused with a typical function of a linked list.

Which programming language is best for linked list?

Linked List Utility. Lists are one of the most popular and efficient data structures, with implementation in every programming language like C, C++, Python, Java, and C#.

Why linked list is rarely used?

They are too low level. You already have the array type, which is mostly implemented in native code and useful in the general case. One benefit of linked lists is quick removal of elements.

Which is not suitable for implementation using a linked list?

Using linked list binary search will take O(n) time. So binary search is inefficient with linked list.


1 Answers

Indeed, Python doesn't have a native linked list implementation, and I would also love to know why.

One alternative you may want to consider is collections.deque (= "double-ended queue") which provides very fast constant-time O(1) insertion at both ends. In fact, for your specific example, a deque is a better choice than a linked list, since you don't need insertion in the middle.

However, in general, it is important to keep in mind that a deque is a different data structure than a linked list. A linked list also provides constant-time O(1) insertion in the middle, while a deque only provides linear-time O(n) insertion in the middle. In other words, the time it takes to insert an element in the middle of a deque is proportional to the current length n of the deque, and for a sufficiently large n, it will be slower than with a linked list.

Comparison between collections.deque and a true linked list

Internally, collections.deque is in fact implemented using a linked list. However, this is an implementation detail, and more importantly, it is implemented as a linked list of fixed-size blocks of elements, not as a linked list of individual elements.

This is why insertion in the middle of a collections.deque is O(n) rather than O(1): you still need to modify around half the memory of the whole deque to accommodate a new element N:

before inserting N: [ABCDE]⇄[FGHIJ]⇄[KLM__]
after inserting N:  [ABCDE]⇄[FGNHI]⇄[JKLM_]
changed memory:                ^^^   ^^^^

Instead, with a true linked list (= of individual elements), inserting a new element N in the middle simply consists of allocating a new node and updating the value of four pointers, which is an operation whose performance is independent of the current size of the linked list:

before inserting N: [A]⇄[B]⇄[C]⇄[D]⇄[E]⇄[F]⇄[G]⇄    [H]⇄[I]⇄[J]⇄[K]⇄[L]⇄[M]
after inserting N:  [A]⇄[B]⇄[C]⇄[D]⇄[E]⇄[F]⇄[G]⇄[N]⇄[H]⇄[I]⇄[J]⇄[K]⇄[L]⇄[M]
changed memory:                                ^ ^ ^                           

The trade-off is that the deque has better memory locality and requires less independent memory allocations. For example, inserting the new element N in the deque above didn't require any new memory allocation at all. This is why in practice, and especially if you're often inserting at the ends rather than in the middle, a deque is in fact a better choice than a linked list.

Note that while inserting elements in the middle of a deque is O(n), inserting new elements at the beginning or the end is O(1):

before:                [ABCDE]⇄[FGNHI]⇄[JKLM_]

prepending P:  [____P]⇄[ABCDE]⇄[FGNHI]⇄[JKLM_]
                    ^ ^

prepending Q:  [___QP]⇄[ABCDE]⇄[FGNHI]⇄[JKLM_]
                   ^

appending R:   [___QP]⇄[ABCDE]⇄[FGNHI]⇄[JKLMR]
                                            ^

appending S:   [___QP]⇄[ABCDE]⇄[FGNHI]⇄[JKLMR]⇄[S____]
                                              ^ ^

Caveats

Of course, for the linked list insertion to actually be O(1), this assumes that you already have a handle h to the node before or after which you want to insert your new node n. On a hypothetical implementation of LinkedList, this could look like:

n = linkedlist.insertbefore(h, "some value")

where:

type(h)     # => <class 'Node'>
type(n)     # => <class 'Node'>
n.value     # => "some value"
n.next == h # => True

If you do not have such handle, then a function like insert(i, x) will still be O(n), because finding the i-th element is O(n), even though the insert operation itself is O(1). Here is some hypothetical implementation of insert(i, x) on our hypothetical LinkedList:

def insert(self, i, x):
    node = self.node_from_index(i)       # Find the i-th node: O(n)
    return self.insertbefore(node, x)    # Insert and return new node: O(1)

This means that linked lists are only worth it for certain problems when you do keep those node handles around. It also makes the API a bit less convenient, and despite each operation being O(1) if you're careful, the constant is generally much bigger. So in practice, they are not that often useful, which may be why they're aren't built-in linked lists in Python.

like image 65
Boris Dalstein Avatar answered Oct 12 '22 22:10

Boris Dalstein