CPython implements a list as a contiguous array of references to other objects. It keeps a pointer to this array and the array's length in a list head structure. This makes indexing a list independent of the size of the list or the value of the index.
There are two general-purpose List implementations: ArrayList and LinkedList. Most of the time, you'll probably use ArrayList, which offers constant-time positional access and is just plain fast. It does not have to allocate a node object for each element in the List, and it can take advantage of System.
A list is a data structure in Python that is a mutable, or changeable, ordered sequence of elements. Each element or value that is inside of a list is called an item. Just as strings are defined as characters between quotes, lists are defined by having values between square brackets [ ] .
Java LinkedList internal implementation: the linkLast() method is used to insert an element as the last element of the list. In that case, the node that is currently the last node of the linked list becomes the second-to-last node.
The C code is pretty simple, actually. Expanding one macro and pruning some irrelevant comments, the basic structure is in listobject.h, which defines a list as:
typedef struct {
    PyObject_HEAD
    Py_ssize_t ob_size;
    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;
    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size
     *     ob_item == NULL implies ob_size == allocated == 0
     */
    Py_ssize_t allocated;
} PyListObject;
PyObject_HEAD contains a reference count and a type identifier. So, it's a vector/array that overallocates. The code for resizing such an array when it's full is in listobject.c. It doesn't actually double the array, but grows by allocating
    new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
    new_allocated += newsize;
to the capacity each time, where newsize is the requested size (not necessarily allocated + 1 because you can extend by an arbitrary number of elements instead of append'ing them one by one).
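To get a feel for what that growth rule does, here is a small Python sketch that replays the two C lines quoted above for a run of one-by-one appends. The helper names (grow, capacities_after_appends) are made up for illustration, and the formula is only the one quoted in this answer; other CPython versions use a slightly different expression.

```python
def grow(newsize):
    """Capacity chosen when the array must grow to hold `newsize` items.

    Mirrors the two C lines quoted above; this is not the real CPython code.
    """
    new_allocated = (newsize >> 3) + (3 if newsize < 9 else 6)
    new_allocated += newsize
    return new_allocated


def capacities_after_appends(n):
    """Return each capacity chosen while appending n items one at a time."""
    allocated = 0
    seen = []
    for newsize in range(1, n + 1):
        if newsize > allocated:  # array is full, so it has to be regrown
            allocated = grow(newsize)
            seen.append(allocated)
    return seen


print(capacities_after_appends(80))
# prints [4, 8, 16, 25, 35, 46, 58, 72, 88]
```

Only 9 regrowths are needed for 80 appends under this rule, which is why append stays cheap on average even though the array is not doubled.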
See also the Python FAQ.
It's a dynamic array. Practical proof: Indexing takes (of course with extremely small differences (0.0013 µsecs!)) the same time regardless of index:
...>python -m timeit --setup="x = [None]*1000" "x[500]"
10000000 loops, best of 3: 0.0579 usec per loop
...>python -m timeit --setup="x = [None]*1000" "x[0]"
10000000 loops, best of 3: 0.0566 usec per loop
I would be astounded if IronPython or Jython used linked lists - they would ruin the performance of many many widely-used libraries built on the assumption that lists are dynamic arrays.
I would suggest Laurent Luce's article "Python list implementation". It was really useful to me because the author explains how the list is implemented in CPython and uses excellent diagrams for this purpose.
List object C structure
A list object in CPython is represented by the following C structure. ob_item is a list of pointers to the list elements. allocated is the number of slots allocated in memory.
typedef struct {
    PyObject_VAR_HEAD
    PyObject **ob_item;
    Py_ssize_t allocated;
} PyListObject;
It is important to notice the difference between allocated slots and the size of the list. The size of a list is the same as len(l). The number of allocated slots is what has been allocated in memory. Often, you will see that allocated can be greater than size. This is to avoid needing to call realloc each time a new element is appended to the list.
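One rough way to watch the gap between allocated slots and size from Python itself is sys.getsizeof, which on CPython reports the memory held by the list object including its spare slots. The sketch below records the lengths at which that number changes; resize_points is a made-up helper name, and the exact lengths it prints depend on the interpreter version, so no output is shown.

```python
import sys


def resize_points(n):
    """Return the list lengths at which sys.getsizeof changed during n appends."""
    lst = []
    last = sys.getsizeof(lst)
    points = []
    for _ in range(n):
        lst.append(None)
        size = sys.getsizeof(lst)
        if size != last:  # the reported size moved, so the slot array was reallocated
            points.append(len(lst))
            last = size
    return points


points = resize_points(64)
print(points)  # far fewer entries than 64; exact values vary by version
```

If every append triggered a reallocation, the list would contain all 64 lengths; seeing only a handful is the over-allocation described above.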
...
Append
We append an integer to the list: l.append(1). What happens?
We continue by adding one more element: l.append(2). list_resize is called with n+1 = 2 but because the allocated size is 4, there is no need to allocate more memory. The same thing happens when we add 2 more integers: l.append(3), l.append(4). The following diagram shows what we have so far.
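The append walkthrough can be mimicked with a toy Python model that tracks size and allocated separately. This is only a sketch: SlotArray and its attribute names are invented here, and the growth expression is the one quoted earlier on this page, not necessarily what your CPython version uses.

```python
class SlotArray:
    """Toy model of a list: a slot array plus 'size' and 'allocated' counters."""

    def __init__(self):
        self.slots = []      # stands in for ob_item
        self.size = 0        # number of slots in use, i.e. len(l)
        self.allocated = 0   # number of slots reserved in memory

    def _resize(self, newsize):
        # Only grow when the requested size no longer fits.
        if newsize > self.allocated:
            new_allocated = (newsize >> 3) + (3 if newsize < 9 else 6) + newsize
            self.slots.extend([None] * (new_allocated - self.allocated))
            self.allocated = new_allocated
        self.size = newsize

    def append(self, item):
        n = self.size
        self._resize(n + 1)
        self.slots[n] = item


l = SlotArray()
l.append(1)
print(l.size, l.allocated)  # prints 1 4
for value in (2, 3, 4):
    l.append(value)          # fits in the 4 slots already allocated
print(l.size, l.allocated)  # prints 4 4
```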
...
Insert
Let’s insert a new integer (5) at position 1: l.insert(1,5) and look at what happens internally.
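The details are elided here, but the general idea of inserting into an array of slots can be sketched in Python: make sure there is a spare slot, shift everything from the insert position one slot to the right, then store the new reference. insert_into_slots is a hypothetical helper, and the starting state of 8 allocated slots assumes the list was already regrown using the formula quoted earlier.

```python
def insert_into_slots(slots, size, where, item):
    """Insert item at index `where` in a slot array holding `size` items.

    Assumes len(slots) > size, i.e. at least one spare slot exists.
    Returns the new size.
    """
    # Walk backwards so no element is overwritten before it is moved.
    for i in range(size - 1, where - 1, -1):
        slots[i + 1] = slots[i]
    slots[where] = item
    return size + 1


slots = [1, 2, 3, 4, None, None, None, None]  # 8 allocated, 4 in use
size = insert_into_slots(slots, 4, 1, 5)
print(slots[:size])  # prints [1, 5, 2, 3, 4]
```

The shifting loop is why inserting near the front costs time proportional to the number of elements after the insert position, unlike append.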
...
Pop
When you pop the last element: l.pop(), listpop() is called. list_resize is called inside listpop() and if the new size is less than half of the allocated size then the list is shrunk. You can observe that slot 4 still points to the integer but the important thing is the size of the list which is now 4. Let’s pop one more element. In list_resize(), size - 1 = 4 - 1 = 3 is less than half of the allocated slots so the list is shrunk to 6 slots and the new size of the list is now 3. You can observe that slots 3 and 4 still point to some integers but the important thing is the size of the list which is now 3.
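The shrink rule in the pop walkthrough can be checked with a few lines of Python. This is a sketch under two assumptions: the array is kept when the new size still fits and is at least half of allocated, and otherwise the capacity is recomputed with the growth formula quoted earlier on this page. The function name resize is invented, and edge cases such as an empty list are ignored.

```python
def resize(allocated, newsize):
    """Return the capacity after resizing to `newsize` items."""
    # Keep the current array if newsize fits and is at least half of it.
    if allocated >= newsize and newsize >= (allocated >> 1):
        return allocated
    # Otherwise pick a new capacity with the formula quoted earlier.
    return (newsize >> 3) + (3 if newsize < 9 else 6) + newsize


allocated, size = 8, 5          # state after the insert in the walkthrough

size -= 1                       # first pop
allocated = resize(allocated, size)
print(size, allocated)          # prints 4 8

size -= 1                       # second pop
allocated = resize(allocated, size)
print(size, allocated)          # prints 3 6
```

The first pop leaves the array alone because 4 is not less than half of 8; the second drops to 3, which is, so the array shrinks to 6 slots as described.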
...
Remove
The Python list object has a method to remove a specific element: l.remove(5).
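The article's explanation is cut off here, so the following is only a sketch of how removal from an array of slots can work in general, not a transcription of the CPython code: scan for the first matching element, then shift the tail one slot to the left. remove_from_slots is a hypothetical helper name.

```python
def remove_from_slots(slots, size, value):
    """Remove the first item equal to `value`; return the new size."""
    for i in range(size):
        if slots[i] == value:
            # Close the gap by moving every later item one slot left.
            for j in range(i, size - 1):
                slots[j] = slots[j + 1]
            slots[size - 1] = None  # clear the slot that is no longer in use
            return size - 1
    raise ValueError("value not in list")


slots = [1, 5, 2, 3, 4, None, None, None]
size = remove_from_slots(slots, 5, 5)
print(slots[:size])  # prints [1, 2, 3, 4]
```

Both the search and the shift are linear in the list length, which matches the usual O(n) cost quoted for list.remove.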
This is implementation dependent, but IIRC:
ArrayList
Thus they all have O(1) random access.
In CPython, lists are arrays of pointers. Other implementations of Python may choose to store them in different ways.
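A quick way to see the "array of pointers" behaviour from Python is to put the same object into a list twice. The list stores two references to one object rather than two copies, so a change made through one name shows up in both slots. The variable names are just for illustration.

```python
inner = []
outer = [inner, inner]       # two slots, both referring to the same object

inner.append(1)              # mutate the object through its original name

print(outer)                 # prints [[1], [1]]
print(outer[0] is outer[1])  # prints True
```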