Does python have a list constructor like the cons
in OCaml (::
) (or lisp) which takes a head
element and tail
list, and returns a new list head::tail
?
I searched for python list constructors, and ended up finding something else about __init__
. see e.g. Creating a list in Python- something sneaky going on?
To clarify, what I am looking for is the inverse of the following list decomposition in Python found in this question:
head, *tail = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
This gives:
>>>head
1
>>> tail
[1, 2, 3, 5, 8, 13, 21, 34, 55]
I am looking for a list constructor e.g. cons
or ::
such that
head :: tail => original list
In practical terms there's no difference. I'd expect [] to be faster, because it does not involve a global lookup followed by a function call. Other than that, it's the same.
List. Lists are used to store multiple items in a single variable. Lists are one of 4 built-in data types in Python used to store collections of data, the other 3 are Tuple, Set, and Dictionary, all with different qualities and usage.
The string class is available by default in python, so you do not need an import statement to use the object interface to strings. Lists are objects too. An important method for lists is append(item).
list is not a keyword but a built-in type, as are str , set , dict , unicode , int , float , etc. There is no point in reserving each and every possible built-in type; python is a dynamic language and if you want to replace the built-in types with a local name that shadows it, you should be able to.
To answer your question, there is no direct equivalent to the cons that you would usually find in what are called "functional" languages (lisp, OCaml, Haskell, etc.)
That's because there are two competing models to represent lists of elements in programming languages.
The one you seem to be familiar with is called a linked list.
A linked list is composed of cons-cells that each contain two references :
Because lists are rarely infinite, the last const cell would usually point towards a special value, the empty list, sometimes called nil.
If you wanted to save the list in a variable for future reference, you would keep a reference towards the first cons-cell.
Here's a visual representation from Wikipedia.
In this model, every list is necessarily constructed by adding elements to the front by creating a new cons-cell, pointing to the new element as its head and to the previously constructed sublist as its tail. This is why the cons operator is sometimes called the list constructor.
This is the model usually preferred by imperative languages such as Python. In this model, a list is just a reference to a range in memory.
Let's suppose you create a list just like so :
l = [1, 2, 3]
Whenever you create a list, Python will assign to it a small range of memory in which to store the elements, with a bit of extra space just in case you want to add elements later. To store it, you would simply store a reference to the first element, and to the size of the memory range, just like so :
l <-- your variable
| ___ ___ ___ ___ ___ ___ ___ ___ ___
|-> | | | | | | | | | |
| 1 | 2 | 3 | | | | | | |
|___|___|___|___|___|___|___|___|___|
If you decide to add an element at the end of the list, you can use append
l.append(4)
Resulting in the following list :
___ ___ ___ ___ ___ ___ ___ ___ ___
| | | | | | | | | |
| 1 | 2 | 3 | 4 | | | | | |
|___|___|___|___|___|___|___|___|___|
Now, let's say you forgot an initial 0, and you now wish to add it to the front. You could really well use the insert method (with an insert position of 0) :
l.insert(0, 0)
But there's no space at the beginning of the list ! Python has no choice but to take each and every element, and copy them one at a time in the spot on the direct right :
___ ___ ___ ___ ___ ___ ___ ___ ___
| | | | | | | | | |
| 1 | 2 | 3 | 4 | | | | | |
|___|___|___|___|___|___|___|___|___|
| | |__ |___
| |___ | | First, Python has to copy the four elements
|___ | | | one space to the right
___ _\/ _\/ \/_ _\/ ___ ___ ___ ___
| | | | | | | | | |
| | 1 | 2 | 3 | 4 | | | | |
|___|___|___|___|___|___|___|___|___|
Only then can it insert the 0 at the beginning
___ ___ ___ ___ ___ ___ ___ ___ ___
| | | | | | | | | |
| 0 | 1 | 2 | 3 | | | | | |
|___|___|___|___|___|___|___|___|___|
It may not seem like much for such a small array, but imagine your array is much bigger, and you repeat this operation many times : you would spend so much time building your list !
That is why you won't find a list constructor in languages using arrays for their lists like Python.
You may now be wondering why different languages would prefer different list models, and if one of the two models is superior.
It's because these two data structures have different performance in different contexts. Two examples :
Let's suppose you want to get the fifth element of the list.
In the linked list you would need to get :
You would thus have to go through 5 references !
With an array, that's much simpler : you know the reference of the first element. You just need to access the reference 4 spots on the right, given the elements are all in a contiguous memory range !
If you need to access random elements of a very large list a great many times, arrays are thus much better.
Imagine you now want to insert an element in the middle.
With a linked list :
With an array, just as when adding an element in the middle, you will need to copy every element to the right of the insertion point one space to the right !
In this situation, that's the linked list that is clearly superior.
Some are suggesting that you do:
a = 1
b = [2, 3, 4, 5]
c = [a] + b
Which will work provided that b
is of type list
. However, often you'll find yourself working with an object that is either a list or a tuple or (insert your favorite iterable object here). In that case, it might be more worth your while to do:
a = 1
b = (2, 3, 4, 5)
c = [a]
c.extend(b)
This makes the whole thing agnostic to the type of b
(this allows your code to be more "ducky", which can be kind of nice). . .
Of course, there are other options as well. . . For example, You could choose to reach for itertools
:
import itertools
a = 1
b = [2, 3, 4, 5]
lazy_c = itertools.chain([a], b)
Again we have the benefit of not caring what type of iterable b
is, and we pick up a side benefit of iterating lazily -- We won't make a new list or tuple. We'll just make an object that will yield the same thing when you're iterating over it. This can save memory (and sometimes cycles for your CPU), but can also have un-intended consequences if b
is also a sort of generator-like object that is being iterated over elsewhere at the same time (which doesn't happen by accident very often).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With