Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How Python execute [list] * num? what's time complexity and memory complexity?

I am new to Python. I was recently confused by a syntax "[list] * k". I want to understand how Python actually executes it. Example:

>>> l = [1, 2] * 10
>>> l
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

Assume len(list) = n, when Python interprets it, I have following guesses with my limited knowledge.

  1. it uses list.extend(list) method. Thus it will take up O(n * k) space and use O(n * k) time.

  2. it only copies the reference of the original list and make k copy of it. Thus it will take up O(k) space and use O(k) time.

If my 2nd guess is the case, why and how following statement works?

>>> l[3] = 100
>>> l
[1, 2, 1, 100, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

Any official design doc, source code and PEP reference is strongly welcomed!

Follow up,

The source code link provided by @JoshLee and @MSeifert is very helpful in understanding the internal mechanism. See here at list_repeat

The following code snippet confirms the space complexity is O(n * k).

size = Py_SIZE(a) * n;

Also following code snippet confirms the time complexity is O(n * k).

p = np->ob_item;
items = a->ob_item;
for (i = 0; i < n; i++) {
    for (j = 0; j < Py_SIZE(a); j++) {
        *p = items[j];
        Py_INCREF(*p);
        p++;
    }
}
like image 947
airwindow Avatar asked Jan 06 '18 16:01

airwindow


1 Answers

A list is a shallow wrapper around an array of pointers to Python objects. So a Python list containing 1, 2, 3 would look like this:

l = [1, 2, 3]

enter image description here

If you multiply it by 5 the indices would still reference the same item, for example index 0, 3, 6, 9, 12 would store the same memory address (which explains why changing one item doesn't changed them all):

l2 = l * 5

enter image description here

However when the objects pointed to by the list were mutable, a change (as opposed to a replacement like in your example) to one of the items would be reflected in all of them:

>>> ls = [{1}, {2}, {3}]
>>> ls2 = ls*3
>>> ls[0].add(2)
>>> ls2
[{1, 2}, {2}, {3}, {1, 2}, {2}, {3}, {1, 2}, {2}, {3}]

So while this has a space and time complexity of O(n*k) it isn't as bad as it were if you would create a list containing pointers to distinct objects:

[int(i % 3 + 1) for i in range(15)]

enter image description here

Note that CPython actually reuses small integers - so the last images is inaccurate when dealing with integers like 1, 2, 3, .... It would be like the second image - but for other classes they (with a few more exceptions) would be distinct objects. However this affects only the factors, so would disappear in the O notation anyway, but it's nice to know.

The list-multiplication code is written in C (like a lot of the built-in functions and types) and you can see it here (CPython 3.6.4):

static PyObject *
list_repeat(PyListObject *a, Py_ssize_t n)
{
    Py_ssize_t i, j;
    Py_ssize_t size;
    PyListObject *np;
    PyObject **p, **items;
    PyObject *elem;
    if (n < 0)
        n = 0;
    if (n > 0 && Py_SIZE(a) > PY_SSIZE_T_MAX / n)
        return PyErr_NoMemory();

    /* Create an empty list: Space complexity: k * n */
    size = Py_SIZE(a) * n;
    if (size == 0)
        return PyList_New(0);
    np = (PyListObject *) PyList_New(size);
    if (np == NULL)
        return NULL;

    items = np->ob_item;

    /* Special case: The multiplied list contains one item. Time complexity: k */
    if (Py_SIZE(a) == 1) {
        elem = a->ob_item[0];
        for (i = 0; i < n; i++) {
            items[i] = elem;
            Py_INCREF(elem);
        }
        return (PyObject *) np;
    }

    /* General case: The multiplied list contains more than one item: Time complexity: n * k */
    p = np->ob_item;
    items = a->ob_item;
    for (i = 0; i < n; i++) {
        for (j = 0; j < Py_SIZE(a); j++) {
            *p = items[j];
            Py_INCREF(*p);
            p++;
        }
    }
    return (PyObject *) np;
}

The comments were added by me, just to explain the (undocumented) source code.

like image 199
MSeifert Avatar answered Sep 28 '22 05:09

MSeifert