Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does python extend work?

If I have a list:

a = [1,2,3,4]

and then add 4 elements using extend

a.extend(range(5,10))

I get

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]

How does python do this? does it create a new list and copy the elements across or does it make 'a' bigger? just concerned that using extend will gobble up memory. I'am also asking as there is a comment in some code I'm revising that extending by 10000 x 100 is quicker than doing it in one block of 1000000.

like image 842
Martlark Avatar asked Apr 12 '26 16:04

Martlark


1 Answers

L.extend(M) is amortized O(n) where n=len(m), so excessive copying is not usually a problem. The times it can be a problem is when there is not enough space to extend into, so a copy is performed. This is a problem when the list is large and you have limits on how much time is acceptable for an individual extend call.

That is the point when you should look for a more efficient datastructure for your problem. I find it is rarely a problem in practice.

Here is the relevant code from CPython, you can see that extra space is allocated when the list is extended to avoid excessive copying

static PyObject *
listextend(PyListObject *self, PyObject *b)
{
    PyObject *it;      /* iter(v) */
    Py_ssize_t m;                  /* size of self */
    Py_ssize_t n;                  /* guess for size of b */
    Py_ssize_t mn;                 /* m + n */
    Py_ssize_t i;
    PyObject *(*iternext)(PyObject *);

    /* Special cases:
       1) lists and tuples which can use PySequence_Fast ops
       2) extending self to self requires making a copy first
    */
    if (PyList_CheckExact(b) || PyTuple_CheckExact(b) || (PyObject *)self == b) {
        PyObject **src, **dest;
        b = PySequence_Fast(b, "argument must be iterable");
        if (!b)
            return NULL;
        n = PySequence_Fast_GET_SIZE(b);
        if (n == 0) {
            /* short circuit when b is empty */
            Py_DECREF(b);
            Py_RETURN_NONE;
        }
        m = Py_SIZE(self);
        if (list_resize(self, m + n) == -1) {
            Py_DECREF(b);
            return NULL;
        }
        /* note that we may still have self == b here for the
         * situation a.extend(a), but the following code works
         * in that case too.  Just make sure to resize self
         * before calling PySequence_Fast_ITEMS.
         */
        /* populate the end of self with b's items */
        src = PySequence_Fast_ITEMS(b);
        dest = self->ob_item + m;
        for (i = 0; i < n; i++) {
            PyObject *o = src[i];
            Py_INCREF(o);
            dest[i] = o;
        }
        Py_DECREF(b);
        Py_RETURN_NONE;
    }

    it = PyObject_GetIter(b);
    if (it == NULL)
        return NULL;
    iternext = *it->ob_type->tp_iternext;

    /* Guess a result list size. */
    n = _PyObject_LengthHint(b, 8);
    if (n == -1) {
        Py_DECREF(it);
        return NULL;
    }
    m = Py_SIZE(self);
    mn = m + n;
    if (mn >= m) {
        /* Make room. */
        if (list_resize(self, mn) == -1)
            goto error;
        /* Make the list sane again. */
        Py_SIZE(self) = m;
    }
    /* Else m + n overflowed; on the chance that n lied, and there really
     * is enough room, ignore it.  If n was telling the truth, we'll
     * eventually run out of memory during the loop.
     */

    /* Run iterator to exhaustion. */
    for (;;) {
        PyObject *item = iternext(it);
        if (item == NULL) {
            if (PyErr_Occurred()) {
                if (PyErr_ExceptionMatches(PyExc_StopIteration))
                    PyErr_Clear();
                else
                    goto error;
            }
            break;
        }
        if (Py_SIZE(self) < self->allocated) {
            /* steals ref */
            PyList_SET_ITEM(self, Py_SIZE(self), item);
            ++Py_SIZE(self);
        }
        else {
            int status = app1(self, item);
            Py_DECREF(item);  /* append creates a new ref */
            if (status < 0)
                goto error;
        }
    }

    /* Cut back result list if initial guess was too large. */
    if (Py_SIZE(self) < self->allocated)
        list_resize(self, Py_SIZE(self));  /* shrinking can't fail */

    Py_DECREF(it);
    Py_RETURN_NONE;

  error:
    Py_DECREF(it);
    return NULL;
}

PyObject *
_PyList_Extend(PyListObject *self, PyObject *b)
{
    return listextend(self, b);
}
like image 122
John La Rooy Avatar answered Apr 15 '26 05:04

John La Rooy



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!