I noticed the following odd behaviour when timing enumerate with the default start parameter specified:
In [23]: %timeit enumerate([1, 2, 3, 4])
The slowest run took 7.18 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 511 ns per loop
In [24]: %timeit enumerate([1, 2, 3, 4], start=0)
The slowest run took 12.45 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 1.22 µs per loop
So, approximately a 2x slowdown for the case where start is specified.
The byte code issued for each case doesn't really indicate anything that would contribute to the significant difference in speed. Case in point: after examining the different calls with dis.dis, the additional instructions issued are:
18 LOAD_CONST 5 ('start')
21 LOAD_CONST 6 (0)
These, along with the CALL_FUNCTION having one keyword argument, are the only differences.
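A quick way to count those extra instructions yourself is to disassemble both call forms with the dis module (a small sketch; the exact opcode names vary between Python versions, but the keyword form always carries at least one extra instruction to push the keyword data before the call):

```python
import dis

def opnames(src):
    """Return the opcode names emitted for a compiled expression."""
    return [i.opname for i in dis.get_instructions(compile(src, "<timing>", "eval"))]

plain = opnames("enumerate([1, 2, 3, 4])")
with_kw = opnames("enumerate([1, 2, 3, 4], start=0)")

# The keyword form needs extra instructions to push 'start' and 0
# before the call -- the difference dis.dis shows above.
extra = len(with_kw) - len(plain)
print(extra)
```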
I tried tracing through the calls made in CPython's ceval with gdb, and both seem to use do_call in call_function and not some other optimization I could detect.
Now, I understand enumerate just creates an enumerate iterator, so we're dealing with object creation here (right?). I looked in Objects/enumobject.c trying to spot any differences when start was specified. The only thing that (I believe) differs is when start != NULL, in which case the following happens:
if (start != NULL) {
    start = PyNumber_Index(start);
    if (start == NULL) {
        Py_DECREF(en);
        return NULL;
    }
    assert(PyInt_Check(start) || PyLong_Check(start));
    en->en_index = PyInt_AsSsize_t(start);
    if (en->en_index == -1 && PyErr_Occurred()) {
        PyErr_Clear();
        en->en_index = PY_SSIZE_T_MAX;
        en->en_longindex = start;
    } else {
        en->en_longindex = NULL;
        Py_DECREF(start);
    }
}
Which doesn't look like something that would introduce a 2x slowdown. (I think; not sure.)
The previous code segments were executed on Python 3.5; similar results are present in 2.x too, though.
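As a side note on the en_longindex branch quoted above: it exists so that enumerate keeps working when start doesn't fit in a C ssize_t. A quick check:

```python
import sys

# When start exceeds sys.maxsize, enum_new stores it as a Python int
# (en_longindex) instead of a C ssize_t, and enumeration still works:
e = enumerate(["a", "b"], start=sys.maxsize + 1)
print(next(e))  # (sys.maxsize + 1, 'a')
```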
This is where I'm stuck and can't figure out where to look. This might just be a case of overhead from additional calls in the second case accumulating, but again, I'm not really sure. Does anybody know what might be the reason behind this?
One reason might be the call to PyNumber_Index when you specify a start, in the following part:
if (start != NULL) {
start = PyNumber_Index(start);
And if you take a look at the PyNumber_Index function in abstract.c you'll see the following comment at the top of the function:
/* Return a Python int from the object item.
Raise TypeError if the result is not an int
or if the object cannot be interpreted as an index.
*/
So this function has to check whether the object can be interpreted as an index, and raise the relevant error if it cannot. And if you look at the source carefully you'll see all of this checking and referencing, especially in the following part, which has to do a nested structure dereference in order to check the index type:
result = item->ob_type->tp_as_number->nb_index(item);
if (result &&
!PyInt_Check(result) && !PyLong_Check(result)) {
...
All of this takes some time to check and return the desired result.
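The Python-level counterpart of PyNumber_Index is operator.index, and it shows the same validation behaviour the C comment describes (a small illustration):

```python
import operator

# operator.index mirrors PyNumber_Index: it accepts anything that can be
# interpreted as an integer index and raises TypeError for everything else.
print(operator.index(0))     # plain ints pass straight through
print(operator.index(True))  # bool implements __index__, so it works too

try:
    operator.index(1.5)      # floats are not valid indices
except TypeError:
    print("TypeError for 1.5")
```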
But as @user2357112 mentioned, another and the most important reason is Python's keyword-argument matching.
If you time the function without a keyword argument you'll see the time difference decrease by approximately 2x:
~$ python -m timeit "enumerate([1, 2, 3, 4])"
1000000 loops, best of 3: 0.251 usec per loop
~$ python -m timeit "enumerate([1, 2, 3, 4],start=0)"
1000000 loops, best of 3: 0.431 usec per loop
~$ python -m timeit "enumerate([1, 2, 3, 4],0)"
1000000 loops, best of 3: 0.275 usec per loop
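The same comparison can be scripted with the timeit module. This is a sketch; the absolute numbers depend on your machine and Python version, but the ordering (no argument, then positional, then keyword) should generally be reproducible:

```python
import timeit

setup = "seq = [1, 2, 3, 4]"

# min() of several repeats is the conventional way to reduce timing noise.
plain = min(timeit.repeat("enumerate(seq)", setup=setup, number=100000, repeat=3))
pos = min(timeit.repeat("enumerate(seq, 0)", setup=setup, number=100000, repeat=3))
kw = min(timeit.repeat("enumerate(seq, start=0)", setup=setup, number=100000, repeat=3))

print(plain, pos, kw)
```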
The difference with the positional argument is:
>>> 0.251 - 0.275
-0.024
which seems to be due to PyNumber_Index.
It is probably just a combination of factors contributing to the overall slowdown.
When Python sees the CALL_FUNCTION opcode it will call call_function, as you already pointed out. After going through some if clauses, the call issued is x = do_call(func, pp_stack, na, nk);. Notice nk here, which holds the total count of keyword arguments (so, in the case of enumerate with start=0, nk == 1).
In do_call you'll see the following if clause:
if (nk > 0) {
kwdict = update_keyword_args(NULL, nk, pp_stack, func);
if (kwdict == NULL)
goto call_fail;
}
If the number of keyword arguments is not zero (nk > 0), update_keyword_args is called.
Now, update_keyword_args does what you would expect: if orig_kwdict is NULL (which it is; look at the call to update_keyword_args), it creates a new dictionary:
if (orig_kwdict == NULL)
kwdict = PyDict_New();
and then populates the dictionary with all values present in the value stack:
while (--nk >= 0) {
// copy from stack
These probably contribute significantly to the overall delay.
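A rough Python-level analogue of what update_keyword_args produces: keyword arguments reaching a C-level call get collected into a freshly created dict (the PyDict_New() above), and that allocation plus population is the extra work being discussed. A small sketch:

```python
# **kwargs below plays the role of the kwdict built by update_keyword_args:
# a brand-new dict populated from the keyword values on the stack.
def capture(*args, **kwargs):
    return args, kwargs

args, kwargs = capture([1, 2, 3, 4], start=0)
print(args, kwargs)
```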
The enumerate object: You're right about enum_new. If called with enumerate([1, 2, 3, 4], start=0), the variable start inside enum_new will have a value and therefore be != NULL. As a result the if clause will evaluate to true and the code inside it will execute, adding time to the call.
What is performed inside the if clause is not really heavy work, but it does contribute to the overall time required.
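For reference, the start-handling branch of enum_new can be paraphrased in Python roughly as follows (a hypothetical sketch using operator.index and sys.maxsize to stand in for PyNumber_Index and PY_SSIZE_T_MAX; the real C code also handles error states this sketch ignores):

```python
import operator
import sys

def normalise_start(start):
    # PyNumber_Index(start): validate/convert to an integer index.
    index = operator.index(start)
    if index > sys.maxsize:
        # PyInt_AsSsize_t overflowed: keep the Python int around
        # (en_index = PY_SSIZE_T_MAX, en_longindex = start).
        return sys.maxsize, index
    # Common case: store as a plain C ssize_t (en_longindex = NULL).
    return index, None

print(normalise_start(0))
print(normalise_start(sys.maxsize + 1))
```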
Additionally: you also have the two additional byte code instructions to consider. They might just be two, but they add to the overall time taken, because we're timing really quick things (in the range of ns). Again, insignificant from an overall standpoint, but parsing a call with keyword arguments requires, as before, a wee bit more time.
Finally: I might be missing some things, but overall these are some of the factors that, in conjunction, create overhead when creating a new enumerate object with start specified.