Python iterators – how to dynamically assign self.next within a new style class?

2 Answers

What you're trying to do makes sense, but there's something evil going on inside Python here.

class foo(object):
    c = 0
    def __init__(self):
        self.next = self.next2

    def __iter__(self):
        return self

    def next(self):
        if self.c == 5: raise StopIteration
        self.c += 1
        return 1

    def next2(self):
        if self.c == 5: raise StopIteration
        self.c += 1
        return 2

it = iter(foo())
# Outputs: <bound method foo.next2 of <__main__.foo object at 0xb7d5030c>>
print it.next
# 2
print it.next()
# 1?!
for x in it:
    print x

foo() is an iterator which replaces its next method on the fly, something that is perfectly legal anywhere else in Python. The iterator we create, it, has the method we expect: it.next is next2. When we use the iterator directly, by calling it.next(), we get 2. Yet when we use it in a for loop, we get the original next, which we've clearly overwritten.

I'm not familiar with Python internals, but it seems like the class's "next" method is being cached in tp_iternext (http://docs.python.org/c-api/typeobj.html#tp_iternext), and then it's not updated when the instance is changed.
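
One way to probe that guess is to compare an assignment on the instance with one on the class. The sketch below is not from the original answer: Counter and next_two are hypothetical names, and it uses the Python 3 spelling __next__ with a next alias so that it should also run on Python 2.6+. If the lookup goes through the type, the instance-level assignment should be ignored while the class-level one takes effect.

```python
class Counter(object):
    """Hypothetical iterator that yields 1 until `limit` items are produced."""

    def __init__(self, limit):
        self.count = 0
        self.limit = limit

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.limit:
            raise StopIteration
        self.count += 1
        return 1

    next = __next__  # Python 2 spelling of the same method


def next_two(self):
    """Replacement method that yields 2 instead of 1."""
    if self.count >= self.limit:
        raise StopIteration
    self.count += 1
    return 2


it = Counter(3)

# Instance-level assignment: stored in it.__dict__, not seen by next() or for.
it.__next__ = lambda: 99
it.next = it.__next__
first = next(it)

# Class-level assignment: goes through the type, so iteration picks it up.
Counter.__next__ = next_two
Counter.next = next_two
rest = list(it)

print(first)
print(rest)
```

Here first comes from the method the class was created with, and rest comes from next_two, even though the instance still carries its own attribute the whole time.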

This is definitely a Python bug. Maybe this is described in the generator PEPs, but it's not in the core Python documentation, and it's completely inconsistent with normal Python behavior.

You could work around this by keeping the original next function, and wrapping it explicitly:

class IteratorWrapper2(object):
    def __init__(self, otheriter):
        self.wrapped_iter_next = otheriter.next
    def __iter__(self):
        return self
    def next(self):
        return self.wrapped_iter_next()

for j in IteratorWrapper2(iter([1, 2, 3])):
    print j

... but that's obviously less efficient, and you should not have to do that.
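
Another option, sketched here under the assumption that you control the iterator class yourself, is to keep one fixed next on the class and have it delegate to an ordinary instance attribute that you can reassign freely. SwitchableIterator, _next_impl, and use_two are hypothetical names; the method is spelled __next__ with a next alias so the sketch should run on both Python 2.6+ and Python 3.

```python
class SwitchableIterator(object):
    """Hypothetical iterator whose per-item behaviour can be swapped at runtime."""

    def __init__(self, limit=5):
        self.count = 0
        self.limit = limit
        self._next_impl = self._emit_one  # current strategy, a plain attribute

    def __iter__(self):
        return self

    def __next__(self):
        # Defined on the class, so for loops and next() always reach it;
        # the value itself comes from the swappable instance attribute.
        if self.count >= self.limit:
            raise StopIteration
        self.count += 1
        return self._next_impl()

    next = __next__  # Python 2 spelling of the same method

    def _emit_one(self):
        return 1

    def _emit_two(self):
        return 2

    def use_two(self):
        """Switch the strategy used for all remaining items."""
        self._next_impl = self._emit_two


it = SwitchableIterator(limit=4)
first = next(it)      # produced by _emit_one
it.use_two()
rest = [x for x in it]  # remaining items produced by _emit_two
print(first)
print(rest)
```

The cost is one extra attribute lookup and call per item, which is the same kind of overhead as the wrapper above, but the switching logic stays inside the class.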

answered Nov 15 '22 17:11

Glenn Maynard

There are a bunch of places where CPython takes surprising shortcuts based on class properties instead of instance properties. This is one of those places.

Here is a simple example that demonstrates the issue:

class DynamicNext(object):
    def __init__(self):
        self.next = lambda: 42

And here's what happens:

>>> instance = DynamicNext()
>>> next(instance)
…
TypeError: DynamicNext object is not an iterator
>>>

Now, digging into the CPython source code (from 2.7.2), here's the implementation of the next() builtin:

static PyObject *
builtin_next(PyObject *self, PyObject *args)
{
    …
    if (!PyIter_Check(it)) {
        PyErr_Format(PyExc_TypeError,
            "%.200s object is not an iterator",
            it->ob_type->tp_name);
        return NULL;
    }
    …
}

And here's the implementation of PyIter_Check:

#define PyIter_Check(obj) \
    (PyType_HasFeature((obj)->ob_type, Py_TPFLAGS_HAVE_ITER) && \
     (obj)->ob_type->tp_iternext != NULL && \
     (obj)->ob_type->tp_iternext != &_PyObject_NextNotImplemented)

The first line, PyType_HasFeature(…), is, after expanding all the constants and macros, equivalent to instance.__class__.__flags__ & 1L<<7 != 0 (in CPython 2.7's object.h, Py_TPFLAGS_HAVE_ITER is 1L<<7):

>>> instance.__class__.__flags__ & 1L<<7 != 0
True

So that check obviously isn't failing… Which must mean that the next check — (obj)->ob_type->tp_iternext != NULL — is failing.

In Python, this line is roughly (roughly!) equivalent to hasattr(type(instance), "next"):

>>> type(instance)
__main__.DynamicNext
>>> hasattr(type(instance), "next")
False

Which obviously fails because the DynamicNext type doesn't have a next method — only instances of that type do.
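
The instance-versus-type distinction can be checked directly. This is a minimal sketch rather than anything from the CPython source: it uses the Python 3 spelling __next__ (with next set as well, so it should also run on Python 2.6+), and the variable names on_instance, on_type, and error are only for illustration.

```python
class DynamicNext(object):
    def __init__(self):
        # Both spellings are stored on the instance only, never on the type.
        self.__next__ = lambda: 42
        self.next = self.__next__


instance = DynamicNext()

# Ordinary attribute access finds the method on the instance ...
on_instance = hasattr(instance, "__next__")

# ... but the type, which is what the next() builtin inspects, has nothing.
on_type = hasattr(type(instance), "__next__")

try:
    next(instance)
    error = None
except TypeError as exc:
    error = str(exc)

print(on_instance)
print(on_type)
print(error)
```

Calling instance.__next__() by hand still returns 42; only the implicit lookup performed by next() and for loops is affected.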

Now, my CPython foo is weak, so I'm going to have to start making some educated guesses here… But I believe they are accurate.

When a CPython type is created (that is, when the interpreter first evaluates the class block and the class' metaclass' __new__ method is called), the values on the type's PyTypeObject struct are initialized… So if, when the DynamicNext type is created, no next method exists, the tp_iternext field will be set to NULL, causing PyIter_Check to return false.
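
If the slot is filled in when the type is created, then one way to choose the method dynamically is to build the type dynamically. The following is a rough sketch under that assumption, not something the original answer proposes: make_iterator_type and count_to_three are hypothetical names, and the namespace carries both __next__ (Python 3) and next (Python 2) so that either interpreter finds the method at class-creation time.

```python
def make_iterator_type(name, next_impl):
    """Build a new iterator class whose next method is `next_impl`."""

    def __iter__(self):
        return self

    namespace = {
        "__iter__": __iter__,
        "__next__": next_impl,  # Python 3 spelling
        "next": next_impl,      # Python 2 spelling
    }
    # The three-argument form of type() runs the normal class-creation
    # machinery, so the method is present when the type is set up.
    return type(name, (object,), namespace)


def count_to_three(self):
    """Yield 1, 2, 3 and then stop; state lives on the instance."""
    self.n = getattr(self, "n", 0) + 1
    if self.n > 3:
        raise StopIteration
    return self.n


CountToThree = make_iterator_type("CountToThree", count_to_three)
values = list(CountToThree())
print(values)
```

Because the method is part of the class namespace from the start, both for loops and the next() builtin treat instances of the generated class as iterators.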

Now, as Glenn points out, this is almost certainly a bug in CPython… Especially given that correcting it would only impact performance when either the object being tested isn't iterable or dynamically assigns a next method (very approximately):

#define PyIter_Check(obj) \
    (((PyType_HasFeature((obj)->ob_type, Py_TPFLAGS_HAVE_ITER) && \
       (obj)->ob_type->tp_iternext != NULL && \
       (obj)->ob_type->tp_iternext != &_PyObject_NextNotImplemented)) || \
      (PyObject_HasAttrString((obj), "next") && \
       PyCallable_Check(PyObject_GetAttrString((obj), "next"))))

Edit: after a little bit of digging, the fix would not be this simple, because at least some portions of the code assume that, if PyIter_Check(it) returns true, then *it->ob_type->tp_iternext will exist… Which isn't necessarily the case (i.e., because the next function exists on the instance, not the type).

SO! That's why surprising things happen when you try to iterate over a new-style instance with a dynamically assigned next method.

answered Nov 15 '22 17:11

David Wolever

Tags:

python

iterator

ollyc
