As part of some WSGI middleware I want to write a Python class that wraps an iterator in order to add a close method to it.
This works fine when I try it with an old-style class, but throws a TypeError when I try it with a new-style class. What do I need to do to get this working with a new-style class?
Example:
    class IteratorWrapper1:
        def __init__(self, otheriter):
            self._iterator = otheriter
            self.next = otheriter.next

        def __iter__(self):
            return self

        def close(self):
            if getattr(self._iterator, 'close', None) is not None:
                self._iterator.close()
            # other arbitrary resource cleanup code here

    class IteratorWrapper2(object):
        def __init__(self, otheriter):
            self._iterator = otheriter
            self.next = otheriter.next

        def __iter__(self):
            return self

        def close(self):
            if getattr(self._iterator, 'close', None) is not None:
                self._iterator.close()
            # other arbitrary resource cleanup code here

    if __name__ == "__main__":
        for i in IteratorWrapper1(iter([1, 2, 3])):
            print i
        for j in IteratorWrapper2(iter([1, 2, 3])):
            print j
Gives the following output:
1
2
3
Traceback (most recent call last):
...
TypeError: iter() returned non-iterator of type 'IteratorWrapper2'
The __next__() method must return the next item in the sequence. On reaching the end, and in subsequent calls, it must raise StopIteration. Here, we show an example that will give us the next power of 2 in each iteration.
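The example referred to above is not reproduced in this text. A minimal sketch of such an iterator might look like the following; the class name PowersOfTwo and its max_exponent parameter are made up for illustration, and the code assumes Python 3 spelling (__next__), with a next alias for Python 2.

```python
class PowersOfTwo(object):
    """Hypothetical iterator yielding 2**0, 2**1, ..., 2**max_exponent."""

    def __init__(self, max_exponent):
        self.max_exponent = max_exponent
        self.exponent = 0

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Once exhausted, keep raising StopIteration on every later call.
        if self.exponent > self.max_exponent:
            raise StopIteration
        value = 2 ** self.exponent
        self.exponent += 1
        return value

    next = __next__  # Python 2 spelling of the same method


powers = list(PowersOfTwo(4))
print(powers)
```

Because __next__ is defined on the class rather than assigned on an instance, both the for loop and the next() built-in can find it.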
The iter() built-in returns an iterator object that goes through each element of the given object; it does so by calling the object's __iter__() method. The next element is then accessed through the iterator's __next__() method. When iter() is instead given a callable and a sentinel value, the callable is invoked repeatedly until it returns the sentinel (or raises StopIteration).
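As an illustration of the callable-plus-sentinel form, here is a small sketch that reads lines from an in-memory stream until a blank line appears. The sample text and variable names are invented for the example; it assumes Python 3.

```python
import io

# Hypothetical input: two lines, a blank line, then one more line.
stream = io.StringIO("alpha\nbeta\n\ngamma\n")

# iter(callable, sentinel) calls stream.readline() repeatedly and stops
# as soon as a call returns the sentinel, here the blank line "\n".
lines_before_blank = list(iter(stream.readline, "\n"))
print(lines_before_blank)
```

The sentinel itself is not included in the result, and anything after it stays unread in the stream.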
An iterator in Python is an object used to iterate over iterable objects such as lists, tuples, dicts, and sets. An iterator is obtained by calling the iter() built-in on the iterable, and items are then fetched with the next() built-in. __next__(): this method returns the next value from the iterator (in Python 2 it is spelled next()).
You can create an iterator object by applying the iter built-in function to an iterable. An iterator can be used to manually loop over the items in the iterable. Repeatedly passing the iterator to the built-in next function returns successive items in the stream.
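The manual loop described above can be sketched as follows. This is roughly what a for loop does behind the scenes; the list contents and variable names are arbitrary.

```python
numbers = [10, 20, 30]
iterator = iter(numbers)  # ask the iterable for an iterator

collected = []
while True:
    try:
        item = next(iterator)  # fetch the next item in the stream
    except StopIteration:
        break  # the iterator is exhausted
    collected.append(item)

print(collected)
```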
What you're trying to do makes sense, but there's something evil going on inside Python here.
    class foo(object):
        c = 0

        def __init__(self):
            self.next = self.next2

        def __iter__(self):
            return self

        def next(self):
            if self.c == 5: raise StopIteration
            self.c += 1
            return 1

        def next2(self):
            if self.c == 5: raise StopIteration
            self.c += 1
            return 2

    it = iter(foo())
    # Outputs: <bound method foo.next2 of <__main__.foo object at 0xb7d5030c>>
    print it.next
    # 2
    print it.next()
    # 1?!
    for x in it:
        print x
foo() is an iterator which modifies its next method on the fly--perfectly legal anywhere else in Python. The iterator we create, it, has the method we expect: it.next is next2. When we use the iterator directly, by calling next(), we get 2. Yet, when we use it in a for loop, we get the original next, which we've clearly overwritten.
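The same effect can be reproduced in Python 3, where the method is spelled __next__. The sketch below is a reduced version of the example above (the class name Foo and the returned strings are just for illustration); it suggests that the iteration machinery looks the method up on the type and ignores the instance attribute.

```python
class Foo:
    def __init__(self):
        # Instance attribute that shadows the method for ordinary lookups.
        self.__next__ = lambda: "instance"

    def __iter__(self):
        return self

    def __next__(self):
        return "class"


it = Foo()
explicit = it.__next__()  # normal attribute lookup finds the instance attribute
implicit = next(it)       # the iterator protocol uses the method on type(it)
print(explicit, implicit)
```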
I'm not familiar with Python internals, but it seems like an object's "next" method is being cached in tp_iternext (http://docs.python.org/c-api/typeobj.html#tp_iternext), and then it's not updated when the class is changed.
This is definitely a Python bug. Maybe this is described in the generator PEPs, but it's not in the core Python documentation, and it's completely inconsistent with normal Python behavior.
You could work around this by keeping the original next function, and wrapping it explicitly:
    class IteratorWrapper2(object):
        def __init__(self, otheriter):
            self.wrapped_iter_next = otheriter.next

        def __iter__(self):
            return self

        def next(self):
            return self.wrapped_iter_next()

    for j in IteratorWrapper2(iter([1, 2, 3])):
        print j
... but that's obviously less efficient, and you should not have to do that.
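Putting the workaround together with the close method from the question, a wrapper could be sketched like this. The class name ClosingIteratorWrapper and the optional cleanup callback are assumptions made for the example, not part of the original code. The method is defined on the class under both spellings so the sketch should run on Python 2.6+ and Python 3.

```python
class ClosingIteratorWrapper(object):
    """Hypothetical wrapper: delegates iteration and adds close()."""

    def __init__(self, other_iter, cleanup=None):
        self._iterator = iter(other_iter)
        self._cleanup = cleanup  # optional callable, assumed for illustration

    def __iter__(self):
        return self

    def __next__(self):
        # Defined on the class, so for loops and next() can find it.
        return next(self._iterator)

    next = __next__  # Python 2 spelling

    def close(self):
        # Close the wrapped iterator if it supports closing (WSGI convention).
        close = getattr(self._iterator, "close", None)
        if close is not None:
            close()
        # Other arbitrary resource cleanup goes here.
        if self._cleanup is not None:
            self._cleanup()


events = []
wrapper = ClosingIteratorWrapper(iter([1, 2, 3]),
                                 cleanup=lambda: events.append("cleaned"))
items = list(wrapper)
wrapper.close()
print(items, events)
```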
There are a bunch of places where CPython takes surprising shortcuts based on class properties instead of instance properties. This is one of those places.
Here is a simple example that demonstrates the issue:
    class DynamicNext(object):
        def __init__(self):
            # next is assigned on the instance, not defined on the class
            self.next = lambda: 42
And here's what happens:
    >>> instance = DynamicNext()
    >>> next(instance)
    …
    TypeError: DynamicNext object is not an iterator
    >>>
Now, digging into the CPython source code (from 2.7.2), here's the implementation of the next() builtin:
    static PyObject *
    builtin_next(PyObject *self, PyObject *args)
    {
        …
        if (!PyIter_Check(it)) {
            PyErr_Format(PyExc_TypeError,
                "%.200s object is not an iterator",
                it->ob_type->tp_name);
            return NULL;
        }
        …
    }
And here's the implementation of PyIter_Check:
    #define PyIter_Check(obj) \
        (PyType_HasFeature((obj)->ob_type, Py_TPFLAGS_HAVE_ITER) && \
         (obj)->ob_type->tp_iternext != NULL && \
         (obj)->ob_type->tp_iternext != &_PyObject_NextNotImplemented)
The first line, PyType_HasFeature(…), is, after expanding all the constants and macros and stuff, equivalent to instance.__class__.__flags__ & 1L<<7 != 0 (Py_TPFLAGS_HAVE_ITER is defined as 1L<<7 in Include/object.h):

    >>> instance.__class__.__flags__ & 1L<<7 != 0
    True
So that check obviously isn't failing… Which must mean that the next check, (obj)->ob_type->tp_iternext != NULL, is failing.
In Python, this line is roughly (roughly!) equivalent to hasattr(type(instance), "next"):

    >>> type(instance)
    __main__.DynamicNext
    >>> hasattr(type(instance), "next")
    False
Which obviously fails because the DynamicNext type doesn't have a next method; only instances of that type do.
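The difference between the instance and its type can be checked directly. The sketch below uses Python 3 spelling (__next__ instead of next), which is an assumption relative to the Python 2 session above; the variable names are illustrative.

```python
class DynamicNext:
    def __init__(self):
        # Assigned on the instance only; the class defines no __next__.
        self.__next__ = lambda: 42


instance = DynamicNext()
on_instance = hasattr(instance, "__next__")    # attribute exists on the instance
on_type = hasattr(type(instance), "__next__")  # but not on the type

try:
    next(instance)
    error = None
except TypeError as exc:
    error = exc  # next() rejects the object as a non-iterator

print(on_instance, on_type, error)
```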
Now, my CPython foo is weak, so I'm going to have to start making some educated guesses here… But I believe they are accurate.
When a CPython type is created (that is, when the interpreter first evaluates the class block and the class' metaclass' __new__ method is called), the values on the type's PyTypeObject struct are initialized… So if, when the DynamicNext type is created, no next method exists, the tp_iternext field will be set to NULL, causing PyIter_Check to return false.
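One hedged addition to the guess above: in CPython, the slot also appears to be refreshed when the method is later assigned on the class itself (as opposed to an instance). The sketch below, in Python 3 spelling with a made-up class name LateNext, shows next() failing before the assignment and succeeding after it.

```python
class LateNext:
    def __iter__(self):
        return self


instance = LateNext()

try:
    next(instance)
    failed_before = False
except TypeError:
    failed_before = True  # no __next__ on the type yet

# Assign the method on the class, not on the instance.
LateNext.__next__ = lambda self: 42
value_after = next(instance)

print(failed_before, value_after)
```

So a wrapper that must pick its next implementation dynamically can still do so, as long as the method the protocol sees lives on the type and delegates to whatever the instance holds.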
Now, as Glenn points out, this is almost certainly a bug in CPython… Especially given that correcting it would only impact performance when either the object being tested isn't iterable or dynamically assigns a next method (very approximately):
    #define PyIter_Check(obj) \
        (((PyType_HasFeature((obj)->ob_type, Py_TPFLAGS_HAVE_ITER) && \
           (obj)->ob_type->tp_iternext != NULL && \
           (obj)->ob_type->tp_iternext != &_PyObject_NextNotImplemented)) || \
         (PyObject_HasAttrString((obj), "next") && \
          PyCallable_Check(PyObject_GetAttrString((obj), "next"))))
Edit: after a little bit of digging, the fix would not be this simple, because at least some portions of the code assume that, if PyIter_Check(it) returns true, then *it->ob_type->tp_iternext will exist… Which isn't necessarily the case (i.e., because the next function exists on the instance, not the type).
SO! That's why surprising things happen when you try to iterate over a new-style instance with a dynamically assigned next method.