How to have list() consume __iter__ without calling __len__?

Tags:

python

I have a class with both an __iter__ and a __len__ method. The latter uses the former to count all elements.

It works like the following:

class A:
    def __iter__(self):
        print("iter")
        for _ in range(5):
            yield "something"

    def __len__(self):
        print("len")
        n = 0
        for _ in self:
            n += 1
        return n

Now if we take e.g. the length of an instance it prints len and iter, as expected:

>>> len(A())
len
iter
5

But if we call list() it calls both __iter__ and __len__:

>>> list(A())
len
iter
iter
['something', 'something', 'something', 'something', 'something']

It works as expected if we make a generator expression:

>>> list(x for x in A())
iter
['something', 'something', 'something', 'something', 'something']

I would expect list(A()) and list(x for x in A()) to behave the same, but they don’t.

Note that it appears to first call __iter__, then __len__, then loop over the iterator:

class B:
    def __iter__(self):
        print("iter")

        def gen():
            print("gen")
            yield "something"

        return gen()

    def __len__(self):
        print("len")
        return 1

print(list(B()))

Output:

iter
len
gen
['something']

How can I get list() not to call __len__, so that my instance is not iterated twice? I could define e.g. a length or size method, to be called as A().size(), but that’s less Pythonic.

I tried to compute the length in __iter__ and cache it so that subsequent calls to __len__ don’t need to iterate again, but list() calls __len__ before it starts iterating, so that doesn’t work.

Note that in my case I work on very large data collections so caching all items is not an option.

asked May 12 '16 14:05

bfontaine

1 Answer

It's a safe bet that the list() constructor is detecting that len() is available and calling it in order to pre-allocate storage for the list.
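To make that guess a bit more concrete: as far as I know, CPython's list() asks the iterable for a size estimate through the length-hint protocol (PEP 424), which tries __len__ first and falls back to __length_hint__. The sketch below is only an illustration under that assumption; the Hinted class and its calls attribute are made up for the demo and are not part of the question.

```python
import operator


class Hinted:
    """Hypothetical iterable that offers only a cheap size estimate."""

    def __init__(self, n):
        self._n = n
        self.calls = []  # illustration only: records which methods ran

    def __iter__(self):
        self.calls.append("iter")
        return iter(range(self._n))

    def __length_hint__(self):
        # An estimate is enough; the hint does not have to be exact.
        self.calls.append("hint")
        return self._n


h = Hinted(3)
items = list(h)
calls_after_list = list(h.calls)  # snapshot before the explicit query below

print(items)
print(calls_after_list)
print(operator.length_hint(h))  # the same protocol, queried explicitly
```

On CPython I would expect the snapshot to contain both "iter" and "hint", which is consistent with the iter / len / gen order shown for class B in the question. Other interpreters may behave differently.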

Your implementation is pretty much completely backwards. You are implementing __len__() by using __iter__(), which is not what Python expects. The expectation is that len() is a fast, efficient way to determine the length in advance.

I don't think you can convince list(A()) not to call len. As you have already observed, you can create an intermediate step that prevents len from being called.
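One such intermediate step, besides the generator expression from the question, is to hand list() the iterator rather than the instance. This is a sketch based on the question's class A; the calls list replaces the print statements and is added purely so the behavior can be inspected. It relies on the generator returned by __iter__ having no __len__ of its own, which is what I observe on CPython.

```python
class A:
    def __init__(self):
        self.calls = []  # illustration only: records what gets invoked

    def __iter__(self):
        self.calls.append("iter")
        for _ in range(5):
            yield "something"

    def __len__(self):
        self.calls.append("len")
        return sum(1 for _ in self)


a = A()
# list() only sees the generator object here, never `a` itself,
# so there is no __len__ for it to find.
items = list(iter(a))

print(items)
print(a.calls)
```

With this, only a single pass over the data is recorded and __len__ is not called, while len(a) still works for callers who explicitly want the count.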

You should definitely cache the result if the sequence is immutable. With as many items as you describe, there's no sense computing the length more than once.
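A minimal sketch of caching just the count (not the items), assuming the underlying data never changes. CachedLength, its range-based data source and the passes counter are hypothetical stand-ins for your real collection.

```python
class CachedLength:
    """Hypothetical immutable collection that is expensive to scan."""

    def __init__(self, n):
        self._n = n
        self._length = None  # filled in on the first len() call
        self.passes = 0      # illustration only: counts full scans started

    def __iter__(self):
        self.passes += 1
        for i in range(self._n):
            yield i

    def __len__(self):
        if self._length is None:
            # One full pass to count; only the integer is stored.
            self._length = sum(1 for _ in self)
        return self._length


c = CachedLength(1000)
first = len(c)
second = len(c)  # answered from the cache, no second scan

print(first, second, c.passes)
```

This does not stop the very first list(c) from triggering a counting pass if the length has not been computed yet, so it complements the intermediate-step workaround rather than replacing it.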

answered Oct 14 '22 11:10

aghast
