class Foo:
    def __getitem__(self, item):
        print('getitem', item)
        if item == 6:
            raise IndexError
        return item**2

    def __len__(self):
        print('len')
        return 3


class Bar:
    def __iter__(self):
        print('iter')
        return iter([3, 5, 42, 69])

    def __len__(self):
        print('len')
        return 3
Demo:
>>> list(Foo())
len
getitem 0
getitem 1
getitem 2
getitem 3
getitem 4
getitem 5
getitem 6
[0, 1, 4, 9, 16, 25]
>>> list(Bar())
iter
len
[3, 5, 42, 69]
Why does list call __len__? It doesn't seem to use the result for anything obvious. A for loop doesn't do it. This isn't mentioned anywhere in the iterator protocol, which just talks about __iter__ and __next__.
Is this Python reserving space for the list in advance, or something clever like that?
(CPython 3.6.0 on Linux)
See the Rationale section of PEP 424, which introduced __length_hint__ and offers insight into the motivation:
Being able to pre-allocate lists based on the expected size, as estimated by __length_hint__, can be a significant optimization. CPython has been observed to run some code faster than PyPy, purely because of this optimization being present.
In addition to that, the documentation for object.__length_hint__ confirms that this is purely an optimization feature:
Called to implement operator.length_hint(). Should return an estimated length for the object (which may be greater or less than the actual length). The length must be an integer >= 0. This method is purely an optimization and is never required for correctness.
So __length_hint__ is here because it can result in some nice optimizations.
PyObject_LengthHint first tries to get a value from object.__len__ (if it is defined) and then checks whether object.__length_hint__ is available. If neither is there, it returns the default value supplied by the caller, which is 8 for lists.
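The same lookup order can be observed from Python through operator.length_hint(), which is the documented Python-level counterpart of this function. The sketch below is only an illustration: the four class names are made up for the example, and the default of 8 is passed explicitly to mirror what the list code does.

```python
from operator import length_hint


class WithLen:
    """Hypothetical class that only defines __len__."""
    def __len__(self):
        return 3


class WithHint:
    """Hypothetical class that only defines __length_hint__."""
    def __length_hint__(self):
        return 10


class WithBoth:
    """Hypothetical class defining both; __len__ is consulted first."""
    def __len__(self):
        return 3

    def __length_hint__(self):
        return 10


class Neither:
    """Hypothetical class defining neither method."""


print(length_hint(WithLen()))      # 3, taken from __len__
print(length_hint(WithHint()))     # 10, taken from __length_hint__
print(length_hint(WithBoth()))     # 3, __len__ wins over __length_hint__
print(length_hint(Neither(), 8))   # 8, the caller-supplied default
```

Note that the value is only a hint: in the question's demo, Foo reports a length of 3 yet the resulting list has 6 elements.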
listextend, which is called from list_init as Eli stated in his answer, was modified according to this PEP to offer this optimization for anything that defines either a __len__ or a __length_hint__.
list isn't the only one that benefits from this, of course; bytes objects do:
>>> bytes(Foo())
len
getitem 0
...
b'\x00\x01\x04\t\x10\x19'
so do bytearray objects, but only when you extend them:
>>> bytearray().extend(Foo())
len
getitem 0
...
and tuple objects, which create an intermediary sequence to populate themselves:
>>> tuple(Foo())
len
getitem 0
...
(0, 1, 4, 9, 16, 25)
If anybody is wondering why exactly 'iter' is printed before 'len' in class Bar, and not after as happens with class Foo:
This is because, if the object in hand defines an __iter__, Python will first call it to get the iterator, thereby running the print('iter') too. The same doesn't happen if it falls back to using __getitem__, since building the fallback sequence iterator does not call any of the object's methods.
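A rough way to inspect the call order without reading interleaved prints is to record each call in a list. This is a sketch only: ViaIter, ViaGetitem and the events list are names invented for the example, and the exact order recorded is a CPython implementation detail that other interpreters or versions may not share.

```python
events = []  # shared log of which special methods were called, in order


class ViaIter:
    """Hypothetical class that is iterable through __iter__."""
    def __iter__(self):
        events.append('iter')
        return iter([3, 5, 42, 69])

    def __len__(self):
        events.append('len')
        return 3


class ViaGetitem:
    """Hypothetical class that is iterable only through __getitem__."""
    def __getitem__(self, item):
        events.append(('getitem', item))
        if item == 2:
            raise IndexError  # signals the end of the sequence
        return item ** 2

    def __len__(self):
        events.append('len')
        return 3


result_iter = list(ViaIter())
iter_events = list(events)      # snapshot of the calls made for ViaIter

events.clear()
result_getitem = list(ViaGetitem())
getitem_events = list(events)   # snapshot of the calls made for ViaGetitem

print(result_iter, iter_events)
print(result_getitem, getitem_events)
```

On the CPython version from the question, the first log should start with 'iter' and the second with 'len', matching the demo output.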