In Python 2, the built-in function map seems to call __len__ when it is defined. Is that correct? If so, why are we computing the length of the iterable we are mapping over? Iterables don't need to define __len__ (see the example below), and map works even when the iterable defines no length.
map is defined here; the docs do specify length-dependent behaviour when multiple iterables are passed (a shorter iterable is treated as if extended with None items), but that doesn't explain a length lookup for a single iterable.
I am concerned because according to several 1 2 extremely highly upvoted questions,
map(f, iterable)
is basically equivalent to:
[f(x) for x in iterable]
But I am running into simple examples where that isn't true.
For Example
class Iterable:
    def __iter__(self):
        self.iterable = [1, 2, 3, 4, 5].__iter__()
        return self

    def next(self):
        return self.iterable.next()

    #def __len__(self):
    #    self.iterable = None
    #    return 5

def foo(x): return x

print([foo(x) for x in Iterable()])
print(map(foo, Iterable()))
Behaves as it should, but if you uncomment the overloading of __len__, it very much does not. In this case, it raises an AttributeError because self.iterable is None. While the example's behaviour is silly, I see no requirement of invariance in the specification of __len__. Surely it's good practice not to modify state in a call to __len__, but the reason should not be unexpected behaviour in built-in functions. In more realistic cases, my __len__ may just be slow, and I don't expect to have to worry about it being called by map; or maybe it isn't thread-safe, etc.
Implementation Dependent?
Since map is a built-in function, it may have implementation-specific behaviour outside the spec, but CPython implements it on line 918 of bltinmodule.c, which indeed states:

/* Do a first pass to obtain iterators for the arguments, and set len
 * to the largest of their lengths. */

It then calls _PyObject_LengthHint, which is defined in Objects/abstract.c and does indeed look for an overridden __len__. This doesn't clarify whether the behaviour is merely implementation-dependent, or whether I'm missing some reason that map purposefully looks up the iterable's length, against my instinct.
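For what it's worth, the lookup order that _PyObject_LengthHint implements (try len(), then __length_hint__, then a default) is exposed in Python 3 as operator.length_hint, which makes the behaviour easy to observe. A small sketch; the class name Hinted is mine:

```python
import operator

class Hinted(object):
    """Iterable that, like the example above, also reports a length."""
    def __iter__(self):
        return iter([1, 2, 3, 4, 5])
    def __len__(self):
        return 5

print(operator.length_hint(Hinted()))              # 5: found via __len__
print(operator.length_hint(iter([1, 2])))          # 2: list iterators define __length_hint__
print(operator.length_hint((x for x in "ab"), 0))  # 0: generators give no hint, default used
```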
(Note that I haven't tested this in Python 3; that is why I specified Python 2. In Python 3, map returns an iterator rather than a list, so at least a few of my claims don't apply there.)
List comprehensions were added with Python 2.0. Essentially, they are Python's way of implementing the well-known set-builder notation used by mathematicians. In mathematics, the squares of the natural numbers are, for example, written { x² | x ∈ ℕ }, and the set of complex integers as { (x, y) | x ∈ ℤ ∧ y ∈ ℤ }.
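That notation translates almost directly into comprehensions; a minimal illustration, truncating the infinite sets to finite ranges:

```python
# Squares of the first natural numbers, mirroring { x² | x ∈ ℕ }
squares = [x ** 2 for x in range(10)]
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# A finite window of { (x, y) | x ∈ ℤ ∧ y ∈ ℤ }
gaussian = [(x, y) for x in range(-1, 2) for y in range(-1, 2)]
print(len(gaussian))  # 9
```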
But calling _PyObject_LengthHint is supposed to be basically equivalent to not calling it. An object's __len__ or __length_hint__ is not supposed to mutate the object like this. You might as well say that map(f, iterable) and [f(x) for x in iterable] are inequivalent because if f uses stack inspection to determine whether it's being called from map, and does something different when it is, the two snippets behave differently.
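To make the stack-inspection analogy concrete, here is a contrived sketch; run_map is a made-up helper. On CPython, map adds no Python frame of its own, so f's nearest Python caller inside run_map is run_map itself:

```python
import inspect

def f(x):
    # Contrived: behave differently depending on the caller's frame name.
    caller = inspect.stack()[1].function
    return -x if caller == "run_map" else x

def run_map(func, xs):
    # map is implemented in C, so func's nearest Python frame here is run_map's.
    return list(map(func, xs))

print(run_map(f, [1, 2]))      # [-1, -2]
print([f(x) for x in [1, 2]])  # [1, 2]: the comprehension is not run_map
```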
As for why map does this: it's trying to preallocate the result list to the right size, to avoid having to resize it as it grows. Resizes only slow things down by a constant factor, but if you can avoid that constant factor, why not? It would be perfectly reasonable for list comprehensions to do the same in a future Python version.
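The preallocation strategy can be sketched in pure Python. map_sketch is a hypothetical stand-in, using operator.length_hint (Python 3) where Python 2's map uses _PyObject_LengthHint:

```python
import operator

def map_sketch(f, iterable):
    # Ask for a size guess first, like map's call to _PyObject_LengthHint.
    n = operator.length_hint(iterable, 0)
    result = [None] * n          # preallocate to the hinted size
    i = 0
    for x in iterable:
        if i < n:
            result[i] = f(x)     # fill a preallocated slot
        else:
            result.append(f(x))  # hint was too small: fall back to append
        i += 1
    del result[i:]               # hint was too large: trim unused slots
    return result

print(map_sketch(lambda x: x * 2, [1, 2, 3]))  # [2, 4, 6]
print(map_sketch(str, (c for c in "ab")))      # ['a', 'b'] (no hint, appended)
```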