Python 2, map not equivalent to list comprehension in simple case; length dependent

Tags:

python

python-2.x

In Python 2, the built-in function map seems to call __len__ when the iterable defines it. Is that correct? If so, why is map computing the length of the iterable? Iterables don't need to define __len__, and map works even when the iterable defines no length.

The documentation for map does specify length-dependent behaviour when multiple iterables are passed. However:

I'm interested in the case where only one iterable is passed.
Even if multiple iterables were passed (not my question), it seems like an odd design choice to explicitly check the length, instead of just iterating until an iterable runs out and then returning None for it.
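
To illustrate the second point, here is a minimal sketch of padding shorter iterables with None without ever asking for a length. It is written in Python 3 syntax, and padded_map is a hypothetical helper name, not how Python 2's map is actually implemented:

```python
from itertools import zip_longest


def padded_map(func, *iterables):
    """Hypothetical eager map that pads exhausted iterables with None."""
    # zip_longest keeps iterating until the longest iterable runs out,
    # so no call to len() is needed to know when to start padding.
    return [func(*args) for args in zip_longest(*iterables, fillvalue=None)]


pairs = padded_map(lambda a, b: (a, b), [1, 2, 3], "ab")
print(pairs)  # [(1, 'a'), (2, 'b'), (3, None)]
```

The padding falls out of plain iteration here, which is the behaviour the point above expects.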

I am concerned because, according to several extremely highly upvoted questions,

map(f, iterable)

is basically equivalent to:

[f(x) for x in iterable]

But I am running into simple examples where that isn't true.

For example:

class Iterable:

    def __iter__(self):
        # Python 2 iterator protocol: store a fresh list iterator, return self.
        self.iterable = iter([1, 2, 3, 4, 5])
        return self

    def next(self):
        return self.iterable.next()

    # Uncomment to make map fail with an AttributeError:
    #def __len__(self):
    #    self.iterable = None
    #    return 5


def foo(x): return x

print( [foo(x) for x in Iterable()] )
print( map(foo, Iterable()) )

This behaves as it should, but if you uncomment the __len__ override, it very much does not.

In this case, it raises an AttributeError because self.iterable is None. While this particular behaviour is silly, I see no requirement in the specification of len that it leave the object unchanged. Surely it's good practice not to modify state in a call to len, but the reason should not be unexpected behaviour in builtin functions. In more realistic cases, my __len__ method may just be slow, and I wouldn't expect to have to worry about map calling it; or maybe it isn't thread-safe, etc.

Implementation Dependent?

Since map is a builtin function, it may have implementation-specific features outside the spec, but CPython implements it on line 918 of bltinmodule.c, which indeed states:

/* Do a first pass to obtain iterators for the arguments, and set len
 * to the largest of their lengths.
 */

It then calls _PyObject_LengthHint, which is defined in Objects/abstract.c and indeed seems to look for a user-defined __len__. This doesn't clarify for me whether this is just implementation-dependent, or whether I'm missing some reason that map deliberately looks for the iterable's length, against my instinct.
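
As a rough illustration of that lookup, Python 3.4+ exposes operator.length_hint. This sketch assumes it follows the same order as the C helper (first __len__, then __length_hint__, then a caller-supplied default); the three class names are made up for the example:

```python
import operator


class WithLen:
    def __len__(self):
        return 5


class WithHint:
    def __length_hint__(self):
        return 3


class Neither:
    pass


# The second argument is the default used when no length information exists.
hints = [
    operator.length_hint(WithLen(), 8),
    operator.length_hint(WithHint(), 8),
    operator.length_hint(Neither(), 8),
]
print(hints)  # [5, 3, 8]
```

So an object that defines __len__ will have it called by anything that asks for a hint, even if the caller only wanted an estimate.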

(Note: I haven't tested this in Python 3, which is why I specified Python 2. In Python 3, map returns a lazy iterator rather than a list, so at least a few of my claims don't apply there.)


asked Dec 02 '15 19:12

en_Knight


1 Answer

I am concerned because, according to several extremely highly upvoted questions,

map(f, iterable)

is basically equivalent to:

[f(x) for x in iterable]

But I am running into simple examples where that isn't true.

But calling _PyObject_LengthHint is supposed to be basically equivalent to not calling it. An object's __len__ or __length_hint__ is not supposed to mutate the object like this. You might as well say that map(f, iterable) and [f(x) for x in iterable] are inequivalent because f could use stack inspection to determine whether it's being called from map and do something different, in which case the two snippets would behave differently.
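
For contrast, here is a minimal sketch of an iterable whose __len__ has no side effects, written in Python 3 syntax (so map is wrapped in list to force evaluation). The Numbers class is a made-up example, not code from the question:

```python
class Numbers:
    """Iterable with a side-effect-free __len__."""

    def __init__(self, values):
        self._values = list(values)

    def __iter__(self):
        # Return a fresh iterator each time instead of storing state on self.
        return iter(self._values)

    def __len__(self):
        # Only reads state, never changes it.
        return len(self._values)


def foo(x):
    return x


numbers = Numbers([1, 2, 3, 4, 5])
from_map = list(map(foo, numbers))
from_comprehension = [foo(x) for x in numbers]
print(from_map == from_comprehension)  # True
```

With this shape it makes no difference whether or when a builtin asks for the length.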

As for why map does this, it's trying to preallocate the list to the right size to avoid needing to resize it. Resizes only slow things down by a constant factor, but if you can avoid the constant factor, why not? It would be perfectly reasonable for list comprehensions to do this in a future Python version.
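
The preallocation idea can be sketched in pure Python 3. This is only an approximation under stated assumptions: eager_map and Counted are hypothetical names, and operator.length_hint stands in for the C-level _PyObject_LengthHint call:

```python
import operator


def eager_map(func, iterable):
    """Hypothetical eager map that preallocates its result from a length hint."""
    iterator = iter(iterable)                  # first obtain the iterator
    hint = operator.length_hint(iterable, 0)   # may call __len__ or __length_hint__
    result = [None] * hint                     # preallocate to the hinted size
    count = 0
    for item in iterator:
        value = func(item)
        if count < hint:
            result[count] = value              # fill a preallocated slot
        else:
            result.append(value)               # the hint was too small
        count += 1
    del result[count:]                         # the hint was too large
    return result


class Counted:
    """Iterable whose __len__ records how often it is called."""

    def __init__(self, values, reported_length):
        self.values = list(values)
        self.reported_length = reported_length
        self.len_calls = 0

    def __iter__(self):
        return iter(self.values)

    def __len__(self):
        self.len_calls += 1
        return self.reported_length


exact = Counted([1, 2, 3], reported_length=3)
print(eager_map(str, exact), exact.len_calls)
```

The hint is only an optimisation: a wrong value costs a resize or a truncation, but the result is the same. That is also why the sketch, like the original, assumes asking for the length leaves the object untouched.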

answered Oct 22 '22 14:10

user2357112 supports Monica
