Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python 3.0 - dict methods return views - why?

dict methods dict.keys(), dict.items() and dict.values() return “views” instead of lists.

Source

First of all, how is a view different from an iterator? Secondly, what is the benefit of this change? Is it just for performance reasons?

It doesn't seem intuitive to me, i.e., I'm asking for a list of things (give me all your keys) and I'm getting something else back. Will this confuse people?

like image 332
Greg Avatar asked Dec 04 '08 14:12

Greg


2 Answers

You are effectively getting a list. It's just not a copy of the internal list, but something that acts as if it where a list but only represents the internal state.

That's the same way it's implemented in Java (and probably many other languages/environments as well).

The main reason is that for many use cases returning a completely detached list is unnecessary and wasteful. It would require copying the entire content (which may or many not be a lot).

If you simply want to iterate over the keys then creating a new list is not necessary. And if you indeed need it as a separate list (as a copy) then you can easily create that list from the view.

like image 59
Joachim Sauer Avatar answered Nov 15 '22 20:11

Joachim Sauer


Joachim Sauer's answer explains very well why a list is not returned. But this leaves the question why these functions wouldn't return iterators, just like iteritems etc. did in Python 2.

An iterator is much more restrictive than a container. For example, an iterator does not allow more than one pass; if you try a second pass, you'll find it's empty. Therefore, operations such as elem in cont are supported by containers, but cannot be supported by iterators: once you check whether an element is "in" the iterator, the iterator gets destroyed!

On the other hand, getting a container usually requires making a copy such as creating a list out of the dictionary's keys.

The view object has the best of both worlds: it behaves as a container, and yet does not make a copy of the dictionary! It is, in fact, a kind of a virtual read-only container that works by linking to the underlying dictionary. I don't know if it's seen anywhere else in the standard Python.

Edit:

@AntonyHatchkins: the reason it doesn't return a generator function is that it wouldn't allow for a fast in operation. Yes, in works for generator functions (when you call them). That is, you can do this:

def f():
  for i in range(10):
    yield i

5 in f() # True

But according to the definition of in, if the right side is a generator, python will go through all the n items of the generator - leading to O(n) time complexity. There's nothing you can do about it because that's the only meaningful behavior an arbitrary generator.

On the other hand, in the case of the dictionary view, you can implement in any way you like, because you know more about the data you manage. And in fact in is implemented with O(1) complexity using a hash table. You can check it by running

>>> d = dict(zip(range(50000000), range(50000000)))
>>> 49999999 in d
True
>>> 49999999 in iter(d) # kinda how generator function would work
True
>>>

and noticing how fast the first in is compared to the second in.

like image 30
max Avatar answered Nov 15 '22 20:11

max