
Why are collections not handled uniformly in Python?

Sets and lists are handled differently in Python, and there seems to be no uniform way to work with both. For example, adding an item to a set is done using the add method, and for the list it is done using the append method. I am aware that there are different semantics behind this, but there are also common semantics there, and often an algorithm that works with some collection cares more about the commonalities than the differences. The C++ STL shows that this can work, so why is there no such concept in Python?

Edit: In C++ I can use an output_iterator to store values in an (almost) arbitrary type of collection, including lists and sets. I can write an algorithm that takes such an iterator as an argument and writes elements to it. The algorithm is then completely agnostic to the kind of container (or other device, perhaps a file) that backs the iterator. If the backing container is a set that ignores duplicates, that is the caller's decision. My specific problem is that several times now I have used, for instance, a list for a certain task and later decided that a set was more appropriate. I then have to change append to add in several places in my code. I am just wondering why Python has no concept for such cases.

Björn Pollex asked Sep 14 '10 09:09



2 Answers

The direct answer: it's a design flaw.

You should be able to insert into any container where generic insertion makes sense (e.g. excluding dict) with the same method name. There should be a consistent, generic name for insertion, e.g. add, corresponding to set.add and list.append, so you can add to a container without having to care as much about what you're inserting into.

Using different names for this operation in different types is a gratuitous inconsistency, and sets a poor base standard: the library should encourage user containers to use a consistent API, rather than providing largely incompatible APIs for each basic container.
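
A minimal sketch of what such a uniform name could look like as a free function. The helper name `insert` is hypothetical and not part of the standard library. It dispatches on the abstract base classes in `collections.abc`, so user containers that subclass or register with `MutableSet` or `MutableSequence` should work too:

```python
from collections.abc import MutableSequence, MutableSet


def insert(container, item):
    """Hypothetical helper: add item to a set-like or list-like container."""
    if isinstance(container, MutableSet):
        container.add(item)        # set-like: duplicates are ignored
    elif isinstance(container, MutableSequence):
        container.append(item)     # list-like: always added at the end
    else:
        raise TypeError(f"cannot insert into {type(container).__name__}")


items = []
unique = set()
for value in (1, 2, 2, 3):
    insert(items, value)
    insert(unique, value)
```

The caller still chooses the semantics by choosing the container; only the spelling of the operation becomes uniform.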

That said, it's rarely a practical problem in this case: most of the time, when a function's result is a list of items, you can implement it as a generator. Generators let the caller consume the results consistently (from the perspective of the function) as a set, a list, or any other form of iteration:

def foo():
    yield 1
    yield 2
    yield 3

s = set(foo())                     # collect into a set
l = list(foo())                    # collect into a list
results1 = [i*2 for i in foo()]    # list comprehension
results2 = (i*2 for i in foo())    # lazy generator expression
for r in foo():
    print(r)
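
Where a function has to write into a container that already exists, a rough counterpart to the C++ output iterator is to pass the container's bound insertion method as a callback. This is only a sketch; the function name `produce` and the parameter name `emit` are made up for illustration:

```python
def produce(emit):
    """Call emit(value) for each result; the caller decides where values go."""
    for value in (1, 2, 2, 3):
        emit(value)


as_list = []
as_set = set()
produce(as_list.append)   # keeps order and duplicates
produce(as_set.add)       # the duplicate 2 is ignored
```

Switching from a list to a set then means changing the one place where the callback is chosen, not every call inside the algorithm.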
Glenn Maynard answered Oct 28 '22 02:10



add and append are different. Sets are unordered and contain unique elements, while append suggests that the item is always added, and specifically at the end.

Sets and lists can both be treated as iterables; that is their common semantics, and your algorithms are free to rely on it.

If you have an algorithm that depends on some sort of addition, you simply can't rely on sets, tuples, lists, dicts, and strings behaving the same.
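
A short illustration of how "adding" the same values differs across the built-in types; the variable names are arbitrary:

```python
values = [3, 1, 3]

as_list = []
as_set = set()
for v in values:
    as_list.append(v)   # every item is kept, in insertion order
    as_set.add(v)       # the second 3 is ignored

# Tuples and strings are immutable, so "adding" builds a new object each time.
as_tuple = ()
as_str = ""
for v in values:
    as_tuple += (v,)
    as_str += str(v)

# A dict needs a key for every value that is inserted.
as_dict = {}
for index, v in enumerate(values):
    as_dict[index] = v
```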

Ivo van der Wijk answered Oct 28 '22 01:10
