Order of operations in a dictionary comprehension

I came across the following interesting construct:

Assuming you have a list of lists as follows:

my_list = [['captain1', 'foo1', 'bar1', 'foobar1'], ['captain2', 'foo2', 'bar2', 'foobar2'], ...] 

and you want to create a dict out of them with the 0-index elements being the keys. A handy way to do it would be this:

my_dict = {x.pop(0): x for x in my_list} # {'captain1': ['foo1', 'bar1', 'foobar1'], ...} 

It seems that the pop precedes the assignment of the list x as the value, which is why 'captain1' does not appear in the values (it has already been popped).
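One caveat about that inference: this first comprehension gives the same result under either evaluation order, because each value is a reference to the very list that pop mutates. A minimal sketch, with the sample data shortened from the question and `same_objects` a name made up for illustration:

```python
my_list = [['captain1', 'foo1', 'bar1', 'foobar1'],
           ['captain2', 'foo2', 'bar2', 'foobar2']]

my_dict = {x.pop(0): x for x in my_list}

# Each value is the same list object that pop() mutated, not a copy,
# so it lacks the first element no matter when pop() ran.
same_objects = all(any(v is row for row in my_list) for v in my_dict.values())

print(my_dict)
print(same_objects)
```

So this construct alone cannot reveal whether the key or the value is evaluated first.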

Now let's take this a step further and try to get a structure like:

# {'captain1': {'column1': 'foo1', 'column2': 'bar1', 'column3': 'foobar1'}, ...} 

For this task I wrote the following:

my_headers = ['column1', 'column2', 'column3']
my_dict = {x.pop(0): {k: v for k, v in zip(my_headers, x)} for x in my_list}

but this returns:

# {'captain1': {'column3': 'bar1', 'column1': 'captain1', 'column2': 'foo1'}, 'captain2': {'column3': 'bar2', 'column1': 'captain2', 'column2': 'foo2'}}

so the pop in this case happens after the inner dictionary is constructed (or at least after the zip).

How can that be? How does this work?

The question is not about how to do it but rather why this behavior is seen.

I am using Python version 3.5.1.

asked Feb 13 '17 10:02 by Ma0



1 Answer

Note: As of Python 3.8 and PEP 572, this was changed and the keys are evaluated first.


tl;dr Until Python 3.7: Python evaluates the value first (the right-hand side of the expression), but this appears to be a bug in (C)Python according to the reference manual, the grammar, and the PEP on dict comprehensions.

Though this was previously fixed for dictionary displays, where values were likewise evaluated before the keys, the patch wasn't extended to cover dict comprehensions. This requirement was also mentioned by one of the core devs in a mailing list thread discussing this same subject.

According to the reference manual, Python evaluates expressions from left to right and assignments from right to left; a dict-comprehension is really an expression containing expressions, not an assignment*:

{expr1: expr2 for ...} 

where, according to the corresponding rule of the grammar, one would expect expr1: expr2 to be evaluated the same way as in displays. So both expressions should follow the defined order: expr1 should be evaluated before expr2 (and, if expr2 contains expressions of its own, they too should be evaluated from left to right).

The PEP on dict-comps additionally states that the following should be semantically equivalent:

The semantics of dict comprehensions can actually be demonstrated in stock Python 2.2, by passing a list comprehension to the built-in dictionary constructor:

>>> dict([(i, chr(65+i)) for i in range(4)])

is semantically equivalent to:

>>> {i : chr(65+i) for i in range(4)}

where the tuple (i, chr(65+i)) is evaluated left to right, as expected.
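The left-to-right order inside each tuple of the PEP's list-comprehension form can be observed directly. A small sketch, where `trace` and `log` are helper names invented for this illustration:

```python
log = []

def trace(value):
    """Record the order in which sub-expressions are evaluated."""
    log.append(value)
    return value

# The PEP's "stock Python" spelling: a list of (key, value) tuples.
pairs = dict([(trace(i), trace(chr(65 + i))) for i in range(2)])

print(pairs)  # {0: 'A', 1: 'B'}
print(log)    # [0, 'A', 1, 'B']: key before value in each tuple
```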

Changing this to behave according to the rules for expressions would, of course, create an inconsistency in how dicts are created: dictionary comprehensions and a for loop with assignments would then evaluate in a different order. That's fine, though, since each is just following its own rules.
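For comparison, the assignment rule can be checked in isolation. In this sketch (`trace` and `log` are again made-up helper names), the right-hand side of a subscript assignment runs before the key expression:

```python
log = []

def trace(label):
    """Record the order in which sub-expressions are evaluated."""
    log.append(label)
    return label

d = {}
d[trace('key')] = trace('value')  # assignment: right-hand side first

print(log)  # ['value', 'key']
```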

Though this isn't a major issue, it should be fixed (either the rule of evaluation or the docs) to disambiguate the situation.

*Internally, this does result in an assignment to a dictionary object but, this shouldn't break the behavior expressions should have. Users have expectations about how expressions should behave as stated in the reference manual.


As the other answerers pointed out, since you perform a mutating action in one of the expressions, you toss out any information on what gets evaluated first; using print calls, as Duncan did, sheds light on what is done.
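To see how much the mutation matters, both evaluation orders can be emulated with an explicit loop over the question's data. This is only a sketch: `build` and its `value_first` flag are hypothetical names, and the loop imitates the comprehension rather than reproducing CPython's internals.

```python
import copy

my_headers = ['column1', 'column2', 'column3']
rows = [['captain1', 'foo1', 'bar1', 'foobar1'],
        ['captain2', 'foo2', 'bar2', 'foobar2']]

def build(rows, value_first):
    """Emulate {x.pop(0): dict(zip(my_headers, x)) for x in rows}."""
    rows = copy.deepcopy(rows)  # keep the caller's lists untouched
    result = {}
    for x in rows:
        if value_first:
            # Order observed up to Python 3.7: value, then key.
            value = dict(zip(my_headers, x))
            key = x.pop(0)
        else:
            # Order documented for expressions: key, then value.
            key = x.pop(0)
            value = dict(zip(my_headers, x))
        result[key] = value
    return result

old_style = build(rows, value_first=True)
new_style = build(rows, value_first=False)

print(old_style['captain1'])
print(new_style['captain1'])
```

With the value built first, the inner dict is a snapshot taken before the pop, which is why 'captain1' shows up under 'column1' in the question's output.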

A function to help in showing the discrepancy:

def printer(val):
    print(val, end=' ')
    return val

(Fixed) dictionary display:

>>> d = {printer(0): printer(1), printer(2): printer(3)}
0 1 2 3

(Odd) dictionary comprehension:

>>> t = (0, 1), (2, 3)
>>> d = {printer(i):printer(j) for i,j in t}
1 0 3 2

And yes, this applies specifically to CPython. I am not aware of how other implementations evaluate this specific case (though they should all conform to the Python Reference Manual).
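Because the order depends on the interpreter and its version, it can be probed at runtime instead of assumed. A minimal sketch, where `comprehension_order` and `trace` are names invented here:

```python
import sys

def comprehension_order():
    """Return the order in which a dict comprehension evaluates key and value."""
    log = []

    def trace(label):
        log.append(label)
        return label

    # Single-iteration comprehension; only the side effects matter.
    {trace('key'): trace('value') for _ in range(1)}
    return log

order = comprehension_order()
print(sys.version_info[:2], order)
```

On CPython this should report the value first up to 3.7 and the key first from 3.8 onward, matching the note at the top of this answer.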

Digging through the source is always nice (and you also find hidden comments describing the behavior too), so let's peek in compiler_sync_comprehension_generator of the file compile.c:

case COMP_DICTCOMP:
    /* With 'd[k] = v', v is evaluated before k, so we do
       the same. */
    VISIT(c, expr, val);
    VISIT(c, expr, elt);
    ADDOP_I(c, MAP_ADD, gen_index + 1);
    break;

This might seem like a good enough reason and, if it is judged as such, the issue should instead be classified as a documentation bug.

On a quick test I did, switching these statements around (VISIT(c, expr, elt); getting visited first) while also switching the corresponding order in MAP_ADD (which is used for dict-comps):

TARGET(MAP_ADD) {
    PyObject *value = TOP();   /* was key */
    PyObject *key = SECOND();  /* was value */
    PyObject *map;
    int err;

results in the evaluation order one would expect based on the docs, with the key evaluated before the value. (This does not cover the asynchronous versions; those require another switch.)
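The MAP_ADD opcode can also be spotted from Python without rebuilding the interpreter, using the standard dis module. This sketch assumes a CPython interpreter; opcode names are implementation details that may change between versions, and `opnames` is a helper written for this illustration:

```python
import dis
import types

def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = [ins.opname for ins in dis.get_instructions(code)]
    for const in code.co_consts:
        # Older CPython versions compile the comprehension as a nested code object.
        if isinstance(const, types.CodeType):
            names.extend(opnames(const))
    return names

# Compiling is enough; the expression is never run, so 'pairs' need not exist.
code = compile("{k: v for k, v in pairs}", "<demo>", "eval")
names = opnames(code)

print("MAP_ADD" in names)
```

Calling dis.dis(code) on the same object prints the full listing, which shows the instructions that load the key and the value ahead of MAP_ADD.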


I'll drop a comment on the issue and update when and if someone gets back to me.

Created Issue 29652 -- Fix evaluation order of keys/values in dict comprehensions on the tracker. Will update the question when progress is made on it.

answered Oct 14 '22 17:10 by Dimitris Fasarakis Hilliard