
In-place custom object unpacking different behavior with __getitem__ python 3.5 vs python 3.6

a follow-up question on this question: i ran the code below on python 3.5 and python 3.6 - with very different results:

class Container:

    KEYS = ('a', 'b', 'c')

    def __init__(self, a=None, b=None, c=None):
        self.a = a
        self.b = b
        self.c = c

    def keys(self):
        return Container.KEYS

    def __getitem__(self, key):
        if key not in Container.KEYS:
            raise KeyError(key)
        return getattr(self, key)

    def __str__(self):
        # python 3.6
        # return f'{self.__class__.__name__}(a={self.a}, b={self.b}, c={self.c})'
        # python 3.5    
        return ('{self.__class__.__name__}(a={self.a}, b={self.b}, '
                'c={self.c})').format(self=self)

data0 = Container(a=1, b=2, c=3)
print(data0)

data3 = Container(**data0, b=7)
print(data3)

as stated in the previous question this raises

TypeError: type object got multiple values for keyword argument 'b'

on python 3.6. but on python 3.5 i get the exception:

KeyError: 0

moreover if i do not raise KeyError but just print out the key and return in __getitem__:

def __getitem__(self, key):
    if key not in Container.KEYS:
        # raise KeyError(key)
        print(key)
        return
    return getattr(self, key)

this will print out the int sequence 0, 1, 2, 3, 4, .... (python 3.5)

so my questions are:

  • what has changed between the releases that makes this behave so differently?

  • where are these integers coming from?


UPDATE : as mentioned in the comment by λuser: implementing __iter__ will change the behavior on python 3.5 to match what python 3.6 does:

def __iter__(self):
    return iter(Container.KEYS)
hiro protagonist asked May 17 '18 06:05



1 Answer

This is actually a complicated interaction between several internal operations that run while unpacking a custom mapping object and building the arguments for the call. If you want to understand the underlying reasons thoroughly, I'd suggest looking into the CPython source code. Here are some hints and starting points for digging deeper.

Internally, when you unpack at the call site, the bytecode BUILD_MAP_UNPACK_WITH_CALL(count) pops count mappings from the stack, merges them into a single dictionary and pushes the result. The stack effect of this opcode, for an argument oparg, is defined as follows:

case BUILD_MAP_UNPACK_WITH_CALL:
    return 1 - oparg;

With that being said, let's look at the bytecode of an example (in Python 3.5) to see this in action:

>>> import dis
>>> def bar(data0): foo(**data0, b=4)
...
>>> dis.dis(bar)
  1           0 LOAD_GLOBAL              0 (foo)
              3 LOAD_FAST                0 (data0)
              6 LOAD_CONST               1 ('b')
              9 LOAD_CONST               2 (4)
             12 BUILD_MAP                1
             15 BUILD_MAP_UNPACK_WITH_CALL   258
             18 CALL_FUNCTION_KW         0 (0 positional, 0 keyword pair)
             21 POP_TOP
             22 LOAD_CONST               0 (None)
             25 RETURN_VALUE
>>> 

As you can see, at offset 15 we have BUILD_MAP_UNPACK_WITH_CALL byte code which is responsible for the unpacking.

So why is 0 passed as the key argument to the __getitem__ method?

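The integers are not stack values. Container defines __getitem__ but not __iter__, so any attempt to iterate over the object itself falls back to Python's legacy sequence protocol: the interpreter calls __getitem__(0), __getitem__(1), __getitem__(2), ... until an IndexError is raised. In Python 3.5, the duplicate-keyword check that runs as part of this opcode appears to iterate over the mapping object directly instead of going through its keys(). The very first call is therefore __getitem__(0), which raises KeyError: 0. If you print the key and return instead of raising, no IndexError ever ends the loop, which is why you see the endless sequence 0, 1, 2, 3, 4, .... This also fits the update in the question: once __iter__ is defined, iterating yields 'a', 'b', 'c' and 3.5 behaves like 3.6.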
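To see where such a run of integers can come from without looking at any bytecode, here is a minimal sketch of the legacy iteration protocol. The class name Legacy and its requested list are made up for this illustration; the only point is that an object with __getitem__ and no __iter__ gets asked for 0, 1, 2, ... until it raises IndexError.

```python
class Legacy:
    """Hypothetical class: defines __getitem__ but no __iter__."""

    def __init__(self):
        self.requested = []  # every key __getitem__ was asked for

    def __getitem__(self, key):
        self.requested.append(key)
        if key >= 3:
            raise IndexError(key)  # IndexError is what ends the legacy loop
        return key * 10


legacy = Legacy()
values = list(legacy)        # no __iter__, so Python tries 0, 1, 2, ...
print(values)                # [0, 10, 20]
print(legacy.requested)      # [0, 1, 2, 3]
```

A __getitem__ that raises KeyError for an unknown key, like the one in the question, fails on the very first request (key 0), and one that never raises IndexError keeps the loop going forever.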

Now if you do the same disassembly in Python-3.6 you'll get the following result:

>>> dis.dis(bar)
  1           0 LOAD_GLOBAL              0 (foo)
              2 BUILD_TUPLE              0
              4 LOAD_FAST                0 (data0)
              6 LOAD_CONST               1 ('b')
              8 LOAD_CONST               2 (4)
             10 BUILD_MAP                1
             12 BUILD_MAP_UNPACK_WITH_CALL     2
             14 CALL_FUNCTION_EX         1
             16 POP_TOP
             18 LOAD_CONST               0 (None)
             20 RETURN_VALUE
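The two listings above are specific to 3.5 and 3.6, and the opcodes used for calls have kept changing in later releases. A small sketch for checking what your own interpreter emits, assuming only the standard dis module (bar and foo are the same placeholder names as above, and foo is never actually called):

```python
import dis


def bar(data0):
    # foo is never resolved here; the function is only disassembled, not called
    foo(**data0, b=4)  # noqa: F821


# Opcode names emitted by whichever interpreter runs this snippet
opnames = [instr.opname for instr in dis.get_instructions(bar)]
print(opnames)
```

The printed names will differ from version to version, so compare them against the dis documentation for the release you are running.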

Between LOAD_GLOBAL and LOAD_FAST there is now a BUILD_TUPLE 0. With a count of 0 it consumes nothing from the stack and simply pushes an empty tuple, which serves as the positional arguments for CALL_FUNCTION_EX.

BUILD_TUPLE(count)

Creates a tuple consuming count items from the stack, and pushes the resulting tuple onto the stack.
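If you want to check the "consumes count items, pushes one" description numerically, dis.stack_effect reports the net change in stack depth for an opcode and its argument. This is just a quick sketch using BUILD_TUPLE, which is still available in current CPython releases:

```python
import dis

build_tuple = dis.opmap["BUILD_TUPLE"]

# count=3: three items are popped and one tuple is pushed
effect_three = dis.stack_effect(build_tuple, 3)
# count=0: nothing is popped, an empty tuple is pushed
effect_zero = dis.stack_effect(build_tuple, 0)

print(effect_three)  # -2
print(effect_zero)   # 1
```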

And this is, IMO, why you get a TypeError instead of a KeyError on 3.6: the keyword mappings are merged using the names returned by keys(), so __getitem__ is only ever asked for 'a', 'b' and 'c'. While merging, the interpreter encounters the duplicate name 'b' and therefore, properly, raises the TypeError.
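For completeness, here is a trimmed-down sketch of the Container from the question with __iter__ added, as in the question's update. It shows the TypeError for the duplicate keyword and one possible way around it, namely merging into a plain dict first. The names error and merged are only used for this illustration.

```python
class Container:
    KEYS = ('a', 'b', 'c')

    def __init__(self, a=None, b=None, c=None):
        self.a, self.b, self.c = a, b, c

    def keys(self):
        return Container.KEYS

    def __iter__(self):
        # Iterate over the keys, as suggested in the question's update
        return iter(Container.KEYS)

    def __getitem__(self, key):
        if key not in Container.KEYS:
            raise KeyError(key)
        return getattr(self, key)


data0 = Container(a=1, b=2, c=3)

try:
    Container(**data0, b=7)
    error = None
except TypeError as exc:
    error = exc  # 'b' is supplied twice in the same call

# In a dict display a later key simply overrides an earlier one
merged = {**data0, 'b': 7}
data3 = Container(**merged)
print(merged)
```

Whether silently overriding b is what you want depends on the use case; the TypeError exists precisely to flag the ambiguity in a direct call.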

Mazdak answered Sep 20 '22 10:09
