
Why does this code print a randomly selected attribute?

Today while writing some especially terrible code, I stumbled across this mysterious behavior. The Python 3 program below prints a randomly selected attribute of object. How does this happen?

An obvious suspect for the nondeterminism is the random ordering of the vars(object) dictionary, but I can't see how that causes the observed behavior. One hypothesis I had was that it was caused by the ordering of __setattr__ being overridden, but this is disproved by the fact that the lambda is always called only once (checked by print debugging).

class TypeUnion: 
    pass

class t: 
    pass

def super_serious(obj):
    proxy = t()
    for name, val in vars(object).items():
        if not callable(val) or type(val) is type: 
            continue
        try: 
            setattr(t, name, lambda _, *x, **y: val)
        except AttributeError: 
            pass
    return proxy

print(super_serious(TypeUnion()).x)

N.B. The above program is not attempting to do anything useful; it is heavily reduced from the original.

feersum asked May 21 '17 01:05


3 Answers

Andrei Cioara's answer is largely correct:

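  1. The randomness comes from Python 3.3 and later randomizing string hashes by default, which makes dictionary iteration order vary from run to run (see Why is dictionary ordering non-deterministic?). Note that from CPython 3.6 onward dictionaries preserve insertion order, so this variation is specific to Python 3.3 through 3.5.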

  2. Accessing x calls the lambda function that has been bound to __getattribute__.

See Difference between __getattr__ vs __getattribute__ and the Python3 datamodel reference notes for object.__getattribute__.
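
As a rough illustration of point 2 (not part of the original program; the class name Plain and the helper intercept are made up for this sketch), assigning any function to __getattribute__ on a class makes every dotted lookup on its instances go through that function, even for names that do not exist:

```python
class Plain:
    pass

def intercept(self, name):
    # Hypothetical replacement hook: ignores normal lookup entirely
    # and just reports which attribute name was requested.
    return "intercepted " + name

# Assigning to the class attribute replaces lookup for all instances.
Plain.__getattribute__ = intercept

p = Plain()
print(p.x)         # intercepted x
print(p.__dict__)  # intercepted __dict__
```

In the question's program the installed function ignores the name as well, which is why .x "works" even though nothing defines x.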

We can make this whole thing far less obfuscated with:

class t(object):
    def __getattribute__(self, name):
        use = None
        for val in vars(object).values():
            if callable(val) and type(val) is not type:
                use = val
        return use

def super_serious(obj):
    proxy = t()
    return proxy

which is sort of what happens with the lambda. Note that in the loop, we don't bind / save the current value of val.[1] This means that we get the last value that val has in the function. With the original code, we do all this work when super_serious builds the proxy, rather than later when t.__getattribute__ gets called, but it still boils down to: of the <name, value> pairs in vars(object), find the last one that meets our criteria: the value must be callable, while the value's type is not itself type. (Strictly speaking, the original lambda sees whatever val holds when the loop finishes, even if that final value was skipped by the filter.)

Using class t(object) makes t a new-style class object even in Python2, so that this code now "works" in Python2 as well as Python3. Of course, in Py2k, dictionary ordering is not randomized, so we always get the same thing every time:

$ python2 foo3.py
<slot wrapper '__init__' of 'object' objects>
$ python2 foo3.py
<slot wrapper '__init__' of 'object' objects>

vs:

$ python3 foo3.py
<slot wrapper '__eq__' of 'object' objects>
$ python3 foo3.py
<slot wrapper '__lt__' of 'object' objects>

Setting the environment variable PYTHONHASHSEED to 0 makes the order deterministic in Python3 as well:

$ PYTHONHASHSEED=0 python3 foo3.py
<method '__subclasshook__' of 'object' objects>
$ PYTHONHASHSEED=0 python3 foo3.py
<method '__subclasshook__' of 'object' objects>
$ PYTHONHASHSEED=0 python3 foo3.py
<method '__subclasshook__' of 'object' objects>
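
To see the effect of PYTHONHASHSEED without the rest of the program, here is a small sketch that asks a child interpreter for the hash of a string under a given seed. The helper name child_hash is an assumption of this example, and it needs Python 3.7+ for the capture_output argument of subprocess.run:

```python
import os
import subprocess
import sys

def child_hash(seed):
    """Return hash('__eq__') computed by a fresh interpreter run with PYTHONHASHSEED=seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('__eq__'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout)

# With a fixed seed, two separate processes agree on the hash.
print(child_hash(0) == child_hash(0))  # True
# Different seeds will normally give different hashes.
print(child_hash(1), child_hash(2))
```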

[1] To see what this is about, try the following:

def f():
    i = 0
    ret = lambda: i
    for i in range(3):
        pass
    return ret
func = f()
print('func() returns', func())

Note that it says func() returns 2, not func() returns 0. Then replace the lambda line with:

    ret = lambda stashed=i: stashed

and run it again. Now the function returns 0. This is because we saved the current value of i here.

If we did this same sort of thing to the sample program, each lambda would return the val that was current when that lambda was created, rather than the last one.
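
The same effect in a loop, which is closer to what the sample program does. This is only a sketch; the names make_late and make_stashed are invented for the example:

```python
def make_late():
    funcs = []
    for val in ("a", "b", "c"):
        # Each lambda looks up val when it is called, not when it is created.
        funcs.append(lambda: val)
    return funcs

def make_stashed():
    funcs = []
    for val in ("a", "b", "c"):
        # The default argument is evaluated now, saving the current val.
        funcs.append(lambda stashed=val: stashed)
    return funcs

print([f() for f in make_late()])     # ['c', 'c', 'c']
print([f() for f in make_stashed()])  # ['a', 'b', 'c']
```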

torek answered Nov 12 '22 07:11


Non-determinism comes from the randomness in the __dict__ returned by vars(object).

The print is a bit suspicious, since your TypeUnion does not have an 'x':

super_serious(TypeUnion()).x 

Something is returned because your for loop overwrites __getattribute__ and hence hijacks the dot. Adding this check inside the loop would show that:

    if name == '__getattribute__':
        continue
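
For reference, a sketch of the question's program with that check added (the function is renamed super_serious_skip here only to mark it as a variant). With __getattribute__ left alone, the dot falls back to normal lookup, so .x should fail with AttributeError instead of returning something:

```python
class TypeUnion:
    pass

class t:
    pass

def super_serious_skip(obj):
    proxy = t()
    for name, val in vars(object).items():
        if name == '__getattribute__':
            continue  # leave normal attribute lookup intact
        if not callable(val) or type(val) is type:
            continue
        try:
            setattr(t, name, lambda _, *x, **y: val)
        except AttributeError:
            pass
    return proxy

try:
    super_serious_skip(TypeUnion()).x
    outcome = "returned a value"
except AttributeError:
    outcome = "AttributeError"
print(outcome)
```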

Once the get is compromised, the set is dead as well. Think of it like this:

setattr(t, name, lambda *x, **y: val)

is conceptually the same as

t.__dict__[name] = lambda *x, **y: val

But the get now always returns the same reference, irrespective of the value of name, which is then overwritten. Therefore the final answer will be the last item in this iteration, which is random, since the for loop goes through the initial __dict__ in random order.
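
One caveat on "conceptually": a class's __dict__ is exposed as a read-only mappingproxy, so the second form cannot be run literally, and setattr is the way to actually perform the assignment. A small sketch (the class name Demo is made up here):

```python
class Demo:
    pass

print(type(vars(Demo)).__name__)  # mappingproxy

try:
    vars(Demo)["greet"] = lambda self: "hi"
    direct_assignment = "worked"
except TypeError:
    # mappingproxy does not support item assignment
    direct_assignment = "TypeError"
print(direct_assignment)  # TypeError

# setattr goes through type.__setattr__, which updates the class namespace.
setattr(Demo, "greet", lambda self: "hi")
print("greet" in vars(Demo), Demo().greet())  # True hi
```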

Also, bear in mind that if your aim is to make a copy of the object, then the setattr is wrong. Calling the lambda would just return the original function, not call it. You would need something along the lines of:

setattr(t, name, lambda *x, **y: val(*x, **y))  # Which doesn't work
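
One way such forwarding could be made to work is to create each wrapper in a helper function, so that it captures its own target. This is only a sketch with invented names (make_forwarder, Shout) and plain str methods as targets, not a fix for the original program:

```python
def make_forwarder(func):
    # func is a parameter, so each wrapper keeps its own target
    # instead of sharing one loop variable.
    def forward(*args, **kwargs):
        return func(*args, **kwargs)
    return forward

class Shout:
    pass

for name, func in {"upper": str.upper, "title": str.title}.items():
    # staticmethod keeps the wrapper from receiving an implicit self
    setattr(Shout, name, staticmethod(make_forwarder(func)))

print(Shout.upper("abc"))  # ABC
print(Shout.title("abc"))  # Abc
```
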
Andrei Cioara answered Nov 12 '22 07:11


Yes, torek is correct in that your code doesn't bind the current value of val, so you get the last value assigned to val. Here is a version that "correctly" binds the value with a closure:

class TypeUnion: 
    pass

class t: 
    pass

def super_serious(obj):
    proxy = t()
    for name, val in vars(object).items():
        if not callable(val) or type(val) is type: 
            continue
        try: 
            setattr(t, name, (lambda v: lambda _, *x, **y: v)(val))
        except AttributeError: 
            pass
    return proxy

print(super_serious(TypeUnion()).x)

This will consistently output <slot wrapper '__getattribute__' of 'object' objects>, proving that the problem is that __getattribute__ is hijacked.

Imperishable Night answered Nov 12 '22 09:11