Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why attribute lookup in Python is designed this way (precedence chain)?

I've just ran into the descriptors in Python, and I got the ideas about the descriptor protocol on "__get__, __set__, __delete__", and it really did a great job on wrapping methods.

However, in the protocol, there're other rules:

Data and non-data descriptors differ in how overrides are calculated with respect to entries in an instance’s dictionary. If an instance’s dictionary has an entry with the same name as a data descriptor, the data descriptor takes precedence. If an instance’s dictionary has an entry with the same name as a non-data descriptor, the dictionary entry takes precedence.

I don't get the point, isn't it ok just to look up in the classic way(instance dictionary -> class dictionary -> base class dictionary)?
And if implement this way, data descriptors can be hold by instances, and the descriptor itself do not have to hold a weakrefdict to hold values for different instances of the owner.
Why put descriptors into the lookup chain? And why the data descriptor is placed in the very beginning?

like image 276
hisen Avatar asked Jan 27 '16 07:01

hisen


1 Answers

Let's see on an example:

class GetSetDesc(object):
    def __init__(self, value):
        self.value=value

    def __get__(self, obj, objtype):
        print("get_set_desc: Get")
        return self.value

    def __set__(self, obj, value):
        print("get_set_desc: Set")
        self.value=value

class SetDesc(object):
    def __init__(self, value):
        self.value=value

    def __set__(self, obj, value):
        print("set_desc: Set")
        self.value=value

class GetDesc(object):
    def __init__(self, value):
        self.value=value

    def __get__(self, obj, objtype):
        print("get_desc: Get")
        return self.value

class Test1(object):
    attr=10
    get_set_attr=10
    get_set_attr=GetSetDesc(5)
    set_attr=10
    set_attr=SetDesc(5)
    get_attr=10
    get_attr=GetDesc(5)

class Test2(object):
    def __init__(self):
        self.attr=10
        self.get_set_attr=10
        self.get_set_attr=GetSetDesc(5)
        self.set_attr=10
        self.set_attr=SetDesc(5)
        self.get_attr=10
        self.get_attr=GetDesc(5)

class Test3(Test1):
    def __init__(self):
        #changing values to see differce with superclass
        self.attr=100
        self.get_set_attr=100
        self.get_set_attr=GetSetDesc(50)
        self.set_attr=100
        self.set_attr=SetDesc(50)
        self.get_attr=100
        self.get_attr=GetDesc(50)

class Test4(Test1):
    pass


print("++Test 1 Start++")
t=Test1()

print("t.attr:", t.attr)
print("t.get_set_desc:", t.get_set_attr)
print("t.set_attr:", t.set_attr)
print("t.get_attr:", t.get_attr)

print("Class dict attr:", t.__class__.__dict__['attr'])
print("Class dict get_set_attr:", t.__class__.__dict__['get_set_attr'])
print("Class dict set_attr:", t.__class__.__dict__['set_attr'])
print("Class dict get_attr:", t.__class__.__dict__['get_attr'])

#These will obviously fail as instance dict is empty here
#print("Instance dict attr:", t.__dict__['attr'])
#print("Instance dict get_set_attr:", t.__dict__['get_set_attr'])
#print("Instance dict set_attr:", t.__dict__['set_attr'])
#print("Instance dict get_attr:", t.__dict__['get_attr'])

t.attr=20
t.get_set_attr=20
t.set_attr=20
t.get_attr=20

print("t.attr:", t.attr)
print("t.get_set_desc:", t.get_set_attr)
print("t.set_attr:", t.set_attr)
print("t.get_attr:", t.get_attr)

print("Class dict attr:", t.__class__.__dict__['attr'])
print("Class dict get_set_attr:", t.__class__.__dict__['get_set_attr'])
print("Class dict set_attr:", t.__class__.__dict__['set_attr'])
print("Class dict get_attr:", t.__class__.__dict__['get_attr'])

print("Instance dict attr:", t.__dict__['attr'])
#Next two will fail,
#because the descriptor for those variables has __set__
#on the class itself which was called with value 20,
#so the instance is not affected
#print("Instance dict get_set_attr:", t.__dict__['get_set_attr'])
#print("Instance dict set_attr:", t.__dict__['set_attr'])
print("Instance dict get_attr:", t.__dict__['get_attr'])

print("++Test 1 End++")


print("++Test 2 Start++")
t2=Test2()

print("t.attr:", t2.attr)
print("t.get_set_desc:", t2.get_set_attr)
print("t.set_attr:", t2.set_attr)
print("t.get_attr:", t2.get_attr)

#In this test the class is not affected, so these will fail
#print("Class dict attr:", t2.__class__.__dict__['attr'])
#print("Class dict get_set_attr:", t2.__class__.__dict__['get_set_attr'])
#print("Class dict set_attr:", t2.__class__.__dict__['set_attr'])
#print("Class dict get_attr:", t2.__class__.__dict__['get_attr'])

print("Instance dict attr:", t2.__dict__['attr'])
print("Instance dict get_set_attr:", t2.__dict__['get_set_attr'])
print("Instance dict set_attr:", t2.__dict__['set_attr'])
print("Instance dict get_attr:", t2.__dict__['get_attr'])

t2.attr=20
t2.get_set_attr=20
t2.set_attr=20
t2.get_attr=20

print("t.attr:", t2.attr)
print("t.get_set_desc:", t2.get_set_attr)
print("t.set_attr:", t2.set_attr)
print("t.get_attr:", t2.get_attr)

#In this test the class is not affected, so these will fail
#print("Class dict attr:", t2.__class__.__dict__['attr'])
#print("Class dict get_set_attr:", t2.__class__.__dict__['get_set_attr'])
#print("Class dict set_attr:", t2.__class__.__dict__['set_attr'])
#print("Class dict get_attr:", t2.__class__.__dict__['get_attr'])

print("Instance dict attr:", t2.__dict__['attr'])
print("Instance dict get_set_attr:", t2.__dict__['get_set_attr'])
print("Instance dict set_attr:", t2.__dict__['set_attr'])
print("Instance dict get_attr:", t2.__dict__['get_attr'])

print("++Test 2 End++")


print("++Test 3 Start++")
t3=Test3()

print("t.attr:", t3.attr)
print("t.get_set_desc:", t3.get_set_attr)
print("t.set_attr:", t3.set_attr)
print("t.get_attr:", t3.get_attr)

#These fail, because nothing is defined on Test3 class itself, but let's see its super below
#print("Class dict attr:", t3.__class__.__dict__['attr'])
#print("Class dict get_set_attr:", t3.__class__.__dict__['get_set_attr'])
#print("Class dict set_attr:", t3.__class__.__dict__['set_attr'])
#print("Class dict get_attr:", t3.__class__.__dict__['get_attr'])

#Checking superclass
print("Superclass dict attr:", t3.__class__.__bases__[0].__dict__['attr'])
print("Superclass dict get_set_attr:", t3.__class__.__bases__[0].__dict__['get_set_attr'])
print("Superclass dict set_attr:", t3.__class__.__bases__[0].__dict__['set_attr'])
print("Superclass dict get_attr:", t3.__class__.__bases__[0].__dict__['get_attr'])

print("Instance dict attr:", t3.__dict__['attr'])
#Next two with __set__ inside descriptor fail, because
#when the instance was created, the value inside the descriptor in superclass
#was redefined via __set__
#print("Instance dict get_set_attr:", t3.__dict__['get_set_attr'])
#print("Instance dict set_attr:", t3.__dict__['set_attr'])
print("Instance dict get_attr:", t3.__dict__['get_attr'])
#The one above does not fail, because it doesn't have __set__ in
#descriptor in superclass and therefore was redefined on instance

t3.attr=200
t3.get_set_attr=200
t3.set_attr=200
t3.get_attr=200

print("t.attr:", t3.attr)
print("t.get_set_desc:", t3.get_set_attr)
print("t.set_attr:", t3.set_attr)
print("t.get_attr:", t3.get_attr)

#print("Class dict attr:", t3.__class__.__dict__['attr'])
#print("Class dict get_set_attr:", t3.__class__.__dict__['get_set_attr'])
#print("Class dict set_attr:", t3.__class__.__dict__['set_attr'])
#print("Class dict get_attr:", t3.__class__.__dict__['get_attr'])

#Checking superclass
print("Superclass dict attr:", t3.__class__.__bases__[0].__dict__['attr'])
print("Superclass dict get_set_attr:", t3.__class__.__bases__[0].__dict__['get_set_attr'])
print("Superclass dict set_attr:", t3.__class__.__bases__[0].__dict__['set_attr'])
print("Superclass dict get_attr:", t3.__class__.__bases__[0].__dict__['get_attr'])

print("Instance dict attr:", t3.__dict__['attr'])
#Next two fail, they are in superclass, not in instance
#print("Instance dict get_set_attr:", t3.__dict__['get_set_attr'])
#print("Instance dict set_attr:", t3.__dict__['set_attr'])
print("Instance dict get_attr:", t3.__dict__['get_attr'])
#The one above succeds as it was redefined as stated in prior check

print("++Test 3 End++")


print("++Test 4 Start++")
t4=Test4()

print("t.attr:", t4.attr)
print("t.get_set_desc:", t4.get_set_attr)
print("t.set_attr:", t4.set_attr)
print("t.get_attr:", t4.get_attr)

#These again fail, as everything defined in superclass, not the class itself
#print("Class dict attr:", t4.__class__.__dict__['attr'])
#print("Class dict get_set_attr:", t4.__class__.__dict__['get_set_attr'])
#print("Class dict set_attr:", t4.__class__.__dict__['set_attr'])
#print("Class dict get_attr:", t4.__class__.__dict__['get_attr'])

#Checking superclass
print("Superclass dict attr:", t4.__class__.__bases__[0].__dict__['attr'])
print("Superclass dict get_set_attr:", t4.__class__.__bases__[0].__dict__['get_set_attr'])
print("Superclass dict set_attr:", t4.__class__.__bases__[0].__dict__['set_attr'])
print("Superclass dict get_attr:", t4.__class__.__bases__[0].__dict__['get_attr'])

#Again, everything is on superclass, not the instance
#print("Instance dict attr:", t4.__dict__['attr'])
#print("Instance dict get_set_attr:", t4.__dict__['get_set_attr'])
#print("Instance dict set_attr:", t4.__dict__['set_attr'])
#print("Instance dict get_attr:", t4.__dict__['get_attr'])

t4.attr=200
t4.get_set_attr=200
t4.set_attr=200
t4.get_attr=200

print("t.attr:", t4.attr)
print("t.get_set_desc:", t4.get_set_attr)
print("t.set_attr:", t4.set_attr)
print("t.get_attr:", t4.get_attr)

#Class is not affected by those assignments, next four fail
#print("Class dict attr:", t4.__class__.__dict__['attr'])
#print("Class dict get_set_attr:", t4.__class__.__dict__['get_set_attr'])
#print("Class dict set_attr:", t4.__class__.__dict__['set_attr'])
#print("Class dict get_attr:", t4.__class__.__dict__['get_attr'])

#Checking superclass
print("Superclass dict attr:", t4.__class__.__bases__[0].__dict__['attr'])
print("Superclass dict get_set_attr:", t4.__class__.__bases__[0].__dict__['get_set_attr'])
print("Superclass dict set_attr:", t4.__class__.__bases__[0].__dict__['set_attr'])
print("Superclass dict get_attr:", t4.__class__.__bases__[0].__dict__['get_attr'])

#Now, this one we redefined it succeeds
print("Instance dict attr:", t4.__dict__['attr'])
#This one fails it's still on superclass
#print("Instance dict get_set_attr:", t4.__dict__['get_set_attr'])
#Same here - fails, it's on superclass, because it has __set__
#print("Instance dict set_attr:", t4.__dict__['set_attr'])
#This one succeeds, no __set__ to call, so it was redefined on instance
print("Instance dict get_attr:", t4.__dict__['get_attr'])

print("++Test 4 End++")

The output:

++Test 1 Start++
t.attr: 10
get_set_desc: Get
t.get_set_desc: 5
t.set_attr: <__main__.SetDesc object at 0x02896ED0>
get_desc: Get
t.get_attr: 5
Class dict attr: 10
Class dict get_set_attr: <__main__.GetSetDesc object at 0x02896EB0>
Class dict set_attr: <__main__.SetDesc object at 0x02896ED0>
Class dict get_attr: <__main__.GetDesc object at 0x02896EF0>
get_set_desc: Set
set_desc: Set
t.attr: 20
get_set_desc: Get
t.get_set_desc: 20
t.set_attr: <__main__.SetDesc object at 0x02896ED0>
t.get_attr: 20
Class dict attr: 10
Class dict get_set_attr: <__main__.GetSetDesc object at 0x02896EB0>
Class dict set_attr: <__main__.SetDesc object at 0x02896ED0>
Class dict get_attr: <__main__.GetDesc object at 0x02896EF0>
Instance dict attr: 20
Instance dict get_attr: 20
++Test 1 End++
++Test 2 Start++
t.attr: 10
t.get_set_desc: <__main__.GetSetDesc object at 0x028A0350>
t.set_attr: <__main__.SetDesc object at 0x028A0370>
t.get_attr: <__main__.GetDesc object at 0x028A0330>
Instance dict attr: 10
Instance dict get_set_attr: <__main__.GetSetDesc object at 0x028A0350>
Instance dict set_attr: <__main__.SetDesc object at 0x028A0370>
Instance dict get_attr: <__main__.GetDesc object at 0x028A0330>
t.attr: 20
t.get_set_desc: 20
t.set_attr: 20
t.get_attr: 20
Instance dict attr: 20
Instance dict get_set_attr: 20
Instance dict set_attr: 20
Instance dict get_attr: 20
++Test 2 End++
++Test 3 Start++
get_set_desc: Set
get_set_desc: Set
set_desc: Set
set_desc: Set
t.attr: 100
get_set_desc: Get
t.get_set_desc: <__main__.GetSetDesc object at 0x02896FF0>
t.set_attr: <__main__.SetDesc object at 0x02896ED0>
t.get_attr: <__main__.GetDesc object at 0x028A03F0>
Superclass dict attr: 10
Superclass dict get_set_attr: <__main__.GetSetDesc object at 0x02896EB0>
Superclass dict set_attr: <__main__.SetDesc object at 0x02896ED0>
Superclass dict get_attr: <__main__.GetDesc object at 0x02896EF0>
Instance dict attr: 100
Instance dict get_attr: <__main__.GetDesc object at 0x028A03F0>
get_set_desc: Set
set_desc: Set
t.attr: 200
get_set_desc: Get
t.get_set_desc: 200
t.set_attr: <__main__.SetDesc object at 0x02896ED0>
t.get_attr: 200
Superclass dict attr: 10
Superclass dict get_set_attr: <__main__.GetSetDesc object at 0x02896EB0>
Superclass dict set_attr: <__main__.SetDesc object at 0x02896ED0>
Superclass dict get_attr: <__main__.GetDesc object at 0x02896EF0>
Instance dict attr: 200
Instance dict get_attr: 200
++Test 3 End++
++Test 4 Start++
t.attr: 10
get_set_desc: Get
t.get_set_desc: 200
t.set_attr: <__main__.SetDesc object at 0x02896ED0>
get_desc: Get
t.get_attr: 5
Superclass dict attr: 10
Superclass dict get_set_attr: <__main__.GetSetDesc object at 0x02896EB0>
Superclass dict set_attr: <__main__.SetDesc object at 0x02896ED0>
Superclass dict get_attr: <__main__.GetDesc object at 0x02896EF0>
get_set_desc: Set
set_desc: Set
t.attr: 200
get_set_desc: Get
t.get_set_desc: 200
t.set_attr: <__main__.SetDesc object at 0x02896ED0>
t.get_attr: 200
Superclass dict attr: 10
Superclass dict get_set_attr: <__main__.GetSetDesc object at 0x02896EB0>
Superclass dict set_attr: <__main__.SetDesc object at 0x02896ED0>
Superclass dict get_attr: <__main__.GetDesc object at 0x02896EF0>
Instance dict attr: 200
Instance dict get_attr: 200
++Test 4 End++

Try it yourself to get a taste on descriptors. But bottomline, what we see here is...

First, definition from official docs to refresh the memory:

If an object defines both __get__() and __set__(), it is considered a data descriptor. Descriptors that only define __get__() are called non-data descriptors (they are typically used for methods but other uses are possible).

From the output and failed snippets...

It's clear that before the name referencing the descriptor(any type) is reassigned, the descriptor is looked up as usual following MRO from the class level to superclasses to the place where it was defined. (See Test 2, where it's defined in the instance and doesn't get called, but gets redefined with simple value.)

Now when the name is reassigned, things start to be interesting:

If it's a data descriptor (has __set__), then really no magic happens and the value assigned to the variable referencing the descriptor is passes to descriptors's __set__ and is used inside this method (regarding the code above it's assigned to self.value). The descriptor is first looked up in hierarchy ofc. Btw, the descriptor without __get__ is returned itself, not the value used with its __set__ method.

If it's a non-data descriptor (has only __get__), then it's looked up, but having no __set__ method it's "droped", and the variable referencing this descriptor gets reassigned at lowest possible level (instance or subclass, depending on where we define it).

So descriptors are used to control, modify, the data assigned to variables, which are made descriptors. So this makes sence, if a descriptor is a data descriptor, which defines __set__, it probably wants to parse the data you pass and hence gets called prior to instance dictionary key assignment. That's why it's put in hierarchy at first place. On the other hand, if it's a non-data descriptor with only __get__, it probably doesn't care for setting the data, and even more - it can't do anything to the data beign set, so it falls off from the chain on assignment and the data gets assigned to instance dictionary key.

Also, new style classes are all about MRO (Method Resolution Order), so it affects every feature - descriptor, properties (which are in fact descriptors too), special methods, etc. Descriptors are basicly methods, that get called on assignment or attribute read, so it makes sence, that they are looked up at class level as any other method is expected to.

If you need to control assignment, but refuse any change to variable use a data descriptor, but raise and exception in its __set__ method.

like image 196
Nikita Avatar answered Nov 15 '22 23:11

Nikita