
SqlAlchemy metaclass confusion

I'm trying to inject some of my own code in the class construction process of SqlAlchemy. Trying to understand the code, I'm somewhat confused by the implementation of the metaclass. Here are the relevant snippets:

The default "metaclass" of SqlAlchemy:

class DeclarativeMeta(type):
    def __init__(cls, classname, bases, dict_):
        if '_decl_class_registry' in cls.__dict__:
            return type.__init__(cls, classname, bases, dict_)
        else:
            _as_declarative(cls, classname, cls.__dict__)
        return type.__init__(cls, classname, bases, dict_)

    def __setattr__(cls, key, value):
        _add_attribute(cls, key, value)

declarative_base is implemented like this:

def declarative_base(bind=None, metadata=None, mapper=None, cls=object,
                     name='Base', constructor=_declarative_constructor,
                     class_registry=None,
                     metaclass=DeclarativeMeta):
    # some code which should not matter here
    return metaclass(name, bases, class_dict)

It's used like this:

Base = declarative_base()

class SomeModel(Base):
    pass

Now I have derived my own metaclass like this:

class MyDeclarativeMeta(DeclarativeMeta):
    def __init__(cls, classname, bases, dict_):
        result = DeclarativeMeta.__init__(cls, classname, bases, dict_)
        print result
        # here I would add my custom code, which does not work
        return result

And use it like this:

Base = declarative_base(metaclass=MyDeclarativeMeta)

Ok, now to my problem:

  • print result in my own class always prints None.
  • The code seems to work anyway!?
  • Why is the metaclass using __init__ and not __new__?
  • declarative_base returns an instance of this class. Shouldn't it return a class having an attribute __metaclass__ having MyDeclarativeMeta as value?

So I wonder why the code works at all. As the SqlAlchemy people obviously know what they are doing, I assume that I'm on the completely wrong track. Could somebody explain what's going on here?

Achim asked Sep 30 '12 19:09


3 Answers

First things first. __init__ is required to return None. The Python docs say "no value may be returned", but in Python "dropping off the end" of a function without hitting a return statement is equivalent to return None. So explicitly returning None (either as a literal or by returning the value of an expression resulting in None) does no harm either.
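A quick way to see this rule in action is to return something other than None from __init__. This is only a sketch; the class names ReturnsNone and ReturnsValue are made up for illustration:

```python
class ReturnsNone(object):
    def __init__(self):
        # Explicitly returning None is the same as falling off the end.
        return None


class ReturnsValue(object):
    def __init__(self):
        # Anything other than None is rejected when the class is called.
        return 42


ReturnsNone()  # fine, no error

raised = False
try:
    ReturnsValue()
except TypeError as exc:
    raised = True
    print(exc)
```

The second call fails with a TypeError, which is why returning the (None) result of a superclass __init__ is harmless but also pointless.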

So the __init__ method of DeclarativeMeta that you quote looks a little odd to me, but it doesn't do anything wrong. Here it is again with some comments added by me:

def __init__(cls, classname, bases, dict_):
    if '_decl_class_registry' in cls.__dict__:
        # return whatever type's (our superclass) __init__ returns
        # __init__ must return None, so this returns None, which is okay
        return type.__init__(cls, classname, bases, dict_)
    else:
        # call _as_declarative without caring about the return value
        _as_declarative(cls, classname, cls.__dict__)
    # then return whatever type's __init__ returns
    return type.__init__(cls, classname, bases, dict_)

This could more succinctly and cleanly be written as:

def __init__(cls, classname, bases, dict_):
    if '_decl_class_registry' not in cls.__dict__:
        _as_declarative(cls, classname, cls.__dict__)
    type.__init__(cls, classname, bases, dict_)

I have no idea why the SqlAlchemy developers felt the need to return whatever type.__init__ returns (which is constrained to be None). Perhaps it's proofing against a future when __init__ might return something. Perhaps it's just for consistency with other methods where the core implementation is by deferring to the superclass; usually you'd return whatever the superclass call returns unless you want to post-process it. However it certainly doesn't actually do anything.

So your print result printing None is just showing that everything is working as intended.
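The same behaviour can be reproduced without SQLAlchemy. In this sketch BaseMeta is a hypothetical stand-in for DeclarativeMeta and MyMeta plays the role of your subclass; the attribute names are invented for the example:

```python
class BaseMeta(type):
    # Hypothetical stand-in for DeclarativeMeta; no SQLAlchemy involved.
    def __init__(cls, classname, bases, dict_):
        return type.__init__(cls, classname, bases, dict_)


class MyMeta(BaseMeta):
    def __init__(cls, classname, bases, dict_):
        result = BaseMeta.__init__(cls, classname, bases, dict_)
        # Custom code runs normally here; cls is already a complete class.
        cls.init_result = result
        cls.tagged_by = 'MyMeta'


# Calling the metaclass creates a class, just as declarative_base does.
Base = MyMeta('Base', (object,), {})

print(Base.init_result)  # None
print(Base.tagged_by)    # MyMeta
```

The result of the superclass __init__ is None, yet the custom code after it still runs and can modify the class.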


Next up, let's take a closer look at what metaclasses actually mean. A metaclass is just the class of a class. Like any class, you create instances of a metaclass (i.e. classes) by calling the metaclass. The class block syntax isn't really what creates classes, it's just very convenient syntactic sugar for defining a dictionary and then passing it to a metaclass invocation to create a class object.

The __metaclass__ attribute isn't what does the magic, it's really just a giant hack to communicate the information "I would like this class block to create an instance of this metaclass instead of an instance of type" through a back-channel, because there's no proper channel for communicating that information to the interpreter.1

This will probably be clearer with an example. Take the following class block:

class MyClass(Look, Ma, Multiple, Inheritance):
    __metaclass__ = MyMeta

    CLASS_CONST = 'some value'

    def __init__(self, x):
        self.x = x

    def some_method(self):
        return self.x - 76

This is roughly syntactic sugar for doing the following2:

dict_ = {}

dict_['__metaclass__'] = MyMeta
dict_['CLASS_CONST'] = 'some value'

def __init__(self, x):
    self.x = x
dict_['__init__'] = __init__

def some_method(self):
    return self.x - 76
dict_['some_method'] = some_method

metaclass = dict_.get('__metaclass__', type)
bases = (Look, Ma, Multiple, Inheritance)
classname = 'MyClass'

MyClass = metaclass(classname, bases, dict_)

So a "class having an attribute __metaclass__ having [the metaclass] as value" IS an instance of the metaclass! They are exactly the same thing. The only difference is that if you create the class directly (by calling the metaclass) rather than with a class block and the __metaclass__ attribute, then it doesn't necessarily have __metaclass__ as an attribute.3
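To make that concrete, here is a small sketch (MyMeta, MyClass and Child are illustrative names) that builds a class by calling the metaclass directly, the way declarative_base does, and then subclasses it with an ordinary class block:

```python
class MyMeta(type):
    pass


def some_method(self):
    return self.x - 76


# Build the class by calling the metaclass directly, with no class block
# and no __metaclass__ entry in the dictionary.
MyClass = MyMeta('MyClass', (object,), {'some_method': some_method})


# Subclasses pick up the metaclass from their base automatically.
class Child(MyClass):
    pass


print(type(MyClass) is MyMeta)            # True
print(type(Child) is MyMeta)              # True
print(hasattr(MyClass, '__metaclass__'))  # False
```

This is why SomeModel(Base) ends up being handled by your metaclass even though neither Base nor SomeModel mentions __metaclass__ anywhere.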

That invocation of metaclass at the end is exactly like any other class invocation. It will call metaclass.__new__(metaclass, classname, bases, dict_) to create the class object, then call __init__ on the resulting object to initialise it.
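The order of those two calls can be observed with a metaclass that records them. This is a minimal sketch; TracingMeta and the calls list are invented for the example:

```python
calls = []


class TracingMeta(type):
    def __new__(mcs, classname, bases, dict_):
        calls.append('__new__')
        # type.__new__ is what actually builds the class object.
        return type.__new__(mcs, classname, bases, dict_)

    def __init__(cls, classname, bases, dict_):
        calls.append('__init__')
        type.__init__(cls, classname, bases, dict_)


Traced = TracingMeta('Traced', (object,), {})

print(calls)  # ['__new__', '__init__']
```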

The default metaclass, type, only does anything interesting in __new__. And most uses for metaclasses that I've seen in examples are really just a convoluted way of implementing class decorators; they want to do some processing when the class is created, and thereafter not care. So they use __new__ because it allows them to execute both before and after type.__new__. The net result is that everyone thinks that __new__ is what you implement in metaclasses.

But you can in fact have an __init__ method; it will be invoked on the new class object after it has been created. If you need to add some attributes to the class, or record the class object in a registry somewhere, this is actually a slightly more convenient place to do it (and the logically correct place) than __new__.
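As a rough illustration of the registry case (this is not SQLAlchemy's actual code; RegisteringMeta and registry are hypothetical names), a metaclass __init__ might look like this:

```python
registry = {}


class RegisteringMeta(type):
    def __init__(cls, classname, bases, dict_):
        type.__init__(cls, classname, bases, dict_)
        # cls is the finished class here, so it can be stored or modified.
        registry[classname] = cls
        cls.registered = True


Base = RegisteringMeta('Base', (object,), {})


class Model(Base):
    pass


print(sorted(registry))  # ['Base', 'Model']
```

Because __init__ receives the finished class as cls, there is no need to call type.__new__ yourself and remember to return its result, which is the extra bookkeeping __new__ would require.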


1 In Python 3 this is addressed by adding metaclass as a "keyword argument" in the base-class list, rather than as an attribute of the class.

2 In reality it's slightly more complicated due to the need for metaclass compatibility between the class being constructed and all the bases, but this is the core idea.

3 Note that even a class with a metaclass (other than type) created the usual way doesn't necessarily have to have __metaclass__ as an attribute; the correct way to check the class of a class is the same way as checking the class of anything else: use cls.__class__, or apply type(cls).

Ben answered Nov 01 '22 18:11


The __init__ in SQLAlchemy's version is wrong, basically. It probably got written like that three years ago by cutting and pasting a metaclass from somewhere, or perhaps it started out as a different method that became __init__ later, and has just not been changed. I just checked 0.5, when it was first written, and it looks mostly the same, with the unnecessary "return" statement. Fixing it now, sorry it confused you.

zzzeek answered Nov 01 '22 20:11


  • print result in my own class always prints None.

This is because __init__ doesn't return anything :)

  • Why is the metaclass using __init__ and not __new__?

I think it is because SQLAlchemy needs to store a reference to cls in the declarative class registry. In __new__, the class doesn't exist yet (see https://stackoverflow.com/a/1840466).

When I subclassed DeclarativeMeta, I actually did everything in __init__ following SQLAlchemy's code. In retrospect after reading your question, my code should use __new__ instead.

  • declarative_base returns an instance of this class. Shouldn't it return a class having an attribute __metaclass__ having MyDeclarativeMeta as value?

I think Ben explained this very well. Anyway, if you want (not recommended), you can skip calling declarative_base() and create your own Base class, e.g.

# Almost the same as:
#   Base = declarative_base(cls=Entity, name='Base', metaclass=MyDeclarativeMeta)
# minus the _declarative_constructor.
class Base(Entity):
    __metaclass__ = MyDeclarativeMeta

    _decl_class_registry = dict()
    metadata = MetaData()

In this case, the __metaclass__ attribute will be there. I actually created my Base class like this to help PyCharm provide auto-completion for things defined in Entity.

sayap answered Nov 01 '22 20:11