 

What is the design reason for the fact that if __new__ does not return an instance of cls, Python does not invoke __init__?

Tags:

python

There are many questions on SO asking why Python doesn't always call __init__ after object creation. The answer, of course, is found in this excerpt from the documentation:

If __new__() returns an instance of cls, then the new instance’s __init__() method will be invoked like __init__(self[, ...]), where self is the new instance and the remaining arguments are the same as were passed to __new__().

If __new__() does not return an instance of cls, then the new instance’s __init__() method will not be invoked.
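To make the two documented cases concrete, here is a minimal sketch. The class names Same, Different and Other and the calls list are made up for illustration; only the __new__/__init__ behavior itself comes from the documentation quoted above.

```python
calls = []  # records which __init__ methods actually ran


class Other(object):
    def __init__(self):
        calls.append("Other.__init__")


class Same(object):
    def __new__(cls):
        # Returns an instance of cls, so Same.__init__ runs afterwards.
        return super(Same, cls).__new__(cls)

    def __init__(self):
        calls.append("Same.__init__")


class Different(object):
    def __new__(cls):
        # Returns an instance of an unrelated class, so Different.__init__
        # is skipped. Other.__init__ still runs as part of Other().
        return Other()

    def __init__(self):
        calls.append("Different.__init__")


a = Same()
b = Different()
print(type(a).__name__, type(b).__name__, calls)
```

Note that "Different.__init__" never shows up in calls, while the Other instance was fully initialized by its own constructor call.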

What is the design reason for this?

DanielSank asked Jan 18 '15 00:01


2 Answers

__init__ does act like a constructor. It needs an instance to do its job, such as setting attributes. If __new__ doesn't explicitly return an instance, then None is returned by default.

Imagine what would happen if __init__ received None as its input and tried to set attributes. It would raise an exception such as "AttributeError: 'NoneType' object has no attribute 'xxxxx'".

So I think it's natural not to invoke __init__ when __new__ returns None.
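A small sketch of that case, assuming a hypothetical class Forgetful whose __new__ forgets its return statement. The init_ran list is only there to show whether __init__ was invoked.

```python
init_ran = []  # stays empty if __init__ is never invoked


class Forgetful(object):
    def __new__(cls):
        # The instance is created but not returned, so __new__ returns None.
        super(Forgetful, cls).__new__(cls)

    def __init__(self):
        init_ran.append(True)


result = Forgetful()
print(result, init_ran)
```

Here Forgetful() simply evaluates to None and no AttributeError is raised, because __init__ is never given the chance to run.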

Stephen Lin answered Oct 21 '22 13:10


In Python 2, you can't actually call a regular method with the first argument being anything other than an instance of the class (or a subclass):

class Foo(object):
    def __init__(self):
        pass

Foo.__init__()
# TypeError: unbound method __init__() must be called with Foo instance as first argument (got nothing instead)

Foo.__init__(3)
# TypeError: unbound method __init__() must be called with Foo instance as first argument (got int instance instead)

So __init__ isn't called because it cannot possibly do anything other than immediately raise an exception. Not trying to call it is strictly more useful (though I don't think I've ever seen code take advantage of this).

Python 3 has a slightly simpler method implementation, and this restriction is no longer in place, but the __new__ semantics are the same. It doesn't make a lot of sense to try to run a class's initializer on a foreign object, anyway.
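For comparison, a rough sketch of the same experiment under Python 3 (this block assumes a Python 3 interpreter; Setter is a hypothetical second class added for illustration). Foo.__init__ is a plain function there, so the call goes through, but an initializer that actually does something still fails on a foreign object.

```python
class Foo(object):
    def __init__(self):
        pass


class Setter(object):
    def __init__(self):
        self.x = 1


# No instance check is made in Python 3, so this just runs the function body.
returned = Foo.__init__(3)

# An initializer that sets attributes cannot work on an int, though.
try:
    Setter.__init__(3)
    error = None
except AttributeError as exc:
    error = exc

print(returned, type(error).__name__)
```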


For a more designy answer, rather than a "because it's this way" answer:

Overriding __new__ is already a weird thing to do. By default, it returns an uninitialized object, which is a concept that Python tries very hard to hide. If you override it, you're probably doing something like this:

class Bar(object):
    pass

class Foo(object):
    def __new__(cls, some_arg):
        if some_arg == 15:
            # 15 is a magic number for some reason!
            return Bar()
        else:
            # object.__new__ rejects extra arguments here, so pass only cls
            return super(Foo, cls).__new__(cls)

Let's imagine a Python variant that unconditionally called __init__ on the return value. I immediately see a number of problems.

When you return Bar(), should Python call Bar.__init__ (which has already been called in this case) or Foo.__init__ (which is for a completely different type and would break whatever guarantees Bar makes)?

The answer surely has to be that Bar.__init__ is called. Does that mean that you have to return an uninitialized Bar instance, using the mouthful return Bar.__new__(Bar) instead? Python very rarely requires you to call dunder methods outside of using super, so this would be highly unusual.

Where would Bar.__init__'s arguments come from? Both Foo.__new__ and Foo.__init__ are passed the same arguments — those passed to type.__call__, which is what handles Foo(...). But if you explicitly call Bar.__new__, there's nowhere to remember the arguments you wanted to pass to Bar.__init__. You can't store them on the new Bar object, because that's what Bar.__init__ is supposed to do! And if you just said it gets the same arguments that were passed to Foo, you severely limit what types can be returned from __new__.

Or, what if you wanted to return an object that already exists? Python has no way to indicate that an object is "already" initialized — since uninitialized objects are a transient and mostly-internal thing only of interest to __new__ — so you'd have no way to say not to call __init__ again.
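The last point can already be seen with today's semantics whenever the existing object happens to be an instance of cls. A minimal sketch, assuming a hypothetical Cached class that keeps one instance per key in a class-level dict:

```python
class Cached(object):
    _instances = {}  # hypothetical cache: key -> existing instance

    def __new__(cls, key):
        # Reuse the existing object for this key if there is one.
        if key not in cls._instances:
            cls._instances[key] = super(Cached, cls).__new__(cls)
        return cls._instances[key]

    def __init__(self, key):
        self.key = key
        # Count how many times this object has been (re)initialized.
        self.init_count = getattr(self, "init_count", 0) + 1


first = Cached("a")
second = Cached("a")
print(first is second, second.init_count)
```

Both calls return the same object, and __init__ runs on it twice, since Python cannot tell that it was already initialized. Code that caches instances this way usually has to guard against re-initialization itself.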

The current approach is a little clumsy, but I don't think there's any better alternative. __new__ is supposed to create storage space for the new object, and returning a different type of object altogether is just a really weird thing to do; this is the least-surprising and most-useful way Python can handle it.

If this limitation is getting in your way, remember that the entire __new__ and __init__ dance is just the behavior of type.__call__. You're perfectly free to define your own __call__ behavior on a metaclass, or just swap your class out for a factory function.
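As a sketch of the metaclass route (Python 3 syntax; InitOnceMeta, Registry and the _ready flag are invented names, and this is one possible policy rather than a recommended pattern), a custom __call__ can reproduce the usual dance and add its own rule, here "never initialize the same object twice":

```python
class InitOnceMeta(type):
    """Hypothetical metaclass that initializes each object at most once."""

    def __call__(cls, *args, **kwargs):
        obj = cls.__new__(cls, *args, **kwargs)
        # Same isinstance check the default behavior performs, plus an
        # extra flag so already-initialized objects are left alone.
        if isinstance(obj, cls) and not getattr(obj, "_ready", False):
            type(obj).__init__(obj, *args, **kwargs)
            obj._ready = True
        return obj


class Registry(metaclass=InitOnceMeta):
    _by_name = {}  # hypothetical cache: name -> existing instance

    def __new__(cls, name):
        if name not in cls._by_name:
            cls._by_name[name] = super(Registry, cls).__new__(cls)
        return cls._by_name[name]

    def __init__(self, name):
        self.name = name
        self.init_count = getattr(self, "init_count", 0) + 1


r1 = Registry("x")
r2 = Registry("x")
print(r1 is r2, r2.init_count)
```

With this metaclass the second Registry("x") call returns the cached object without running __init__ again. A plain factory function that decides which class to instantiate is often the simpler choice when no such policy is needed.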

Eevee answered Oct 21 '22 15:10