In Python, when subclassing tuple, the __new__ function is called with self as an argument. For example, here is a paraphrased version of PySpark's Row class:
class Row(tuple):
    def __new__(self, args):
        return tuple.__new__(self, args)
But help(tuple) shows no self argument to __new__:
__new__(*args, **kwargs) from builtins.type
Create and return a new object. See help(type) for accurate signature.
and help(type) just says the same thing:
__new__(*args, **kwargs)
Create and return a new object. See help(type) for accurate signature.
So how does self get passed to __new__ in the Row class definition?
- Is it via *args?
- Does __new__ have some subtlety where its signature can change with context?
- Or, is the documentation mistaken?
Is it possible to view the source of tuple.__new__ so I can see the answer for myself?
My question is not a duplicate of this one because in that question, all discussion refers to __new__ methods that explicitly have self or cls as first argument. I'm trying to understand why the tuple.__new__ method does not have self or cls as first argument.
Signature of tuple.__new__
Functions and types implemented in C often can't be inspected, and in that case their reported signature looks like that one.
The correct signature of tuple.__new__ is:
__new__(cls[, sequence])
For example:
>>> tuple.__new__(tuple)
()
>>> tuple.__new__(tuple, [1, 2, 3])
(1, 2, 3)
Not surprisingly, this is exactly like calling tuple(), except that you have to write tuple twice.
Behavior of __new__
Note that the first argument of __new__ is always the class, not the instance. In fact, the role of __new__ is to create and return the new instance.
The special method __new__ is a static method.
I'm saying this because I can see self in your Row.__new__. The name of the argument is not important (except when using keyword arguments), but beware that self will be Row or a subclass of Row, not an instance. The general convention is to name the first argument cls instead of self.
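To make the point about the first argument concrete, here is a minimal sketch. NamedRow is a hypothetical subclass introduced only for illustration; it is not part of PySpark.

```python
class Row(tuple):
    def __new__(cls, args):
        # cls is the class being instantiated (Row or a subclass),
        # never an instance: the instance does not exist yet.
        assert isinstance(cls, type) and issubclass(cls, Row)
        return tuple.__new__(cls, args)


class NamedRow(Row):
    """Hypothetical subclass, used only to show what cls is bound to."""


row = Row([1, 2, 3])
named = NamedRow([4, 5])

print(type(row).__name__)    # Row
print(type(named).__name__)  # NamedRow
```

Because cls is forwarded to tuple.__new__, the object that comes back already has the subclass as its type.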
So how does self get passed to __new__ in the Row class definition?
When you call Row(...), Python automatically calls Row.__new__(Row, ...).
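As a rough illustration of that equivalence, the sketch below builds the same value both ways. The variable names via_call and via_new are made up for the example.

```python
class Row(tuple):
    def __new__(cls, args):
        return tuple.__new__(cls, args)


# Normal instantiation: Python supplies the class as the first argument.
via_call = Row([1, 2, 3])

# The same thing spelled out by hand: the class must be passed explicitly.
via_new = Row.__new__(Row, [1, 2, 3])

print(via_call == via_new)            # True
print(type(via_call), type(via_new))  # both are Row
```

The only difference is that Row(...) also runs __init__ on the result, which does nothing here.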
- Is it via *args?
You can write your Row.__new__ as follows:
class Row(tuple):
    def __new__(*args, **kwargs):
        return tuple.__new__(*args, **kwargs)
This works and there's nothing wrong with it. It's very useful if you don't care about the arguments and only want to forward them.
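A small sketch of what the *args form actually receives. The seen list is a hypothetical helper added only to record the first positional argument.

```python
seen = []  # hypothetical helper: records what arrives as args[0]


class Row(tuple):
    def __new__(*args, **kwargs):
        # The class arrives as the first positional argument, so
        # forwarding *args passes it on to tuple.__new__ unchanged.
        seen.append(args[0])
        return tuple.__new__(*args, **kwargs)


row = Row([1, 2, 3])

print(row)             # (1, 2, 3)
print(seen[0] is Row)  # True
```

So the class is not passed "via" some special mechanism of *args; it is simply the first element of args.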
- Does __new__ have some subtlety where its signature can change with context?
No, the only special thing about __new__ is that it is a static method.
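One way to see this for yourself, sketched under the assumption that you are running CPython: a __new__ defined in a class body is stored as a staticmethod even without the decorator, while an ordinary method stays a plain function. The describe method is hypothetical and exists only for comparison.

```python
import types


class Row(tuple):
    def __new__(cls, args):
        return tuple.__new__(cls, args)

    def describe(self):
        # Hypothetical ordinary method, for comparison only.
        return "Row of %d items" % len(self)


new_is_static = isinstance(Row.__dict__["__new__"], staticmethod)
describe_is_function = isinstance(Row.__dict__["describe"], types.FunctionType)

print(new_is_static)         # True
print(describe_is_function)  # True

# Being static, __new__ gets no implicit first argument, even when it is
# looked up on an instance: the class still has to be passed by hand.
row = Row([1, 2])
other = row.__new__(Row, [3, 4])
print(other)  # (3, 4)
```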
- Or, is the documentation mistaken?
I'd say that it is incomplete or ambiguous.
- Why the tuple.__new__ method does not have self or cls as first argument.
It does have one; it just doesn't appear in help(tuple.__new__), because functions and methods implemented in C often don't expose that information.
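A hedged sketch of how to check this from Python. describe_signature is a hypothetical helper, and the exact text that inspect reports may differ between Python versions, so no particular output is assumed for it.

```python
import inspect


def describe_signature(func):
    """Hypothetical helper: return whatever signature introspection reports."""
    try:
        return str(inspect.signature(func))
    except (ValueError, TypeError):
        return "<signature not available>"


# Whatever is printed here, it does not spell out the class argument.
print(describe_signature(tuple.__new__))

# The class argument is nevertheless required: omitting it fails.
missing_cls_rejected = False
try:
    tuple.__new__()
except TypeError as exc:
    missing_cls_rejected = True
    print("TypeError:", exc)

# Passing the class explicitly works.
print(tuple.__new__(tuple, "ab"))  # ('a', 'b')
```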
- How I might go about examining the source code of the tuple class, to see for myself what's really going on.
The file you are looking for is Objects/tupleobject.c. Specifically, you are interested in the tuple_new() function:
static char *kwlist[] = {"sequence", 0};
/* ... */
if (!PyArg_ParseTupleAndKeywords(args, kwds, "|O:tuple", kwlist, &arg))
Here "|O:tuple" means: the function is called "tuple" and it accepts one optional argument (| delimits optional arguments, O stands for a Python object). The optional argument may be set via the keyword argument sequence.
About help(type)
For reference, you were looking at the documentation of type.__new__, while you should have stopped at the first four lines of help(type):
In the case of __new__() the correct signature is the signature of type():
class type(object)
| type(object_or_name, bases, dict)
| type(object) -> the object's type
| type(name, bases, dict) -> a new type
But this is not relevant, as tuple.__new__ has a different signature.
Use super()!
Last but not least, try to use super() instead of calling tuple.__new__() directly.
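A minimal sketch of that suggestion, assuming Python 3, where super() can be called without arguments inside a class body:

```python
class Row(tuple):
    def __new__(cls, args):
        # super() finds the next __new__ in the method resolution order
        # (tuple.__new__ here). Because __new__ is static, cls still has
        # to be passed explicitly.
        return super().__new__(cls, args)


row = Row([1, 2, 3])
print(row)        # (1, 2, 3)
print(type(row))  # <class '__main__.Row'>
```

The benefit is that Row keeps working if its base class changes or if it takes part in multiple inheritance, since the parent is no longer hard-coded.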