I am wondering why repr(int)
is faster than str(int)
. With the following code snippet:
ROUNDS = 10000
def concat_strings_str():
return ''.join(map(str, range(ROUNDS)))
def concat_strings_repr():
return ''.join(map(repr, range(ROUNDS)))
%timeit concat_strings_str()
%timeit concat_strings_repr()
I get these timings (python 3.5.2, but very similar results with 2.7.12):
1.9 ms ± 17.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1.38 ms ± 9.07 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
If I'm on the right path, the same function long_to_decimal_string
is getting called below the hood.
Did I get something wrong or what else is going on that I am missing?
update:
This probably has nothing to with int
's __repr__
or __str__
methods but with the differences between repr()
and str()
, as int.__str__
and int.__repr__
are in fact comparably fast:
def concat_strings_str():
return ''.join([one.__str__() for one in range(ROUNDS)])
def concat_strings_repr():
return ''.join([one.__repr__() for one in range(ROUNDS)])
%timeit concat_strings_str()
%timeit concat_strings_repr()
results in:
2.02 ms ± 24.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.05 ms ± 7.07 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
repr() compute the “official” string representation of an object (a representation that has all information about the object) and str() is used to compute the “informal” string representation of an object (a representation that is useful for printing the object).
Both str() and repr() return a “textual representation” of a Python object. The difference is: str() gives a user-friendly representation. repr() gives a developer-friendly representation.
Summary. Both __str__ and __repr__ functions return string representation of the object. The __str__ string representation is supposed to be human-friendly and mostly used for logging purposes, whereas __repr__ representation is supposed to contain information about object so that it can be constructed again.
According to the official documentation, __repr__ is used to compute the “official” string representation of an object and is typically used for debugging.
Because using str(obj)
must first go through type.__call__
then str.__new__
(create a new string) then PyObject_Str
(make a string out of the object) which invokes int.__str__
and, finally, uses the function you linked.
repr(obj)
, which corresponds to builtin_repr
, directly calls PyObject_Repr
(get the object repr) which then calls int.__repr__
which uses the same function as int.__str__
.
Additionally, the path they take through call_function
(the function that handles the CALL_FUNCTION
opcode that's generated for calls) is slightly different.
From the master branch on GitHub (CPython 3.7):
str
goes through _PyObject_FastCallKeywords
(which is the one that calls type.__call__
). Apart from performing more checks, this also needs to create a tuple to hold the positional arguments (see _PyStack_AsTuple
). repr
goes through _PyCFunction_FastCallKeywords
which calls _PyMethodDef_RawFastCallKeywords
. repr
is also lucky because, since it only accepts a single argument (the switch leads it to the METH_0
case in _PyMethodDef_RawFastCallKeywords
) there's no need to create a tuple, just indexing of the args. As your update states, this isn't about int.__repr__
vs int.__str__
, they are the same function after all; it's all about how repr
and str
reach them. str
just needs to work a bit harder.
I just compared the str
and repr
implementations in the 3.5 branch.
See here.
There seems to be more checks in str
:
There are several possibilities because the CPython functions that are responsible for the str
and repr
return are slightly different.
But I guess the primary reason is that str
is a type
(a class) and the str.__new__
method has to call __str__
while repr
can directly go to __repr__
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With