I am wondering why repr(int) is faster than str(int). With the following code snippet:
ROUNDS = 10000

def concat_strings_str():
    return ''.join(map(str, range(ROUNDS)))

def concat_strings_repr():
    return ''.join(map(repr, range(ROUNDS)))

%timeit concat_strings_str()
%timeit concat_strings_repr()
I get these timings (Python 3.5.2, but very similar results with 2.7.12):
1.9 ms ± 17.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1.38 ms ± 9.07 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
If I'm on the right track, the same function, long_to_decimal_string, is called under the hood in both cases.
Did I get something wrong or what else is going on that I am missing?
update:
This probably has nothing to do with int's __repr__ or __str__ methods, but rather with the differences between repr() and str(), as int.__str__ and int.__repr__ are in fact comparably fast:

def concat_strings_str():
    return ''.join([one.__str__() for one in range(ROUNDS)])

def concat_strings_repr():
    return ''.join([one.__repr__() for one in range(ROUNDS)])

%timeit concat_strings_str()
%timeit concat_strings_repr()
results in:
2.02 ms ± 24.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.05 ms ± 7.07 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
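For anyone who wants to reproduce the comparison outside IPython (where %timeit is not available), here is a minimal sketch using the standard timeit module. The function names concat_strings_dunder_str, concat_strings_dunder_repr and the helper best_ms are made up for this example, and the absolute numbers will depend on the machine and Python version, so no expected output is shown.

```python
import timeit

ROUNDS = 10000

def concat_strings_str():
    return ''.join(map(str, range(ROUNDS)))

def concat_strings_repr():
    return ''.join(map(repr, range(ROUNDS)))

def concat_strings_dunder_str():
    # Calls the method directly, bypassing the str() constructor.
    return ''.join([one.__str__() for one in range(ROUNDS)])

def concat_strings_dunder_repr():
    # Calls the method directly, bypassing the repr() builtin.
    return ''.join([one.__repr__() for one in range(ROUNDS)])

def best_ms(func, repeat=5, number=20):
    """Hypothetical helper: best per-call time in milliseconds."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number * 1000

candidates = (
    concat_strings_str,
    concat_strings_repr,
    concat_strings_dunder_str,
    concat_strings_dunder_repr,
)
timings = {func.__name__: best_ms(func) for func in candidates}

for name, ms in timings.items():
    print("{}: {:.3f} ms".format(name, ms))
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; the size of the gap between the first two variants may differ on newer interpreters.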
repr() computes the “official” string representation of an object (a representation that has all information about the object), while str() computes the “informal” string representation of an object (a representation that is useful for printing the object).
Both str() and repr() return a “textual representation” of a Python object. The difference is: str() gives a user-friendly representation. repr() gives a developer-friendly representation.
Both __str__ and __repr__ return a string representation of the object. The __str__ representation is supposed to be human-friendly and is mostly used for display and logging, whereas the __repr__ representation is supposed to contain enough information about the object that it can be constructed again.
According to the official documentation, __repr__ is used to compute the “official” string representation of an object and is typically used for debugging.
Because str(obj) must first go through type.__call__, then str.__new__ (create a new string), then PyObject_Str (make a string out of the object), which invokes int.__str__ and, finally, uses the function you linked.
repr(obj), which corresponds to builtin_repr, directly calls PyObject_Repr (get the object repr) which then calls int.__repr__ which uses the same function as int.__str__.
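The two entry points can be observed from Python itself. The following is a small sketch, not a view into the C call chain: Traced is a hypothetical class invented for this example that records which special method each call ends up in. It only shows where str() and repr() arrive, and that for ints both produce the same text.

```python
class Traced:
    """Hypothetical helper that records which special method was invoked."""

    def __init__(self):
        self.calls = []

    def __str__(self):
        self.calls.append("__str__")
        return "traced-str"

    def __repr__(self):
        self.calls.append("__repr__")
        return "traced-repr"

obj = Traced()
as_str = str(obj)    # str is a type: constructing it ends up in obj.__str__
as_repr = repr(obj)  # repr is a builtin function: it ends up in obj.__repr__

print(obj.calls)
print(as_str, as_repr)

# For ints, both routes produce identical text; only the route differs.
same_for_ints = all(str(n) == repr(n) for n in range(-1000, 1000))
print(same_for_ints)
```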
Additionally, the path they take through call_function (the function that handles the CALL_FUNCTION opcode that's generated for calls) is slightly different.
From the master branch on GitHub (CPython 3.7):
str goes through _PyObject_FastCallKeywords (which is the one that calls type.__call__). Apart from performing more checks, this also needs to create a tuple to hold the positional arguments (see _PyStack_AsTuple).

repr goes through _PyCFunction_FastCallKeywords, which calls _PyMethodDef_RawFastCallKeywords. repr is also lucky because, since it only accepts a single argument (the switch leads it to the METH_O case in _PyMethodDef_RawFastCallKeywords), there's no need to create a tuple, just indexing of the args.

As your update states, this isn't about int.__repr__ vs int.__str__, which are the same function after all; it's all about how repr and str reach them. str just needs to work a bit harder.
I just compared the str and repr implementations in the 3.5 branch.
See here.
There seem to be more checks in str.
There are several possibilities because the CPython functions that are responsible for the str and repr return are slightly different.
But I guess the primary reason is that str is a type (a class) and the str.__new__ method has to call __str__ while repr can directly go to __repr__.
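The distinction between a type and a plain builtin function is easy to check from the interpreter. This is just an illustrative sketch using the standard types module; it says nothing about timing, only about what kind of object each name refers to.

```python
import types

# str is a class, so calling it means constructing an instance.
str_is_type = isinstance(str, type)

# repr is an ordinary builtin function implemented in C.
repr_is_builtin_function = isinstance(repr, types.BuiltinFunctionType)
repr_is_type = isinstance(repr, type)

# Calling str on an int yields a new object whose exact type is str.
result = str(5)
result_is_exact_str = type(result) is str

print(str_is_type, repr_is_builtin_function, repr_is_type, result_is_exact_str)
```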