So it's a CPython thing; I'm not sure whether other implementations behave the same way. But '{0}'.format() is faster than str() and '{}'.format(). I'm posting results from Python 3.5.2, but I tried it with Python 2.7.12 and the trend is the same.
%timeit q=['{0}'.format(i) for i in range(100, 100000, 100)]
%timeit q=[str(i) for i in range(100, 100000, 100)]
%timeit q=['{}'.format(i) for i in range(100, 100000, 100)]
1000 loops, best of 3: 231 µs per loop
1000 loops, best of 3: 298 µs per loop
1000 loops, best of 3: 434 µs per loop
From the docs on object.__str__(self):
Called by str(object) and the built-in functions format() and print() to compute the “informal” or nicely printable string representation of an object.
So, str() and format() call the same object.__str__(self) method, but where does that difference in speed come from?
UPDATE
As @StefanPochmann and @Leon noted in the comments, they get different results. I tried running it with python -m timeit "..." and they are right; the results are:
$ python3 -m timeit "['{0}'.format(i) for i in range(100, 100000, 100)]"
1000 loops, best of 3: 441 usec per loop
$ python3 -m timeit "[str(i) for i in range(100, 100000, 100)]"
1000 loops, best of 3: 297 usec per loop
$ python3 -m timeit "['{}'.format(i) for i in range(100, 100000, 100)]"
1000 loops, best of 3: 420 usec per loop
So it seems that IPython is doing something strange...
NEW QUESTION: What is the preferred way, by speed, to convert an object to str?
The IPython timing is just off for some reason (though, when tested with a longer format string in different cells, it behaved slightly better). Maybe executing everything in the same cell isn't right; I don't really know.
Either way, "{}" is a bit faster than "{pos}", which is faster than "{name}", while they're all slower than str.
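To check that ordering outside IPython, here is a minimal sketch using the standard timeit module. The names SETUP, VARIANTS and best_times are made up for illustration, and the numbers it prints will vary by machine and Python version, so treat them as a rough comparison only.

```python
import timeit

# Same workload as in the question: the integers 100, 200, ..., 99900.
SETUP = "values = range(100, 100000, 100)"

# Hypothetical labels mapped to the statements being compared.
VARIANTS = {
    "str(i)": "[str(i) for i in values]",
    "'{}'.format(i)": "['{}'.format(i) for i in values]",
    "'{0}'.format(i)": "['{0}'.format(i) for i in values]",
    "'{n}'.format(n=i)": "['{n}'.format(n=i) for i in values]",
}


def best_times(number=100, repeat=3):
    """Return {label: best seconds per loop} for every variant."""
    results = {}
    for label, stmt in VARIANTS.items():
        # Each statement is timed separately, so no variant shares state with another.
        runs = timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat)
        results[label] = min(runs) / number
    return results


if __name__ == "__main__":
    for label, seconds in best_times().items():
        print("{:<20} {:8.1f} usec per loop".format(label, seconds * 1e6))
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes; it does not remove it.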
str(val) is the fastest way to transform an object to str; it directly calls the object's __str__, if one exists, and returns the resulting string. Others, like format (or str.format), include additional overhead due to an extra function call (to format itself), handling any arguments, parsing the format string, and only then reaching the __str__ of their args (by way of __format__, which typically falls back to __str__ when the format spec is empty).
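The extra hop can be made visible with a small probe class. This is only a sketch: Probe and trace are hypothetical helpers, and the class defers to object.__format__, which returns str(self) when the format spec is empty.

```python
class Probe:
    """Hypothetical helper that records which special methods get called."""

    def __init__(self):
        self.calls = []

    def __str__(self):
        self.calls.append("__str__")
        return "probe"

    def __format__(self, spec):
        self.calls.append("__format__")
        # object.__format__ falls back to str(self) for an empty spec.
        return super().__format__(spec)


def trace(convert):
    """Apply convert to a fresh Probe and return the recorded calls."""
    probe = Probe()
    convert(probe)
    return probe.calls


print(trace(str))                         # ['__str__']
print(trace(format))                      # ['__format__', '__str__']
print(trace(lambda p: "{}".format(p)))    # ['__format__', '__str__']
```

So str() goes straight to __str__, while both format() and str.format take a detour through __format__ first.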
For the str.format method, "{}" uses automatic numbering; from a small section in the docs on the format syntax:
Changed in version 3.1: The positional argument specifiers can be omitted, so '{} {}' is equivalent to '{0} {1}'.
That is, if you supply a string of the form:
"{}{}{}".format(1, 2, 3)
CPython immediately knows that this is equivalent to:
"{0}{1}{2}".format(1, 2, 3)
With a format string that contains numbers indicating positions, CPython can't assume a strictly increasing number (that starts from 0) and must parse every single bracket in order to get it right, slowing things down a bit in the process:
"{1}{2}{0}".format(1, 2, 3)
That's also why mixing the two styles is not allowed:
"{1}{}{2}".format(1, 2, 3)
you'll get a nice ValueError back when you attempt to do so:
ValueError: cannot switch from manual field specification to automatic field numbering
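The numbering rules above can be checked directly. A short sketch follows; try_format is a hypothetical helper that just turns the exception into a string so both outcomes can be printed side by side.

```python
args = (1, 2, 3)

# Automatic and explicit in-order numbering give the same result.
auto = "{}{}{}".format(*args)
manual = "{0}{1}{2}".format(*args)
print(auto, manual)        # 123 123

# Explicit numbering may reorder the arguments.
reordered = "{1}{2}{0}".format(*args)
print(reordered)           # 231


def try_format(template, *values):
    """Return the formatted string, or the ValueError text if formatting fails."""
    try:
        return template.format(*values)
    except ValueError as exc:
        return "ValueError: {}".format(exc)


# Mixing the two styles fails in either direction.
print(try_format("{1}{}{2}", *args))
print(try_format("{}{1}{2}", *args))
```

The wording of the message depends on which style appears first in the template.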
CPython also grabs these positionals with PySequence_GetItem, which I'm pretty sure is fast, at least in comparison to PyObject_GetItem [see next].
For "{name}" values, CPython always has extra work to do due to the fact that we're dealing with keyword arguments rather than positional ones; this includes things like building the dictionary for the calls and generating way more LOAD byte-code instructions for loading keys and values. The keyword form of function calling always introduces some overhead. In addition, it seems that the grabbing actually uses PyObject_GetItem, which incurs some extra overhead due to its generic nature.
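One way to look at the byte-code side of this is the dis module. The sketch below lists the opcode names for a positional and a keyword call; opnames is a hypothetical helper, and the exact opcodes (and how many of them there are) differ between CPython versions, so no particular output is assumed here.

```python
import dis


def opnames(expression):
    """Compile a single expression and return the names of its byte-code instructions."""
    code = compile(expression, "<sketch>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


positional = opnames("'{0}'.format(i)")
keyword = opnames("'{name}'.format(name=i)")

# Print both listings side by side to compare them on your interpreter.
print("positional:", positional)
print("keyword:   ", keyword)

# Both forms produce the same string; only the call machinery differs.
i = 42
print("{0}".format(i) == "{name}".format(name=i))   # True
```

This only shows the instructions emitted for the call itself; the PySequence_GetItem versus PyObject_GetItem difference lives in the C implementation and is not visible from dis.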