The join()
function accepts an iterable as its parameter. However, I was wondering why, given:
text = 'asdfqwer'
This:
''.join([c for c in text])
Is significantly faster than:
''.join(c for c in text)
The same occurs with long strings (e.g. text * 10000000).
Watching the memory footprint of both executions with long strings, I think they both create one (and only one) list of chars in memory and then join it into a string. So I am guessing the difference lies only in how join()
builds this list from the generator versus how the Python interpreter builds it when it evaluates [c for c in text]
. But, again, I am just guessing, so I would like somebody to confirm or deny my guesses.
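For reference, the gap can be reproduced with timeit; a minimal sketch (the string length and repetition count here are arbitrary choices, and actual timings vary by machine):

```python
import timeit

text = 'asdfqwer' * 100000  # a long string, as described above

t_list = timeit.timeit(lambda: ''.join([c for c in text]), number=20)
t_gen = timeit.timeit(lambda: ''.join(c for c in text), number=20)

print(f'list comprehension:   {t_list:.3f}s')
print(f'generator expression: {t_gen:.3f}s')
```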
The join
method reads its input twice: once to determine how much memory to allocate for the resulting string object, then again to perform the actual join. Passing a list is faster than passing a generator object, which join must first copy into a sequence so that it can iterate over it twice.
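A small sketch illustrating that join materializes the generator first: a generator can be iterated only once, yet join still manages its two passes, and afterwards the generator is left exhausted.

```python
# Generators can be iterated only once, yet join() works with one:
# it first copies the generator's items into a sequence, then makes
# its two passes over that sequence.
gen = (c for c in 'asdfqwer')
print(''.join(gen))  # asdfqwer

# The generator is now exhausted, so a second join yields the empty string.
print(repr(''.join(gen)))  # ''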
A list comprehension is not simply a generator object wrapped in a list, so constructing the list externally is faster than having join
create it from a generator object. Generator objects are optimized for memory efficiency, not speed.
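A rough sketch of that point: building the list with a comprehension is faster than wrapping a generator expression in list(), even though both produce the same list (timings are illustrative and machine-dependent):

```python
import timeit

text = 'asdfqwer' * 100000

t_comp = timeit.timeit(lambda: [c for c in text], number=20)
t_wrap = timeit.timeit(lambda: list(c for c in text), number=20)

print(f'list comprehension: {t_comp:.3f}s')
print(f'list(generator):    {t_wrap:.3f}s')
```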
Of course, a string is already an iterable object, so you could just write ''.join(text)
. (Though, again, this is not as fast as creating the list explicitly from the string.)