Building a string through repeated string concatenation is an anti-pattern, but I'm still curious why its performance switches from linear to quadratic after string length exceeds approximately 10 ** 6:
# this will take time linear in n with the optimization
# and quadratic time without the optimization
import time
start = time.perf_counter()
s = ''
for i in range(n):
s += 'a'
total_time = time.perf_counter() - start
time_per_iteration = total_time / n
For example, on my machine (Windows 10, python 3.6.1):
10 ** 4 < n < 10 ** 6
, the time_per_iteration
is almost perfectly constant at 170±10 µs10 ** 6 < n
, the time_per_iteration
is almost perfectly linear, reaching 520 µs at n == 10 ** 7
. Linear growth in time_per_iteration
is equivalent to quadratic growth in total_time
.
The linear complexity results from the optimization in the more recent CPython versions (2.4+) that reuse the original storage if no references remain to the original object. But I expected the linear performance to continue indefinitely rather than switch to quadratic at some point.
My question is based made on this comment. For some odd reason running
python -m timeit -s"s=''" "for i in range(10**7):s+='a'"
takes incredibly long time (much longer than quadratic), so I never got the actual timing results from timeit
. So instead, I used a simple loop as above to obtain performance numbers.
Update:
My question might as well have been titled "How can a list-like append
have O(1)
performance without over-allocation?". From observing constant time_per_iteration
on small-size strings, I assumed the string optimization must be over-allocating. But realloc
is (unexpectedly to me) quite successful at avoiding memory copy when extending small memory blocks.
If you concatenate Stings in loops for each iteration a new intermediate object is created in the String constant pool. This is not recommended as it causes memory issues.
For example, + is the integer addition operator, but can also perform string concatenation. + is the addition operator when both of the passed arguments are integers.
The simplest way of concatenating two or more strings in the Go language is by using + operator . It is also known as a concatenation operator. str1 = "Welcome!"
In the end, the platform C allocators (like malloc()
) are the ultimate source of memory. When CPython tries to reallocate string space to extend its size, it's really the system C realloc()
that determines the details of what happens. If the string is "short" to begin with, chances are decent the system allocator finds unused memory adjacent to it, so extending the size is just a matter of the C library's allocator updating some pointers. But after repeating this some number of times (depending again on details of the platform C allocator), it will run out of space. At that point, realloc()
will need to copy the entire string so far to a brand new larger block of free memory. That's the source of quadratic-time behavior.
Note, e.g., that growing a Python list faces the same tradeoffs. However, lists are designed to be grown, so CPython deliberately asks for more memory than is actually needed at the time. The amount of this overallocation scales up as the list grows, enough to make it rare that realloc()
needs to copy the whole list-so-far. But the string optimizations do not overallocate, making cases where realloc()
needs to copy far more frequent.
[XXXXXXXXXXXXXXXXXX............]
\________________/\__________/
used space reserved
space
When growing a contiguous array data structure (illustrated above) through appending to it, linear performance can be achieved if the extra space reserved while reallocating the array is proportional to the current size of the array. Obviously, for large strings this strategy is not followed, most probably with the purpose of not wasting too much memory. Instead a fixed amount of extra space is reserved during each reallocation, resulting in quadratic time complexity. To understand where the quadratic performance comes from in the latter case, imagine that no overallocation is performed at all (which is the boundary case of that strategy). Then at each iteration a reallocation (requiring linear time) must be performed, and the full runtime is quadratic.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With