I want to know why str += "A"
and str = str + "A"
have different performances.
In practice,
string str = "cool"
for(int i = 0; i < 1000000; i++) str += "A" // -> approximately 15000~20000 ms
for(int i = 0; i < 1000000; i++) str = str + "A" // -> 10000000 ms
According to what I've looked for, str = str + "A"
have to copy 'str' so each iteration concatenation needs O(N)
so if you iterate N times, time complexity is O(N^2)
.
However, str += "A"
is only concatenating "A" to 'str' so it doesn't need to do any copy. So if iterate N times, only O(N)
.
I think it's weird. When I inspected the assembly code, they were the same. I think compiler translates str += "A"
to str = str + "A"
so the result should be the same.
If you look at the actual results, I'm wrong, but can you tell where I'm wrong?
Here is my ref link.
Here's my code for the timings:
#include<bits/stdc++.h>
#include<chrono>
using namespace std;
using namespace chrono;
int main(void)
{
ios::sync_with_stdio(0);
cin.tie(0);
string str = "aaaa";
system_clock::time_point start = system_clock::now();
for(int i = 0; i < 100000; i++)
{
str = str + "aaaa";
}
system_clock::time_point end = system_clock::now();
microseconds microSec = duration_cast<microseconds>(end-start);
cout << microSec.count() << " ms\n";
return 0;
}
=========================== my test code. thx
In very short terms, str += "A"
will result in less assembly being generated (x86-64 gcc 11.2):
lea rax, [rbp-96]
mov esi, OFFSET FLAT:.LC1
mov rdi, rax
call std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator+=(char const*)
While str = str + "A"
will result in this:
lea rax, [rbp-48]
lea rcx, [rbp-96]
mov edx, OFFSET FLAT:.LC1
mov rsi, rcx
mov rdi, rax
call std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*)
lea rdx, [rbp-48]
lea rax, [rbp-96]
mov rsi, rdx
mov rdi, rax
call std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator=(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&)
lea rax, [rbp-48]
mov rdi, rax
call std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() [complete object destructor]
I'm not an assembly guru, so don't take all I tell you here for granted because I'm not entirely sure that's how it works
The +=
case (provided by @prapin)
But in the second case, there are multiple expensive operations:
str
is createdstr
and "A" appended,str + "A"
. This now becames initial's str
referencestr
object ("cool") is destroyedLet me know in the comments if I'm mistaken
i think compiler translates str += "A" to str = str + "A" so the result have to be same.
No, operator+=
is a separate function. It's not like in Python - . That's a good thing because it allows the optimization which makes the performance difference you observed. I doubt the assembly is the same, it's two separate functions with different implementations.x += y
does not translate to x = x + y
operator+=
- No complexity, but reasonably it will be amortized O(1) per the copied character.operator+
O(n) at best obviously.In x = x+y
the right side is evaluated first and so the compiler must allocate a new string object that is then moved into x
and the old string in x
is thus wastefully discarded. Under as-if rule the compiler is free to replace x = x+y
with x+=y
but it's hard to tell when or even if it does that.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With