Performance difference between string concat by using str += "A" or str = str + "A"

Question

I want to know why str += "A" and str = str + "A" have different performances.

In practice,

string str = "cool"
for(int i = 0; i < 1000000; i++) str += "A" //  -> approximately 15000~20000 ms
for(int i = 0; i < 1000000; i++) str = str + "A"  // -> 10000000 ms

According to what I've looked for, str = str + "A" have to copy 'str' so each iteration concatenation needs O(N) so if you iterate N times, time complexity is O(N^2).

However, str += "A" is only concatenating "A" to 'str' so it doesn't need to do any copy. So if iterate N times, only O(N).

I think it's weird. When I inspected the assembly code, they were the same. I think compiler translates str += "A" to str = str + "A" so the result should be the same.

If you look at the actual results, I'm wrong, but can you tell where I'm wrong?

Here is my ref link.

Here's my code for the timings:

#include<bits/stdc++.h>
#include<chrono>
using namespace std;
using namespace chrono;
int main(void)
{
    ios::sync_with_stdio(0);
    cin.tie(0);     
    string str = "aaaa";
    system_clock::time_point start = system_clock::now();
    
    for(int i = 0; i < 100000; i++)
    {
        str = str +  "aaaa";
    }

    system_clock::time_point end = system_clock::now();
    microseconds microSec = duration_cast<microseconds>(end-start);
    cout << microSec.count() << " ms
";
    
    
    return 0;
}

=========================== my test code. thx enter image description here

enter image description here

Dorin Baba · Accepted Answer

In very short terms, str += "A" will result in less assembly being generated (x86-64 gcc 11.2):

lea     rax, [rbp-96]
mov     esi, OFFSET FLAT:.LC1
mov     rdi, rax
call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator+=(char const*)

While str = str + "A" will result in this:

lea     rax, [rbp-48]
lea     rcx, [rbp-96]
mov     edx, OFFSET FLAT:.LC1
mov     rsi, rcx
mov     rdi, rax
call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*)
lea     rdx, [rbp-48]
lea     rax, [rbp-96]
mov     rsi, rdx
mov     rdi, rax
call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator=(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&)
lea     rax, [rbp-48]
mov     rdi, rax
call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() [complete object destructor]

I'm not an assembly guru, so don't take all I tell you here for granted because I'm not entirely sure that's how it works

The += case (provided by @prapin)

It will just copy bytes into the string buffer if there is enough place left. Otherwise, it will reallocate the buffer with some algorithm (typically doubling it)

But in the second case, there are multiple expensive operations:

a clone of str is created
a new object is created with the value of str and "A" appended,
the reference from the 1st point now is switched to the reference of the object from 2nd point
now we got a reference to an object having the value of str + "A". This now becames initial's str reference
the initial str object ("cool") is destroyed

Let me know in the comments if I'm mistaken

Quimby · Answer

i think compiler translates str += "A" to str = str + "A" so the result have to be same.

No, operator+= is a separate function. ~~It's not like in Python - x += y does not translate to x = x + y~~. That's a good thing because it allows the optimization which makes the performance difference you observed. I doubt the assembly is the same, it's two separate functions with different implementations.

operator+= - No complexity, but reasonably it will be amortized O(1) per the copied character.
operator+ O(n) at best obviously.

In x = x+y the right side is evaluated first and so the compiler must allocate a new string object that is then moved into x and the old string in x is thus wastefully discarded. Under as-if rule the compiler is free to replace x = x+y with x+=y but it's hard to tell when or even if it does that.

Performance difference between string concat by using str += "A" or str = str + "A"

Tags:

c++

GyuMin Han

2 Answers

Dorin Baba

Quimby

Recent Activity

Donate For Us

Performance difference between string concat by using str += "A" or str = str + "A"

Tags:

c++

GyuMin Han

2 Answers

Dorin Baba

Quimby

Related questions

Recent Activity

Donate For Us