
Does const ref lvalue to non-const func return value specifically reduce copies?

I have encountered a C++ habit that I have tried to research in order to understand its impact and validate its usage. But I can't seem to find the exact answer.

std::vector< Thing > getThings();

void doWork() {   // "do" is a reserved keyword, so the function needs another name
    const std::vector< Thing > &things = getThings();
}

Here we have a function that returns by value (not by const&). The habit I am seeing is binding the function's return value to a const& instead of storing it in a plain variable. The proposed reasoning for this habit is that it avoids a copy.

Now I have been researching RVO (Return Value Optimization), copy elision, and C++11 move semantics. I realize that a given compiler could choose to prevent a copy via RVO regardless of the use of const& here. But does the usage of a const& lvalue here have any kind of effect on non-const& return values in terms of preventing copies? And I am specifically asking about pre-C++11 compilers, before move semantics.

My assumption is that either the compiler implements RVO or it does not, and that saying the lvalue should be const& doesn't hint or force a copy-free situation.

Edit

I am specifically asking about whether const& usage here reduces a copy, and not about the lifetime of the temporary object, as described in "the most important const"

Further clarification of question

Is this:

const std::vector< Thing > &things = getThings();

any different than this:

std::vector< Thing > things = getThings();

in terms of reducing copies? Or does it not have any influence on whether the compiler can reduce copies, such as via RVO?

jdi asked Oct 19 '22 04:10


2 Answers

Semantically, the compiler needs an accessible copy-constructor, at the call site, even if later on, the compiler elides the call to the copy-constructor — that optimization is done later in the compilation phase after the semantic-analysis phase.

After reading your comments, I think I understand your question better. Now let me answer it in detail.

Imagine that the function has this return statement:

return items;

Semantically speaking, the compiler needs an accessible copy-constructor (or move-constructor) here, and the call to it can be elided. However, just for the sake of argument, assume that it makes a copy here and that the copy is stored in __temp_items, which I express as:

__temp_items <= return items; // first copy

Now at the call site, assume that you have not used const &, so it becomes this:

std::vector<Thing> things = __temp_items;  //second copy

Now as you can see yourself, there are two copies. Compilers are allowed to elide both of them.

However, your actual code uses const &, so it becomes this:

const std::vector<Thing> & things = __temp_items;  //no copy anymore.

Now, semantically there is only one copy, which can still be elided by the compiler. As for the second copy, I won't say const& "prevented" it in the sense that the compiler optimised it away; rather, the language does not call for a copy there to begin with.
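To see this reasoning on a real compiler, a rough sketch like the following counts copy-constructor calls for both forms. Counted, makeCounted and the helper functions are hypothetical names made up for illustration; they are not from the question. The class declares only a copy constructor, so no move constructor gets involved.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for Thing: counts copy-constructor calls.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

// Returns a named local, so the "first copy" (local -> return value) is
// one the compiler may or may not elide.
Counted makeCounted() {
    Counted items;
    return items;
}

// Copies observed when the result initialises a by-value variable.
int copiesByValue() {
    Counted::copies = 0;
    Counted things = makeCounted();
    (void)things;
    return Counted::copies;
}

// Copies observed when the result is bound to a const reference.
int copiesByConstRef() {
    Counted::copies = 0;
    const Counted& things = makeCounted();
    (void)things;
    return Counted::copies;
}

// Prints both counts; the numbers depend on compiler, standard and flags.
void reportCopies() {
    int byValue = copiesByValue();
    int byRef = copiesByConstRef();
    // Binding a reference never adds a copy of its own.
    assert(byRef <= byValue);
    std::printf("by value: %d, by const&: %d\n", byValue, byRef);
}
```

Calling reportCopies() from main with default g++ or clang++ settings would typically print the same count for both forms. Building with -std=c++11 -fno-elide-constructors should make the by-value count one higher, which is the "second copy" described above.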


But interestingly, no matter how many copies the compiler makes while returning, or how many of them it elides (a few or all), the return value is a temporary. If that is so, then how does binding to a temporary work? If that is also your question (I now know it is not, but I will keep this part so that I don't have to erase it from my answer), then yes, it works, and that is guaranteed by the language.

As explained in great detail in the article "the most important const", if a const reference binds to a temporary, then the lifetime of the temporary is extended until the reference goes out of scope, irrespective of the type of the object.
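A minimal sketch of that lifetime rule, assuming a hypothetical Tracer class that keeps a count of live instances (none of these names come from the question):

```cpp
#include <cassert>

// Hypothetical tracer: tracks how many instances are currently alive.
struct Tracer {
    static int alive;
    Tracer() { ++alive; }
    Tracer(const Tracer&) { ++alive; }
    ~Tracer() { --alive; }
};
int Tracer::alive = 0;

Tracer makeTracer() { return Tracer(); }

// Number of live Tracer objects while the reference is still in scope.
int aliveWhileBound() {
    const Tracer& ref = makeTracer();  // temporary's lifetime is extended
    (void)ref;
    return Tracer::alive;              // read before ref leaves scope
}

// Number of live Tracer objects after a statement that discards the
// temporary without binding it to anything.
int aliveAfterDiscard() {
    makeTracer();          // temporary destroyed at the end of this statement
    return Tracer::alive;
}

// Checks the behaviour described above.
void demoLifetime() {
    assert(aliveWhileBound() == 1);   // the bound temporary was still alive
    assert(Tracer::alive == 0);       // and was destroyed with the reference
    assert(aliveAfterDiscard() == 0); // an unbound temporary dies immediately
}
```

The counts hold whether or not the compiler elides copies inside makeTracer, because any intermediate object is destroyed before the reference is used.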

In C++11, there is another way to extend the lifetime of a temporary, which is rvalue-reference:

std::vector<Thing> && things = getThings();    

It has the same effect, but the advantage (or disadvantage — depends on the context) is that you can also modify the content.

I personally prefer to write this as:

auto && things = getThings();   

but then that is not necessarily an rvalue reference: if you change the function to return a reference, then things becomes an lvalue reference instead. If you want to discuss that, then that is a whole different topic.
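The deduction can be checked at compile time. This is only a sketch; getThingsByValue and getThingsByReference are hypothetical variants of getThings introduced for the comparison, and Thing is an empty placeholder.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

struct Thing {};  // placeholder for the question's Thing

// Hypothetical variant that returns by value, like getThings().
std::vector<Thing> getThingsByValue() { return std::vector<Thing>(3); }

// Hypothetical variant that returns a reference to a long-lived vector.
std::vector<Thing>& getThingsByReference() {
    static std::vector<Thing> stored(2);
    return stored;
}

bool deductionHolds() {
    auto&& a = getThingsByValue();
    // Initialised from a temporary: a is an rvalue reference.
    static_assert(std::is_same<decltype(a), std::vector<Thing>&&>::value,
                  "auto&& bound to a temporary is T&&");

    auto&& b = getThingsByReference();
    // Initialised from an lvalue: b collapses to an lvalue reference.
    static_assert(std::is_same<decltype(b), std::vector<Thing>&>::value,
                  "auto&& bound to an lvalue is T&");

    // Unlike a const&, the extended temporary can be modified.
    a.push_back(Thing());
    assert(a.size() == 4);
    return b.size() == 2;
}
```

Both static_asserts pass on a C++11 compiler, so the type of things really does follow the return type of the function.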

Nawaz answered Oct 21 '22 03:10


Hey so your question is:

"When a function returns a class instance by value, and you assign it to a const reference, does that avoid a copy constructor call?"

Ignoring the lifetime of the temporary, as that’s not the question you’re asking, we can get a feel for what happens by looking at the assembly output. I’m using clang, llvm 7.0.2.

Here’s something bog standard. Return by value, nothing fancy.

Test A

class MyClass
{
public:
    MyClass();
    MyClass(const MyClass & source);
    long int m_tmp;
};

MyClass createMyClass();

int main()
{
    const MyClass myClass = createMyClass();
    return 0;
}

If I compile with “-O0 -S -fno-elide-constructors” I get this.

_main:
    pushq   %rbp                    # Boiler plate
    movq    %rsp, %rbp              # Boiler plate
    subq    $32, %rsp               # Reserve 32 bytes for stack frame
    leaq    -24(%rbp), %rdi         # arg0 = &___temp_items = rdi = rbp-24
    movl    $0, -4(%rbp)            # rbp-4 = 0, no idea why this happens
    callq   __Z13createMyClassv     # createMyClass(arg0)
    leaq    -16(%rbp), %rdi         # arg0 = & myClass
    leaq    -24(%rbp), %rsi         # arg1 = &__temp_items
    callq   __ZN7MyClassC1ERKS_     # MyClass::MyClass(arg0, arg1)
    xorl    %eax, %eax              # eax = 0, the return value for main
    addq    $32, %rsp               # Pop stack frame
    popq    %rbp                    # Boiler plate
    retq

We are looking at only the calling code. We’re not interested in the implementation of createMyClass. That’s compiled somewhere else. So createMyClass creates the class inside a temporary and then that gets copied into myClass.
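For anyone who finds the assembly hard to follow, here is a hand-written C++ model of what the call site is doing. It is only a sketch of the idea, not what the compiler literally generates: createMyClassInto, the two callSite functions, the copies counter and the value 42 are all made up for illustration. The point is that the caller passes the address of some storage as a hidden first argument and the callee constructs the result there.

```cpp
#include <cassert>
#include <new>

struct MyClass {
    static int copies;  // hypothetical counter, not in the original class
    MyClass() : m_tmp(0) {}
    MyClass(const MyClass& source) : m_tmp(source.m_tmp) { ++copies; }
    long int m_tmp;
};
int MyClass::copies = 0;

// Hand-lowered form of "MyClass createMyClass()": the result is constructed
// in storage supplied by the caller.
MyClass* createMyClassInto(void* resultStorage) {
    MyClass* result = new (resultStorage) MyClass();
    result->m_tmp = 42;  // arbitrary value so the result can be recognised
    return result;
}

// Roughly the unelided call site: build a temporary, then copy-construct
// myClass from it.
long callSiteWithoutElision() {
    alignas(MyClass) unsigned char tempStorage[sizeof(MyClass)];
    MyClass* temp = createMyClassInto(tempStorage);  // arg0 = &temporary
    assert(static_cast<void*>(temp) == static_cast<void*>(tempStorage));
    const MyClass myClass(*temp);                    // copy constructor call
    temp->~MyClass();
    return myClass.m_tmp;
}

// For contrast, the elided form: the callee constructs directly into the
// storage of myClass, so no copy constructor runs.
long callSiteWithElision() {
    alignas(MyClass) unsigned char myClassStorage[sizeof(MyClass)];
    MyClass* myClass = createMyClassInto(myClassStorage);  // arg0 = &myClass
    long value = myClass->m_tmp;
    myClass->~MyClass();
    return value;
}
```

In this model the first function runs the copy constructor once and the second never does, which is the whole difference elision makes at the call site.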

Simples.

What about the const ref version ?

Test B

class MyClass
{
public:
    MyClass();
    MyClass(const MyClass & source);
    long int m_tmp;
};

MyClass createMyClass();

int main()
{
    const MyClass & myClass = createMyClass();
    return 0;
}

Same compiler options.

_main:                              # Boiler plate
    pushq   %rbp                    # Boiler plate
    movq    %rsp, %rbp              # Boiler plate
    subq    $32, %rsp               # Reserve 32 bytes for the stack frame
    leaq    -24(%rbp), %rdi         # arg0 = &___temp_items = rdi = rbp-24
    movl    $0, -4(%rbp)            # *(rbp-4) = 0, no idea what this is for
    callq   __Z13createMyClassv     # createMyClass(arg0)
    xorl    %eax, %eax              # eax = 0, the return value for main
    leaq    -24(%rbp), %rdi         # rdi = &___temp_items
    movq    %rdi, -16(%rbp)         # &myClass = rdi = &___temp_items;
    addq    $32, %rsp               # Pop stack frame
    popq    %rbp                    # Boiler plate
    retq

No copy constructor call, and therefore more optimal, right?

What happens if we drop “-fno-elide-constructors” for both versions, so that the compiler is allowed to elide copies again? Still keeping -O0.

Test A

_main:
    pushq   %rbp                    # Boiler plate
    movq    %rsp, %rbp              # Boiler plate
    subq    $16, %rsp               # Reserve 16 bytes for the stack frame
    leaq    -16(%rbp), %rdi         # arg0 = &myClass = rdi = rbp-16
    movl    $0, -4(%rbp)            # rbp-4 = 0, no idea what this is
    callq   __Z13createMyClassv     # createMyClass(arg0)
    xorl    %eax, %eax              # eax = 0, return value for main
    addq    $16, %rsp               # Pop stack frame
    popq    %rbp                    # Boiler plate
    retq

Clang has removed the copy constructor call.

Test B

_main:                              # Boiler plate
    pushq   %rbp                    # Boiler plate
    movq    %rsp, %rbp              # Boiler plate
    subq    $32, %rsp               # Reserve 32 bytes for the stack frame
    leaq    -24(%rbp), %rdi         # arg0 = &___temp_items = rdi = rbp-24
    movl    $0, -4(%rbp)            # rbp-4 = 0, no idea what this is
    callq   __Z13createMyClassv     # createMyClass(arg0)
    xorl    %eax, %eax              # eax = 0, return value for main
    leaq    -24(%rbp), %rdi         # rdi = &__temp_items
    movq    %rdi, -16(%rbp)         # &myClass = rdi
    addq    $32, %rsp               # Pop stack frame
    popq    %rbp                    # Boiler plate
    retq

Test B (assign to const reference) is the same as it was before. It now has more instructions than Test A.

What if we set optimisation to -O1 ?

_main:
    pushq   %rbp                    # Boiler plate
    movq    %rsp, %rbp              # Boiler plate
    subq    $16, %rsp               # Reserve 16 bytes for the stack frame
    leaq    -8(%rbp), %rdi          # arg0 = &___temp_items = rdi = rbp-8
    callq   __Z13createMyClassv     # createMyClass(arg0)
    xorl    %eax, %eax              # eax = 0, return value for main
    addq    $16, %rsp               # Pop stack frame
    popq    %rbp                    # Boiler plate
    retq

Both source files turn into this when compiled with -O1. They result in exactly the same assembler. This is also true for -O4.

The compiler doesn’t know about the contents of createMyClass so it can’t do anything more to optimise.

With the compiler I'm using, you get no performance gain from assigning to a const ref.

I imagine it's a similar situation for g++ and the Intel compiler, although it's always good to check.

Luke answered Oct 21 '22 04:10