Implementing Move Constructor by Calling Move Assignment Operator


The MSDN article, How to: Write a Move Constructor, has the following recommendation.

If you provide both a move constructor and a move assignment operator for your class, you can eliminate redundant code by writing the move constructor to call the move assignment operator. The following example shows a revised version of the move constructor that calls the move assignment operator:

```cpp
// Move constructor.
MemoryBlock(MemoryBlock&& other)
   : _data(NULL)
   , _length(0)
{
   *this = std::move(other);
}
```

Is this code inefficient by doubly initializing MemoryBlock's values, or will the compiler be able to optimize away the extra initializations? Should I always write my move constructors by calling the move assignment operator?


asked Jun 14 '13 22:06

Elliot Hatch

2 Answers

[...] will the compiler be able to optimize away the extra initializations?

In almost all cases: yes.

Should I always write my move constructors by calling the move assignment operator?

Yes, just implement it via the move assignment operator, except in the cases where you have measured that it leads to suboptimal performance.

Today's optimizers do an incredible job at optimizing code. Your example code is especially easy to optimize. First of all: the move constructor will be inlined in almost all cases. If you implement it via the move assignment operator, that one will be inlined as well.

And let's look at some assembly! This shows the exact code from the Microsoft website with both versions of the move constructor: manual and via move assignment. Here is the assembly output for GCC with -O (-O1 has the same output; clang's output leads to the same conclusion):

```
; ===== manual version =====           |   ; ===== via move assignment =====
MemoryBlock(MemoryBlock&&):            |   MemoryBlock(MemoryBlock&&):
    mov     QWORD PTR [rdi], 0         |       mov     QWORD PTR [rdi], 0
    mov     QWORD PTR [rdi+8], 0       |       mov     QWORD PTR [rdi+8], 0
                                       |       cmp     rdi, rsi
                                       |       je      .L1
    mov     rax, QWORD PTR [rsi+8]     |       mov     rax, QWORD PTR [rsi+8]
    mov     QWORD PTR [rdi+8], rax     |       mov     QWORD PTR [rdi+8], rax
    mov     rax, QWORD PTR [rsi]       |       mov     rax, QWORD PTR [rsi]
    mov     QWORD PTR [rdi], rax       |       mov     QWORD PTR [rdi], rax
    mov     QWORD PTR [rsi+8], 0       |       mov     QWORD PTR [rsi+8], 0
    mov     QWORD PTR [rsi], 0         |       mov     QWORD PTR [rsi], 0
                                       |   .L1:
    ret                                |       rep ret
```

Apart from the additional branch for the right version, the code is exactly the same. Meaning: duplicate assignments have been removed.

Why the additional branch? The move assignment operator as defined by the Microsoft page does more work than the move constructor: it is protected against self-assignment. The move constructor is not protected against that. But, as I already said, the constructor will be inlined in almost all cases. And in these cases, the optimizer can see that it's not a self-assignment, so this branch will be optimized out, too.
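To make the comparison concrete, here is a minimal sketch of a class in the shape of the MSDN sample, with the move constructor delegating to a move assignment operator that guards against self-assignment. The member names `_data` and `_length` come from the question; the `int` element type, the accessors, and the `example_usage` function are assumptions made for illustration, and the sketch is not guaranteed to compile to exactly the listing above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Sketch modeled on the MSDN MemoryBlock sample (element type and
// accessors are assumed, not taken from the original article).
class MemoryBlock {
public:
    explicit MemoryBlock(std::size_t length)
        : _data(new int[length]()), _length(length) {}

    ~MemoryBlock() { delete[] _data; }

    // Copying is disabled only to keep the sketch short.
    MemoryBlock(const MemoryBlock&) = delete;
    MemoryBlock& operator=(const MemoryBlock&) = delete;

    // Move constructor implemented via move assignment, as in the question.
    MemoryBlock(MemoryBlock&& other) noexcept : _data(nullptr), _length(0) {
        *this = std::move(other);
    }

    // Move assignment with the self-assignment check; this check is the
    // source of the extra compare-and-branch in a non-inlined constructor.
    MemoryBlock& operator=(MemoryBlock&& other) noexcept {
        if (this != &other) {
            delete[] _data;            // release the current buffer
            _data = other._data;       // take over the source's buffer
            _length = other._length;
            other._data = nullptr;     // leave the source empty but valid
            other._length = 0;
        }
        return *this;
    }

    std::size_t length() const { return _length; }
    const int* data() const { return _data; }

private:
    int* _data;
    std::size_t _length;
};

// Usage sketch: moving transfers the buffer and empties the source.
inline void example_usage() {
    MemoryBlock a(4);
    MemoryBlock b(std::move(a));
    assert(b.length() == 4);
    assert(a.length() == 0);
}
```

When the constructor is inlined at a call site like the one in `example_usage`, the compiler can see that `a` and `b` are distinct objects, which is the situation in which the self-assignment branch can be dropped.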

This gets repeated a lot, but it's important: don't do premature micro-optimization!

Don't get me wrong, I also hate software that wastes a lot of resources due to lazy or sloppy developers or management decisions. And saving energy is not just about batteries, but also an environmental topic, which I am very passionate about. But, doing micro-optimizations prematurely doesn't help in that regard! Sure, keep the algorithmic complexity and cache friendliness of your large data in the back of your head. But before you do any specific optimization, measure!

In this specific case, I would even guess that you will never have to hand optimize, because the compiler will always be able to generate optimal code around your move constructor. Doing the useless micro-optimization now will cost you development time later when you need to change code in two places or when you need to debug a strange error that only happens because you changed code in only one place. And that is wasted development time that could rather be spent doing useful optimizations.


answered Oct 11 '22 08:10

Lukas Kalbertodt

I wouldn't do it this way. The reason for the move members to exist in the first place is performance. Doing this for your move constructor is like shelling out megabucks for a super-car and then trying to save money by buying regular gas.

If you want to reduce the amount of code you write, just don't write the move members. Your class will copy just fine in a move context.
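A small sketch of that fallback, assuming a hypothetical `CopyOnly` type and a hypothetical global counter `copy_count` (neither is from the original question). Because the class declares its own copy members, no move constructor is implicitly generated, and an rvalue argument binds to the copy constructor instead.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical counter used only to observe which constructor runs.
static int copy_count = 0;

// Hypothetical type with user-declared copy members and no move members.
struct CopyOnly {
    std::vector<int> values;

    explicit CopyOnly(std::vector<int> v) : values(std::move(v)) {}

    CopyOnly(const CopyOnly& other) : values(other.values) { ++copy_count; }

    CopyOnly& operator=(const CopyOnly& other) {
        values = other.values;
        return *this;
    }
};

// Usage sketch: std::move produces an rvalue, but the only viable
// constructor is the copy constructor, so the object is copied.
inline void copy_fallback_usage() {
    copy_count = 0;
    CopyOnly a(std::vector<int>{1, 2, 3});
    CopyOnly b(std::move(a));
    assert(copy_count == 1);
    assert(a.values.size() == 3);   // source is unchanged because it was copied
    assert(b.values.size() == 3);
}
```

The code stays correct without any move members; what is lost is only the cheaper transfer of the buffer.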

If you want your code to be high performance, then tailor your move constructor and move assignment to be as fast as possible. Good move members will be blazingly fast, and you should be estimating their speed by counting loads, stores and branches. If you can write something with 4 load/stores instead of 8, do it! If you can write something with no branches instead of 1, do it!
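As a rough illustration of that counting exercise, here is a sketch of a hand-tailored move constructor that initializes each member directly from the source instead of zeroing it first and then assigning. The class name `FastBlock`, the `int` element type, and the accessors are assumptions for illustration. In source terms the constructor performs two loads and four stores with no branch, compared with the two additional zeroing stores in the delegating version; the instructions a given compiler actually emits may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class; the name FastBlock is chosen for this sketch only.
class FastBlock {
public:
    explicit FastBlock(std::size_t length)
        : _data(new int[length]()), _length(length) {}

    ~FastBlock() { delete[] _data; }

    FastBlock(const FastBlock&) = delete;
    FastBlock& operator=(const FastBlock&) = delete;

    // Tailored move constructor: each member is read from the source once,
    // written to *this once, and the source member is reset once.
    // There is no preliminary zeroing of *this and no self-assignment branch.
    FastBlock(FastBlock&& other) noexcept
        : _data(std::exchange(other._data, nullptr)),
          _length(std::exchange(other._length, 0)) {}

    // Move assignment written the conventional way, with a self-assignment
    // check; only the constructor is tailored in this sketch.
    FastBlock& operator=(FastBlock&& other) noexcept {
        if (this != &other) {
            delete[] _data;
            _data = std::exchange(other._data, nullptr);
            _length = std::exchange(other._length, 0);
        }
        return *this;
    }

    std::size_t length() const { return _length; }
    const int* data() const { return _data; }

private:
    int* _data;
    std::size_t _length;
};

// Usage sketch: the buffer pointer is transferred, not reallocated.
inline void fast_block_usage() {
    FastBlock a(8);
    const int* original = a.data();
    FastBlock b(std::move(a));
    assert(b.data() == original);
    assert(a.data() == nullptr);
}
```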

When you (or your client) put your class into a std::vector, a lot of moves can get generated on your type. Even if your move is lightning fast at 8 loads/stores, if you can make it twice as fast, or even 50% faster with only 4 or 6 loads/stores, imho that is time well spent.
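One way to see how many moves a `std::vector` generates is to count them. The sketch below assumes a hypothetical `Tracked` type and a hypothetical global `move_count`; the exact number of moves depends on the implementation's growth policy, so the example only reports the count rather than predicting it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical counter of move-constructor calls.
static std::size_t move_count = 0;

// Hypothetical element type that records each move construction.
struct Tracked {
    int value = 0;

    Tracked() = default;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked&) = default;
    Tracked& operator=(const Tracked&) = default;
    Tracked& operator=(Tracked&&) noexcept = default;

    // noexcept matters here: for a copyable type, std::vector moves elements
    // during reallocation only if the move constructor cannot throw.
    Tracked(Tracked&& other) noexcept : value(other.value) { ++move_count; }
};

// Returns how many move constructions happened while appending n elements
// to a vector that was not reserved up front.
inline std::size_t moves_while_growing(std::size_t n) {
    move_count = 0;
    std::vector<Tracked> v;                      // no reserve(), so the buffer regrows
    for (std::size_t i = 0; i < n; ++i) {
        v.emplace_back(static_cast<int>(i));     // constructed in place, not moved
    }
    return move_count;                           // moves caused by reallocation only
}
```

Calling `moves_while_growing(1000)` on a typical implementation returns a nonzero count, since every reallocation moves all existing elements into the new buffer.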

Personally I'm sick of seeing waiting cursors and am willing to donate an extra 5 minutes to writing my code and know that it is as fast as possible.

If you're still not convinced this is worth it, write it both ways and then examine the generated assembly at full optimization. Who knows, your compiler just might be smart enough to optimize away extra loads and stores for you. But by this time you've already invested more time than if you had just written an optimized move constructor in the first place.

answered Oct 11 '22 09:10

Howard Hinnant
