 

(Missing) performance improvements with C++11 move semantics


I've been writing C++11 code for quite some time now without benchmarking any of it, simply expecting things like vector operations to "just be faster" with move semantics. When I actually benchmarked with GCC 4.7.2 and clang 3.0 (the default compilers on Ubuntu 12.10 64-bit), I got very unsatisfying results. This is my test code:

EDIT: In response to the (good) answers posted by @DeadMG and @ronag, I changed the element type from std::string to my::string, which does not have a swap(), and made all inner strings larger (200-700 bytes) so that they shouldn't be affected by SSO.

EDIT2: COW was the reason. Following the great comments, I adapted the code again: I changed the storage from std::string to std::vector<char> and left out the copy/move constructors (letting the compiler generate them instead). Without COW, the speed difference is actually huge.

EDIT3: Re-added the previous solution when compiled with -DCOW. This makes the internal storage a std::string rather than a std::vector<char> as requested by @chico.

#include <string>
#include <vector>
#include <fstream>
#include <iostream>
#include <algorithm>
#include <functional>

static std::size_t dec = 0;

namespace my {
class string {
public:
    string( ) { }
#ifdef COW
    string( const std::string& ref ) : str( ref ), val( dec % 2 ? - ++dec : ++dec ) {
#else
    string( const std::string& ref ) : val( dec % 2 ? - ++dec : ++dec ) {
        str.resize( ref.size( ) );
        std::copy( ref.begin( ), ref.end( ), str.begin( ) );
#endif
    }

    bool operator<( const string& other ) const { return val < other.val; }

private:
#ifdef COW
    std::string str;
#else
    std::vector< char > str;
#endif
    std::size_t val;
};
}

template< typename T >
void dup_vector( T& vec ) {
    T v = vec;
    for ( typename T::iterator i = v.begin( ); i != v.end( ); ++i )
#ifdef CPP11
        vec.push_back( std::move( *i ) );
#else
        vec.push_back( *i );
#endif
}

int main( ) {
    std::ifstream file;
    file.open( "/etc/passwd" );
    std::vector< my::string > lines;
    while ( ! file.eof( ) )
    {
        std::string s;
        std::getline( file, s );
        lines.push_back( s + s + s + s + s + s + s + s + s );
    }

    while ( lines.size( ) < ( 1000 * 1000 ) )
        dup_vector( lines );
    std::cout << lines.size( ) << " elements" << std::endl;

    std::sort( lines.begin( ), lines.end( ) );

    return 0;
}

What this does is read /etc/passwd into a vector of lines, then duplicate this vector onto itself over and over until we have at least 1 million entries. This is where the first optimization should be useful: not only the explicit std::move() you see in dup_vector(), but also push_back itself should perform better when it needs to resize (allocate new + copy) the inner array.
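To make that expectation concrete, here is a minimal sketch (not part of the benchmark above) that counts copy and move operations while a vector grows. The type `Counted` and the helpers `grow_with_moves` and `grow_with_copies` are hypothetical names made up for this illustration; it assumes the code is compiled as C++11 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical element type that counts how often it is copied or moved.
struct Counted {
    static std::size_t copies;
    static std::size_t moves;

    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
    Counted& operator=(const Counted&) { ++copies; return *this; }
    Counted& operator=(Counted&&) noexcept { ++moves; return *this; }
};
std::size_t Counted::copies = 0;
std::size_t Counted::moves = 0;

struct Counts {
    std::size_t copies;
    std::size_t moves;
};

// Push n elements with an explicit std::move, as dup_vector() does with -DCPP11.
Counts grow_with_moves(std::size_t n) {
    Counted::copies = 0;
    Counted::moves = 0;
    std::vector<Counted> v;
    for (std::size_t i = 0; i < n; ++i) {
        Counted c;
        v.push_back(std::move(c));  // move into the vector
    }
    assert(v.size() == n);
    return Counts{Counted::copies, Counted::moves};
}

// Push n elements by copy; reallocation can still move the existing elements.
Counts grow_with_copies(std::size_t n) {
    Counted::copies = 0;
    Counted::moves = 0;
    std::vector<Counted> v;
    for (std::size_t i = 0; i < n; ++i) {
        Counted c;
        v.push_back(c);  // one copy per inserted element
    }
    assert(v.size() == n);
    return Counts{Counted::copies, Counted::moves};
}
```

Calling `grow_with_moves(1000)` from a `main` should report zero copies, and `grow_with_copies(1000)` should report one copy per inserted element with the reallocation work done by moves. Note that the `noexcept` on the move constructor matters here: without it, a vector whose element type is also copyable falls back to copying during reallocation.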

Finally, the vector is sorted. This should definitely be faster when you don't need to copy temporary objects each time two elements are swapped.
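The same counting idea can be applied to the sort step. The following is a rough sketch, assuming C++11 and a hypothetical type `Tracked` with a made-up helper `sort_and_count`; it checks whether std::sort needs any copies at all when the element type is movable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sortable type that counts copies and moves.
struct Tracked {
    static std::size_t copies;
    static std::size_t moves;
    std::size_t key;

    explicit Tracked(std::size_t k) : key(k) {}
    Tracked(const Tracked& o) : key(o.key) { ++copies; }
    Tracked(Tracked&& o) noexcept : key(o.key) { ++moves; }
    Tracked& operator=(const Tracked& o) { key = o.key; ++copies; return *this; }
    Tracked& operator=(Tracked&& o) noexcept { key = o.key; ++moves; return *this; }
    bool operator<(const Tracked& o) const { return key < o.key; }
};
std::size_t Tracked::copies = 0;
std::size_t Tracked::moves = 0;

struct SortCounts {
    std::size_t copies;
    std::size_t moves;
};

// Fill a vector with scrambled keys, sort it, and report the operations
// that happened during the sort only.
SortCounts sort_and_count(std::size_t n) {
    std::vector<Tracked> v;
    v.reserve(n);  // avoid reallocation so only the sort is measured
    for (std::size_t i = 0; i < n; ++i)
        v.emplace_back((i * 7919) % n);  // scrambled but deterministic keys

    Tracked::copies = 0;
    Tracked::moves = 0;
    std::sort(v.begin(), v.end());
    assert(std::is_sorted(v.begin(), v.end()));
    return SortCounts{Tracked::copies, Tracked::moves};
}
```

With the common standard library implementations, `sort_and_count(1000)` should report no copies and a nonzero number of moves when built as C++11. Whether that translates into a speedup depends on how expensive a copy of the element actually is, which is exactly what the benchmark is trying to find out.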

I compile and run this two ways, one being as C++98, the next as C++11 (with -DCPP11 for the explicit move):

1> $ rm -f a.out ; g++ --std=c++98 test.cpp ; time ./a.out
2> $ rm -f a.out ; g++ --std=c++11 -DCPP11 test.cpp ; time ./a.out
3> $ rm -f a.out ; clang++ --std=c++98 test.cpp ; time ./a.out
4> $ rm -f a.out ; clang++ --std=c++11 -DCPP11 test.cpp ; time ./a.out

With the following results (twice for each compilation):

GCC C++98
1> real 0m9.626s
1> real 0m9.709s

GCC C++11
2> real 0m10.163s
2> real 0m10.130s

So, it's slightly slower to run when compiled as C++11 code. Similar results go for clang:

clang C++98
3> real 0m8.906s
3> real 0m8.750s

clang C++11
4> real 0m8.858s
4> real 0m9.053s

Can someone tell me why this is? Are the compilers optimizing so well even when compiling for pre-C++11 that they practically reach move-semantics behaviour anyway? If I add -O2, all code runs faster, but the relative results between the two standards are almost the same as above.

EDIT: New results with my::string rather than std::string, and larger individual strings:

$ rm -f a.out ; g++ --std=c++98 test.cpp ; time ./a.out
real    0m16.637s
$ rm -f a.out ; g++ --std=c++11 -DCPP11 test.cpp ; time ./a.out
real    0m17.169s
$ rm -f a.out ; clang++ --std=c++98 test.cpp ; time ./a.out
real    0m16.222s
$ rm -f a.out ; clang++ --std=c++11 -DCPP11 test.cpp ; time ./a.out
real    0m15.652s

There are only very small differences between C++98 and C++11 with move semantics: slightly slower with C++11 on GCC and slightly faster on clang.

EDIT2: Now without std::string's COW, the performance improvement is huge:

$ rm -f a.out ; g++ --std=c++98 test.cpp ; time ./a.out
real    0m10.313s
$ rm -f a.out ; g++ --std=c++11 -DCPP11 test.cpp ; time ./a.out
real    0m5.267s
$ rm -f a.out ; clang++ --std=c++98 test.cpp ; time ./a.out
real    0m10.218s
$ rm -f a.out ; clang++ --std=c++11 -DCPP11 test.cpp ; time ./a.out
real    0m3.376s

With optimization, the difference is a lot bigger too:

$ rm -f a.out ; g++ -O2 --std=c++98 test.cpp ; time ./a.out
real    0m5.243s
$ rm -f a.out ; g++ -O2 --std=c++11 -DCPP11 test.cpp ; time ./a.out
real    0m0.803s
$ rm -f a.out ; clang++ -O2 --std=c++98 test.cpp ; time ./a.out
real    0m5.248s
$ rm -f a.out ; clang++ -O2 --std=c++11 -DCPP11 test.cpp ; time ./a.out
real    0m0.785s

The above shows C++11 running roughly 6-7 times faster.

Thanks for the great comments and answers. I hope this post will be useful and interesting to others too.

asked Jan 12 '13 12:01 by gustaf r




2 Answers

"This should definitely be faster when you don't need to copy temporary objects each time two elements are swapped."

std::string has a swap member, so sort will already use that, and its internal implementation is already effectively a move. You also won't see a difference between copy and move for std::string as long as SSO (the small string optimization) is involved. In addition, some versions of GCC still ship a COW-based (copy-on-write) implementation, which C++11 does not permit; with COW a copy only shares the buffer, so there would not be much difference between copy and move there either.
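To illustrate why COW hides the benefit of moving, here is a toy sketch of a copy-on-write string. `CowString` and the two helper functions are hypothetical names invented for this example, and this is not how libstdc++ actually implements its string; it only shows that a copy of a COW object shares the buffer instead of duplicating the bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy copy-on-write string (illustration only, not a real library type).
class CowString {
public:
    explicit CowString(const std::string& s)
        : buf_(std::make_shared<std::vector<char>>(s.begin(), s.end())) {}

    // The compiler-generated copy constructor only bumps a reference count,
    // so copying costs about the same as moving.

    const char* data() const { return buf_->data(); }
    std::size_t size() const { return buf_->size(); }

    // A write detaches from the shared buffer first, paying for the copy
    // only at this point.
    void set(std::size_t i, char c) {
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::vector<char>>(*buf_);
        (*buf_)[i] = c;
    }

private:
    std::shared_ptr<std::vector<char>> buf_;
};

// A copy shares the character buffer with the original.
bool copy_shares_buffer() {
    CowString a(std::string(500, 'x'));
    CowString b = a;
    return a.data() == b.data();
}

// Writing to the copy gives it a private buffer and leaves the original intact.
bool write_detaches_buffer() {
    CowString a(std::string(500, 'x'));
    CowString b = a;
    b.set(0, 'y');
    return a.data() != b.data() && a.data()[0] == 'x' && b.data()[0] == 'y';
}
```

Since the benchmark never modifies the strings after copying them, a COW string would never reach the expensive detach step, which is consistent with copy and move timing about the same.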

answered Oct 09 '22 11:10 by Puppy


This is probably due to the small string optimization, which can kick in (depending on the compiler) for strings shorter than e.g. 16 characters. I would guess that all the lines in the file are quite short, since they are entries from /etc/passwd.

When the small string optimization is active for a particular string, a move is carried out as a copy.

You will need to have larger strings to see any speed improvements with move semantics.
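One rough way to check this on a given compiler is to see whether a moved-to string ends up owning the same character buffer the source had. The helper `move_reuses_buffer` below is a hypothetical name for this sketch, and the behaviour it observes is implementation-specific rather than guaranteed by the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Returns true if move-constructing a string of the given length hands the
// source's character buffer over to the destination.
bool move_reuses_buffer(std::size_t length) {
    std::string src(length, 'x');
    const char* before = src.data();  // where the characters live before the move
    std::string dst(std::move(src));
    assert(dst.size() == length);
    // With a heap-allocated buffer the pointer is typically just handed over.
    // With SSO the characters sit inside the string object itself, so they
    // have to be copied and the address differs.
    return dst.data() == before;
}
```

On the common implementations, `move_reuses_buffer(1000)` should return true, while a very short length such as 3 typically returns false when SSO is in use. The exact threshold differs between standard libraries.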

answered Oct 09 '22 13:10 by ronag