No small string optimization with gcc?

Most std::string implementations (GCC's included) use the small string optimization; there's an existing answer discussing this.

Today, I decided to check at what point a string in code I compile gets moved to the heap. To my surprise, my test code seems to show that no small string optimization occurs at all!

Code:

#include <iostream>
#include <string>

using std::cout;
using std::endl;

int main(int argc, char* argv[]) {
  std::string s;

  cout << "capacity: " << s.capacity() << endl;

  cout << (void*)s.c_str() << " | " << s << endl;
  for (int i=0; i<33; ++i) {
    s += 'a';
    cout << (void*)s.c_str() << " | " << s << endl;
  }

}

The output of g++ test.cc && ./a.out is

capacity: 0
0x7fe405f6afb8 | 
0x7b0c38 | a
0x7b0c68 | aa
0x7b0c38 | aaa
0x7b0c38 | aaaa
0x7b0c68 | aaaaa
0x7b0c68 | aaaaaa
0x7b0c68 | aaaaaaa
0x7b0c68 | aaaaaaaa
0x7b0c98 | aaaaaaaaa
0x7b0c98 | aaaaaaaaaa
0x7b0c98 | aaaaaaaaaaa
0x7b0c98 | aaaaaaaaaaaa
0x7b0c98 | aaaaaaaaaaaaa
0x7b0c98 | aaaaaaaaaaaaaa
0x7b0c98 | aaaaaaaaaaaaaaa
0x7b0c98 | aaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0cd8 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0x7b0d28 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

I'm guessing that the large first pointer, 0x7fe405f6afb8, is a stack pointer and that the others point to the heap. Running the program many times gives the same pattern: the first address is always large and the rest are smaller, although the exact values usually differ. The smaller addresses always follow the standard power-of-2 allocation scheme, e.g. 0x7b0c38 is listed once, then 0x7b0c68 once, then 0x7b0c38 twice, then 0x7b0c68 4 times, then 0x7b0c98 8 times, etc.

After reading Howard's answer, and being on a 64-bit machine, I was expecting to see the same address printed for the first 22 characters, and only then to see it change.

Am I missing something?

Also, interestingly, if I compile with -O (at any level), I get a constant small pointer value 0x6021f8 in the first case, instead of the large value, and this 0x6021f8 doesn't change regardless of how many times I run the program.

Output of g++ -v:

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/foo/bar/gcc-6.2.0/gcc/libexec/gcc/x86_64-redhat-linux/6.2.0/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../gcc-6.2.0/configure --prefix=/foo/bar/gcc-6.2.0/gcc --build=x86_64-redhat-linux --disable-multilib --enable-languages=c,c++,fortran --with-default-libstdcxx-abi=gcc4-compatible --enable-bootstrap --enable-threads=posix --with-long-double-128 --enable-long-long --enable-lto --enable-__cxa_atexit --enable-gnu-unique-object --with-system-zlib --enable-gold
Thread model: posix
gcc version 6.2.0 (GCC)
asked Sep 06 '17 06:09 by SU3



1 Answer

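One of the configure flags your GCC was built with is:

--with-default-libstdcxx-abi=gcc4-compatible

This makes the old GCC 4-compatible std::string the default, and GCC 4's std::string does not support the small string optimization.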


GCC 5 started supporting it. isocpp states:

A new implementation of std::string is enabled by default, using the small string optimization instead of copy-on-write reference counting.

which supports my claim.

Moreover, Exploring std::string mentions:

As we see, older libstdc++ implements copy-on-write, and so it makes sense for them to not utilize small objects optimization.

and the article then goes on to describe what changes once GCC 5 comes into play.

answered Sep 20 '22 10:09 by gsamaras