 

Why does libc++'s implementation of std::string take up 3x as much memory as libstdc++'s?


Consider the following test program:

#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::cout << sizeof(std::string("hi")) << " ";
    std::string a[10];
    std::cout << sizeof(a) << " ";
    std::vector<std::string> v(10);
    std::cout << sizeof(v) + sizeof(std::string) * v.capacity() << "\n";
}

The output for libstdc++ and libc++, respectively, is:

8 80 104
24 240 264

As you can see, libc++ takes 3 times as much memory for a simple program. How do the implementations differ to cause this memory disparity? Do I need to be concerned, and how do I work around it?

user4390444 asked Dec 24 '14 03:12




1 Answer

Here is a short program to help you explore both kinds of memory usage of std::string: stack and heap.

#include <string>
#include <new>
#include <cstdio>
#include <cstdlib>

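std::size_t allocated = 0;

// Replacement global operator new that tallies every byte requested.
void* operator new(std::size_t sz)
{
    void* p = std::malloc(sz ? sz : 1);
    if (p == nullptr)
        throw std::bad_alloc();
    allocated += sz;
    return p;
}

void operator delete(void* p) noexcept
{
    std::free(p);
}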

int
main()
{
    allocated = 0;
    std::string s("hi");
    std::printf("stack space = %zu, heap space = %zu, capacity = %zu\n",
     sizeof(s), allocated, s.capacity());
}

Using http://melpon.org/wandbox/ it is easy to get output for different compiler/lib combinations, for example:

gcc 4.9.1:

stack space = 8, heap space = 27, capacity = 2

gcc 5.0.0:

stack space = 32, heap space = 0, capacity = 15

clang/libc++:

stack space = 24, heap space = 0, capacity = 22

VS-2015:

stack space = 32, heap space = 0, capacity = 15

(the last line is from http://webcompiler.cloudapp.net)

The output above also shows capacity, which is the number of chars the string can hold before it has to allocate a new, larger buffer from the heap. For the gcc-5.0, libc++, and VS-2015 implementations, this is the size of the short-string buffer: the buffer embedded in the string object itself (and so on the stack here) to hold short strings, thus avoiding the more expensive heap allocation. gcc 4.9.1 has no such buffer, which is why it allocates from the heap even for "hi".

It appears that the libc++ implementation has the smallest stack usage of the three short-string implementations, and yet has the largest short-string buffer. And if you count total memory usage (stack + heap), libc++ has the smallest total for this 2-character string among all 4 of these implementations.

Note that all of these measurements were taken on 64-bit platforms. On 32-bit, the libc++ stack usage goes down to 12, and the short-string buffer goes down to 10. I don't know the behavior of the other implementations on 32-bit platforms, but you can use the above code to find out.

Howard Hinnant answered Nov 14 '22 19:11