
How is std::string implemented?

I am curious how std::string is implemented and how it differs from a C string. If the standard does not specify an implementation, then an explanation of any real implementation, and of how it satisfies the string requirements given by the standard, would be great.

yesraaj asked Sep 23 '09 13:09




1 Answer

Virtually every compiler I've used provides source code for the runtime - so whether you're using GCC or MSVC or whatever, you have the capability to look at the implementation. However, a large part or all of std::string will be implemented as template code, which can make for very difficult reading.

Scott Meyers' book, Effective STL, has a chapter on std::string implementations that's a decent overview of the common variations: "Item 15: Be aware of variations in string implementations".

He talks about 4 variations, which fall into two groups:

  • several variations on a ref-counted implementation (commonly known as copy on write) - when a string object is copied unchanged, the refcount is incremented but the actual string data is not copied. Both objects point to the same refcounted data until one of them modifies it, causing a 'copy on write' of the data. The variations differ in where things like the refcount, locks, etc. are stored.

  • a "short string optimization" (SSO) implementation. In this variant, the object contains the usual pointer to data, length, size of the dynamically allocated buffer, etc. But if the string is short enough, it will use that space inside the object itself to hold the string instead of dynamically allocating a buffer.

Also, Herb Sutter's "More Exceptional C++" has an appendix (Appendix A: "Optimizations that aren't (in a Multithreaded World)") that discusses why copy on write refcounted implementations often have performance problems in multithreaded applications due to synchronization issues. An article version is also available online (but I'm not sure if it's exactly the same as what's in the book):

  • http://www.gotw.ca/publications/optimizations.htm

Both those chapters would be worthwhile reading.

Michael Burr answered Sep 23 '22 21:09
