I am curious to know how std::string is implemented and how does it differ from c string?If the standard does not specify any implementation then any implementation with explanation would be great with how it satisfies the string requirement given by standard?
a "short string optimization" (SSO) implementation. In this variant, the object contains the usual pointer to data, length, size of the dynamically allocated buffer, etc. But if the string is short enough, it will use that area to hold the string instead of dynamically allocating a buffer.
std::vector is part of C++ (and has been for a long time), so this certainly is "implement[ing] a string class in C++ ..." (and does not use std::string ). class String final { std::vector<char> buffer; public: String() = default; String(const char* chars){ auto begin = chars; auto end = begin + strlen(chars); buffer.
std::string class in C++ C++ has in its definition a way to represent a sequence of characters as an object of the class. This class is called std:: string. String class stores the characters as a sequence of bytes with the functionality of allowing access to the single-byte character.
So we can say std::string allocates short strings on the stack, but long ones -- on the heap.
Virtually every compiler I've used provides source code for the runtime - so whether you're using GCC or MSVC or whatever, you have the capability to look at the implementation. However, a large part or all of std::string
will be implemented as template code, which can make for very difficult reading.
Scott Meyer's book, Effective STL, has a chapter on std::string implementations that's a decent overview of the common variations: "Item 15: Be aware of variations in string
implementations".
He talks about 4 variations:
several variations on a ref-counted implementation (commonly known as copy on write) - when a string object is copied unchanged, the refcount is incremented but the actual string data is not. Both object point to the same refcounted data until one of the objects modifies it, causing a 'copy on write' of the data. The variations are in where things like the refcount, locks etc are stored.
a "short string optimization" (SSO) implementation. In this variant, the object contains the usual pointer to data, length, size of the dynamically allocated buffer, etc. But if the string is short enough, it will use that area to hold the string instead of dynamically allocating a buffer
Also, Herb Sutter's "More Exceptional C++" has an appendix (Appendix A: "Optimizations that aren't (in a Multithreaded World)") that discusses why copy on write refcounted implementations often have performance problems in multithreaded applications due to synchronization issues. That article is also available online (but I'm not sure if it's exactly the same as what's in the book):
Both those chapters would be worthwhile reading.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With