What actually is done when `string::c_str()` is invoked?

1. Does string::c_str() allocate memory, copy the internal data of the string object, and append a null terminator to the newly allocated memory?

or

2. Since string::c_str() must be O(1), allocating memory and copying the string over is no longer allowed. In practice, having the null terminator there all the time is the only sane implementation.

Somebody in the comments on an answer to another question says that C++11 requires std::string to allocate an extra char for a trailing '\0'. So it seems the second option is possible.

And another person says that std::string operations - e.g. iteration, concatenation and element mutation - don't need the zero terminator. Unless you pass the string to a function expecting a zero terminated string, it can be omitted.

And here is another expert's take:

Why is it common for implementers to make .data() and .c_str() do the same thing?

Because it is more efficient to do so. The only way to make .data() return something that is not null terminated, would be to have .c_str() or .data() copy their internal buffer, or to just use 2 buffers. Having a single null terminated buffer always means that you can always use just one internal buffer when implementing std::string.

So I am really confused now. What actually is done when string::c_str() is invoked?

Update:

Suppose c_str() is implemented as simply returning the pointer to the buffer that the string has already allocated and manages.

A. Since the result of c_str() must be null-terminated, the internal buffer always needs to be null-terminated, even for an empty std::string. E.g. for std::string demo_str;, there should be a \0 in the internal memory of demo_str. Am I right?

B. What would happen when std::string::substr() is invoked? Does it automatically append a \0 to the substring?

asked Sep 25 '21 04:09

John

3 Answers

Since C++11, std::string::c_str() and std::string::data() are both required to return a pointer to the string's internal buffer. And since the result must be null-terminated (from C++11 on this holds for data() as well as c_str(); before C++11 only c_str() guaranteed it), that effectively requires the internal buffer to always be null-terminated, though the null terminator is not counted by size()/length(), or returned by std::string iterators, etc.
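To make that concrete, here is a minimal sketch of how those guarantees look from the caller's side, assuming a C++11-or-later standard library. The helper names (shares_buffer, is_terminated, check_buffer_guarantees) are made up for illustration and are not part of any library.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: true when c_str() and data() return the same address,
// i.e. neither call makes a copy.
bool shares_buffer(const std::string& s)
{
    return s.c_str() == s.data();
}

// Hypothetical helper: true when the char one past the last element is '\0'.
// Reading c_str()[size()] is valid since C++11.
bool is_terminated(const std::string& s)
{
    return s.c_str()[s.size()] == '\0';
}

// Runs the checks on a sample string; an assert fires if a guarantee is violated.
void check_buffer_guarantees()
{
    std::string s("hello");
    assert(shares_buffer(s));
    assert(is_terminated(s));
    assert(s.size() == 5);                      // the terminator is not counted
    assert(std::strlen(s.c_str()) == s.size()); // the C view agrees on the length
}
```

Calling check_buffer_guarantees() from main() should pass silently on any conforming C++11 implementation.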

Prior to C++11, the behavior of c_str() was technically implementation-specific, but most implementations I've ever seen worked this way, as it is the simplest and sanest way to implement it. C++11 just standardized the behavior that was already in wide use.

UPDATE

Since C++11, the buffer is always null-terminated, even for an empty string. However, that does not mean the buffer is required to be dynamically allocated when the string is empty. It could point to an SSO buffer, or even to a single static nul character. There is no guarantee that the pointer returned by c_str()/data() remains pointing at the same memory address as the content of the string changes.
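A small sketch of both points, again assuming C++11 or later. The helper names are hypothetical. Note that the code deliberately does not test whether the address changes after growth, since that depends on the implementation (SSO size, growth policy); it only shows the safe habit of asking for the pointer again.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: an empty string still yields a readable "" from c_str().
bool empty_is_terminated()
{
    std::string empty;
    const char* p = empty.c_str(); // never a null pointer
    return p != nullptr && p[0] == '\0';
}

// Hypothetical helper: re-query c_str() after a modification instead of
// caching the old pointer, because growth may move the characters elsewhere.
bool requery_after_growth()
{
    std::string s("abc");
    s.append(1000, 'x');           // may leave SSO storage and/or reallocate
    const char* fresh = s.c_str(); // valid for the current contents
    return fresh[0] == 'a' && fresh[s.size()] == '\0';
}

// Runs both checks; an assert fires if either expectation is violated.
void check_empty_and_growth()
{
    assert(empty_is_terminated());
    assert(requery_after_growth());
}
```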

std::string::substr() returns a new std::string with its own null-terminated buffer. The string being copied from is unaffected.
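As an illustration (a sketch with a hypothetical helper name, not taken from any particular implementation), the substring gets its own terminator while the source string keeps its original characters:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: true when substr() produced an independent,
// null-terminated copy and left the source string untouched.
bool substr_is_independent()
{
    std::string original("hello world");
    std::string sub = original.substr(0, 5);

    // The new string ends with its own '\0' and reads as "hello" through the C API.
    bool own_terminator = sub.c_str()[sub.size()] == '\0';
    bool c_view_ok = std::strcmp(sub.c_str(), "hello") == 0;

    // No '\0' was written into the source: position 5 is still the space.
    bool source_untouched = original == "hello world" && original.c_str()[5] == ' ';

    return own_terminator && c_view_ok && source_untouched;
}

// Runs the check; the assert fires if the expectation is violated.
void check_substr()
{
    assert(substr_is_independent());
}
```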

answered Oct 10 '22 12:10

Remy Lebeau

Here is an empirical "proof" that the complexity of .c_str() is O(1):

#include <cstdio>
#include <string>

int main()
{
    std::string x(5000000, 'b'); // <--- single allocation
    // std::string x(5, 'b');    // <--- compare to a much shorter string
    const char *volatile y = nullptr; // volatile sink so the result is actually stored
    for (unsigned int i = 0; i < 1000000; i++)
    {
        y = x.c_str(); // <--- would this copy the entire content?
    }
    std::printf("%c\n", y[0]); // use the result once
}

I compiled this with -O0 to avoid optimizing anything out and timed the two versions: I get identical performance. This is an empirical "proof" that c_str() (at least in my machine's implementation):
- returns the internal representation, which is already a null-terminated string
- doesn't copy the content every time it is called.
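Timing can be noisy, so here is a complementary check that does not depend on the clock. It is only a sketch with hypothetical helper names: if c_str() copied the content on each call, two calls would have no reason to return the same address, and that address would not be the address of the string's first element.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when repeated c_str() calls on an unmodified
// string return the same address, and that address is the first element's.
bool c_str_is_stable(const std::string& x)
{
    const char* first = x.c_str();
    const char* second = x.c_str();
    return first == second && first == &x[0];
}

// Checks a very long string and a short one; neither call should copy.
void check_no_copy()
{
    std::string big(5000000, 'b');
    std::string small(5, 'b');
    assert(c_str_is_stable(big));
    assert(c_str_is_stable(small));
}
```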

answered Oct 10 '22 12:10

OrenIshShalom

There are a lot of great answers and comments already. But to demonstrate that std::string is typically backed by a null-terminated buffer, I've provided a simple, naive implementation. It's not complete, doesn't do error checking, and is certainly not optimized. But it's complete enough to show how a string class is typically implemented with a null-terminated buffer as a member variable.

#include <cstddef>
#include <cstring>

class string
{
public:

    string()
    {
        assign("", 0);
    }

    string(const char* s)
    {
        assign(s, std::strlen(s));
    }

    string(const char* s, std::size_t len)
    {
        assign(s, len);
    }

    string(const string& s)
    {
        assign(s._ptr, s._len);
    }

    ~string()
    {
        delete [] _ptr;
    }

    string& operator=(const string& s)
    {
        // assign() copies from s._ptr before the old buffer is freed,
        // so self-assignment is safe
        const char* oldptr = _ptr;
        assign(s._ptr, s._len);
        delete [] oldptr;
        return *this;
    }

    const char* data() const
    {
        return _ptr;
    }

    const char* c_str() const
    {
        return _ptr;
    }

    std::size_t length() const
    {
        return _len;
    }

    // substr always returns a new string with its own null-terminated buffer
    // (no bounds checking: pos + count must not exceed length())
    string substr(std::size_t pos, std::size_t count) const
    {
        return string(_ptr + pos, count);
    }

private:
    char* _ptr;
    std::size_t _len;

    // allocates a fresh buffer; the caller is responsible for freeing any old one
    void assign(const char* ptr, std::size_t len)
    {
        char* buf = new char[len + 1]; // +1 for null termination
        std::memcpy(buf, ptr, len);
        buf[len] = '\0';               // always null terminate
        _ptr = buf;
        _len = len;
    }
};

answered Oct 10 '22 12:10

selbie



Tags: c++, string, stl
