Can `std::basic_string::operator[]` return a "distant" protected page nul terminator?

So, operator[] does not directly say that s[s.size()] must be the character after s[s.size()-1] in memory. It seems worded to avoid making that claim.

But s.data() states that s.data()+k == &s[k], and s.data() must return a pointer.

Ignoring the seeming standard defect of using & on CharT above and not std::addressof, is the implementation free to return a different CharT (say, one on a protected page, or in ROM) for s[s.size()] prior to the first call to s.data()? (Clearly it could arrange the buffer to end on a read-only page with a zero on it; I'm talking about a different situation)

To be explicit:

As far as I can tell, if s.data() is never called (and the compiler can prove it), then s[s.size()] need not be contiguous with the rest of the buffer.

Can std::addressof(s[s.size()]) change after a call to s.data() with the implementation still being standards-compliant (so long as s.data()+k == &s[k] has .data() evaluated before [], which the compiler is free to enforce)? Or are there immutability requirements I cannot see?
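The scenario in the question can be probed on a given implementation with a small check. This is only a sketch: `terminator_address_stable` is a hypothetical helper name, and a passing result shows what one library does, not what the standard requires.

```cpp
#include <memory>
#include <string>

// Hypothetical probe (not a standard facility): compares the address of
// s[s.size()] taken before and after a call to s.data().
inline bool terminator_address_stable(const std::string& s) {
    // Address of the terminator before this function calls data().
    const char* before = std::addressof(s[s.size()]);
    // Where data() says the terminator must live.
    const char* at_end = s.data() + s.size();
    // Address of the terminator after data() has been called.
    const char* after = std::addressof(s[s.size()]);
    return before == after && after == at_end;
}
```

Calling `terminator_address_stable(std::string("abc"))` is expected to return true on mainstream standard libraries, but that observation alone does not answer whether a conforming implementation could behave differently.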

Yakk - Adam Nevraumont asked Dec 08 '15 18:12


1 Answer

Since C++11, std::string is required to be stored in contiguous memory. This is the quote from the C++11 standard (section 21.4.1):

The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().

This quote about the return value of operator[] states that, for pos < size(), it returns *(begin() + pos), so &s[pos] is the same address as &*(s.begin() + pos) (section 21.4.5.1):

*(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior

Then we have this quote on the return value of data() (section 21.4.7.1):

A pointer p such that p + i == &operator[](i) for each i in [0,size()].

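So data() returns the same pointer you would get from &operator[](i) for every i in [0, size()], and that range includes the terminator at i == size(). The elements below size() must be stored contiguously, and data() + size() must equal &s[size()], so the terminator sits directly after the last character. You can conclude that both return pointers into the same contiguous memory, so operator[] will not return a reference to an object on a distant page.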
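The identity quoted above can be checked directly. The following is a minimal sketch, assuming a C++11 or later library; `data_matches_subscript` is a hypothetical helper name, and std::addressof is used instead of & to sidestep any overloaded operator& on CharT.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper (not part of the standard library): verifies that
// data() + i == addressof(s[i]) for every i in [0, size()], and that the
// element at index size() has the value CharT().
template <class CharT>
bool data_matches_subscript(const std::basic_string<CharT>& s) {
    const CharT* p = s.data();
    for (std::size_t i = 0; i <= s.size(); ++i) {
        if (p + i != std::addressof(s[i])) {
            return false;  // subscript refers to an object outside the buffer
        }
    }
    return s[s.size()] == CharT();
}
```

Note that this helper calls data() before it ever uses operator[], which is exactly the ordering the question sets aside, so it illustrates the answer's argument rather than the question's corner case.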

Note that this applies only to C++11 and later. The standard made no such guarantees before C++11.

Shadowwolf answered Oct 06 '22 01:10