I believe a common response to this is "no," as the end() iterator for containers represents a "past-the-end" address, which it is undefined behavior to dereference. I can't find an explicit statement in the standard that exempts strings from this constraint, even though strings have a special case that other containers lack.
The C++11 standard declares that you can read one index past the end of a string: string[size()] refers to a read-only null terminator.
24.3.2.5 basic_string element access [string.access]
const_reference operator[](size_type pos) const;
reference operator[](size_type pos);
(1) Requires: pos <= size().
(2) Returns: *(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object to any value other than charT() leads to undefined behavior.
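As a minimal sketch of what the excerpt permits, assuming a C++11 or later compiler: the code below reads the element at index size() through operator[] and never writes to it. The names terminator_via_subscript and check_terminator_via_subscript are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reads the element at index size() via the const
// operator[]. Per the excerpt, pos == size() is allowed and yields charT().
char terminator_via_subscript(const std::string& s) {
    return s[s.size()];
}

// Hypothetical self-check covering a non-empty and an empty string.
void check_terminator_via_subscript() {
    assert(terminator_via_subscript("hello") == '\0');
    assert(terminator_via_subscript("") == '\0');
}
```

Note that this goes through operator[] only; it says nothing about whether *end() is allowed.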
front() is defined to be equivalent to return operator[](0), which is equivalent to return operator[](size()) for an empty string.
end() - begin() is well-defined and equals the length of the string, so end() must point to index size() for a sane implementation to define that arithmetic.
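The iterator arithmetic mentioned above can be sketched as follows, assuming std::string and a C++11 or later compiler. The helper names iterator_distance and check_iterator_distance are hypothetical. The subtraction only compares iterator positions and never dereferences end().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: distance from begin() to end(), computed without
// dereferencing either iterator.
std::size_t iterator_distance(const std::string& s) {
    return static_cast<std::size_t>(s.end() - s.begin());
}

// Hypothetical self-check: the distance equals size() for both cases.
void check_iterator_distance() {
    assert(iterator_distance("hello") == 5);
    assert(iterator_distance("") == 0);
}
```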
In the above standard excerpt, it states that operator[](pos) is equivalent to *(begin() + pos) if pos < size(). It does not say that you can dereference begin() + size(), but do you think it is reasonable to assume that this should be well defined? Or better yet, do you know of some proof that exempts string iterators from the constraint?
Additionally, can it be proven that *(begin() + i) for any i is equivalent to operator[](i)?
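For the range the excerpt does cover, the equivalence can be checked with a short sketch. It assumes a C++11 or later compiler, and the names subscript_matches_iterator and check_subscript_matches_iterator are hypothetical. The loop deliberately stops before size(), so begin() + size() is never dereferenced; the open question about that position is not answered by this code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: compares s[i] with *(s.begin() + i) for every
// i in [0, size()). The index size() itself is intentionally excluded.
bool subscript_matches_iterator(const std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        const auto offset = static_cast<std::string::difference_type>(i);
        if (s[i] != *(s.begin() + offset)) {
            return false;
        }
    }
    return true;
}

// Hypothetical self-check on a non-empty and an empty string.
void check_subscript_matches_iterator() {
    assert(subscript_matches_iterator("hello"));
    assert(subscript_matches_iterator(""));
}
```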
In C++, you cannot safely dereference the iterator returned by end(), because it refers to the position one past the last element, which isn't a valid member of the data structure.
std::string::end: Returns an iterator pointing to the past-the-end character of the string. The past-the-end character is a theoretical character that would follow the last character in the string.
A string literal such as "hello" is stored at some memory location in the binary when the source is compiled, and it decays to a const char * (pointer to its first character). Therefore, when you dereference that pointer, you get the first char of the string.
From the definition of string.end():
Returns: An iterator which is the past-the-end value.
and from the definition for past-the-end:
... Such a value is called a past-the-end value. Values of an iterator i for which the expression *i is defined are called dereferenceable. The library never assumes that past-the-end values are dereferenceable. ...
The emphasis is mine, and I would guess that any exception made for std::string would be mentioned in the first link. Since it's not, dereferencing std::string::end() is undefined by omission.
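If the goal is simply to read the terminator, one way to do so without touching end() is to go through c_str(). This is a minimal sketch assuming C++11 or later, where c_str() returns a pointer p such that p + i == &operator[](i) for each i in [0, size()]. The names terminator_via_c_str and check_terminator_via_c_str are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reads the terminator through the pointer returned
// by c_str(), which is valid for offsets 0 through size() inclusive.
char terminator_via_c_str(const std::string& s) {
    const char* p = s.c_str();
    return *(p + s.size());
}

// Hypothetical self-check on a non-empty and an empty string.
void check_terminator_via_c_str() {
    assert(terminator_via_c_str("hello") == '\0');
    assert(terminator_via_c_str("") == '\0');
}
```

This relies on a guarantee made for the returned pointer, not for iterators, so it sidesteps the past-the-end rule quoted above.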