This code has undefined behavior:
#include <string_view>
#include <iostream>
using namespace std::string_view_literals;
void foo(std::string_view msg) {
    // undefined behavior if 'msg' is not null-terminated
    std::cout << msg.data() << '\n';
    // std::cout << msg << '\n'; is not undefined because operator<< uses
    // iterators to print 'msg', but that's not the point
}
int main() {
    foo("hello"sv); // not null-terminated - undefined behavior
    foo("foo"); // same, even more dangerous
}
The reason is that std::string_view can store strings that are not null-terminated, and data() does not append a null terminator. That's really limiting: to make the above code well defined, I have to construct a std::string out of it:
std::string str{ msg };
std::cout << str.data() << '\n';
This really makes std::string_view unnecessary in this case. I still have to copy the string passed to foo, so why not use move semantics and change msg to a std::string? This might be faster, but I didn't measure.
Either way, having to construct a std::string every time I want to pass a const char* to a function that only accepts a const char* seems unnecessary, but there has to be a reason why the Committee decided it this way.
So, why does std::string_view::data not return a null-terminated string like std::string::data?
Simply because it can't. A string_view can be a narrower view into a larger string (a substring of a string). That means the viewed string will not necessarily have a null terminator at the end of a particular view. You can't write a null terminator into the underlying string, because the view does not own it and the characters may be const, and you can't create a copy of the string and return a char* without a memory leak.
If you want a null-terminated string, you have to create a std::string copy of the view.
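To make the substring case concrete, here is a minimal sketch. The two helper names are made up for illustration and are not part of any library. The first reports what a C API would see when handed data() directly, and the second makes the copy described above.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical helper: the length a C function such as strlen sees when it is
// handed view.data(). This is only safe when the caller knows the underlying
// buffer is null-terminated somewhere at or after the end of the view.
std::size_t c_length_seen_through_data(std::string_view view) {
    assert(view.data() != nullptr); // strlen on a null pointer is undefined
    return std::strlen(view.data());
}

// Copying into a std::string puts a terminator exactly at the end of the
// viewed characters, at the cost of an allocation and a copy.
std::string to_c_compatible(std::string_view view) {
    return std::string(view);
}
```

For a view of the first five characters of the literal "hello world", the first helper reports the length of the whole literal, while c_str() on the copy stops after "hello".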
Let me show a good use of std::string_view:
auto tokenize(std::string_view str, Pred is_delim) -> std::vector<std::string_view>
Here the resulting vector contains tokens as views into the larger string.
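Only the signature is given above, so the body below is one possible sketch rather than the answerer's implementation. It assumes that Pred is callable with a char and that empty tokens should be skipped.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Sketch only: splits 'str' at every character for which is_delim returns
// true and skips empty tokens. The returned views point into the caller's
// buffer, so that buffer must outlive the result.
template <typename Pred>
auto tokenize(std::string_view str, Pred is_delim) -> std::vector<std::string_view>
{
    std::vector<std::string_view> tokens;
    std::size_t start = 0;
    for (std::size_t i = 0; i <= str.size(); ++i) {
        // Treat the end of the input as a delimiter so the last token is emitted.
        if (i == str.size() || is_delim(str[i])) {
            if (i > start) {
                tokens.push_back(str.substr(start, i - start));
            }
            start = i + 1;
        }
    }
    return tokens;
}
```

No characters are copied: each token is a pointer and a length into the original buffer. That is also why no token except possibly the last can end in a null terminator.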
The purpose of string_view is to be a range representing a contiguous sequence of characters. Limiting such a range to one that ends in a NUL terminator limits the usefulness of the class.
That being said, it would still be useful to have an alternate version of string_view which is intended only to be created from strings that truly are NUL-terminated.
My zstring_view class is privately inherited from string_view, and it provides support for removing elements from the front and other operations that cannot make the string non-NUL-terminated. It provides the rest of the operations, but they return a string_view, not a zstring_view.
You'd be surprised how few operations you have to lose from string_view to make this work:
#include <memory>
#include <string>
#include <string_view>
template<typename charT, typename traits = std::char_traits<charT>>
class basic_zstring_view : private std::basic_string_view<charT, traits>
{
public:
using base_view_type = std::basic_string_view<charT, traits>;
// Type names from a dependent base need 'typename' in a using-declaration.
using typename base_view_type::traits_type;
using typename base_view_type::value_type;
using typename base_view_type::pointer;
using typename base_view_type::const_pointer;
using typename base_view_type::reference;
using typename base_view_type::const_reference;
using typename base_view_type::const_iterator;
using typename base_view_type::iterator;
using typename base_view_type::const_reverse_iterator;
using typename base_view_type::reverse_iterator;
using typename base_view_type::size_type;
using typename base_view_type::difference_type;
using base_view_type::npos;
basic_zstring_view(const charT* str) : base_view_type(str) {}
constexpr explicit basic_zstring_view(const charT* str, size_type len) : base_view_type(str, len) {}
constexpr explicit basic_zstring_view(const base_view_type &view) : base_view_type(view) {}
constexpr basic_zstring_view(const basic_zstring_view&) noexcept = default;
basic_zstring_view& operator=(const basic_zstring_view&) noexcept = default;
using base_view_type::begin;
using base_view_type::end;
using base_view_type::cbegin;
using base_view_type::cend;
using base_view_type::rbegin;
using base_view_type::rend;
using base_view_type::crbegin;
using base_view_type::crend;
using base_view_type::size;
using base_view_type::length;
using base_view_type::max_size;
using base_view_type::empty;
using base_view_type::operator[];
using base_view_type::at;
using base_view_type::front;
using base_view_type::back;
using base_view_type::data;
using base_view_type::remove_prefix;
// using base_view_type::remove_suffix; intentionally not provided.
// Creates a basic_string_view that lacks the last n characters.
constexpr std::basic_string_view<charT, traits> view_suffix(size_type n) const
{
return std::basic_string_view<charT, traits>(data(), size() - n);
}
using base_view_type::swap;
template<class Allocator = std::allocator<charT> >
std::basic_string<charT, traits, Allocator> to_string(const Allocator& a = Allocator()) const
{
return std::basic_string<charT, traits, Allocator>(begin(), end(), a);
}
constexpr operator base_view_type() const {return base_view_type(data(), size());}
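// std::basic_string_view has no to_string member (only the experimental
// version did), so the to_string defined above is the only one.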
using base_view_type::copy;
using base_view_type::substr;
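// operator== and operator!= are non-member functions for std::basic_string_view,
// so a using-declaration cannot import them; they would have to be declared
// separately for basic_zstring_view.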
using base_view_type::compare;
};
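To show the invariant in isolation, here is a much smaller sketch of the same idea. It is not the class above: mini_zstring_view is a hypothetical name, it handles char only, and it uses composition instead of private inheritance to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical, stripped-down illustration of a "NUL-terminated view".
// Invariant: view_.data()[view_.size()] == '\0'.
class mini_zstring_view {
public:
    // A C string is NUL-terminated by contract.
    mini_zstring_view(const char* str) : view_(str) {}
    // std::string::c_str() is always NUL-terminated at size().
    mini_zstring_view(const std::string& str) : view_(str.c_str(), str.size()) {}

    // Dropping characters from the front leaves the terminator where it was.
    void remove_prefix(std::size_t n) {
        assert(n <= view_.size());
        view_.remove_prefix(n);
    }

    // Anything that may cut off the end returns a plain string_view instead.
    std::string_view substr(std::size_t pos,
                            std::size_t count = std::string_view::npos) const {
        return view_.substr(pos, count);
    }

    // Safe to hand to C APIs because of the class invariant.
    const char* c_str() const { return view_.data(); }
    std::size_t size() const { return view_.size(); }
    operator std::string_view() const { return view_; }

private:
    std::string_view view_;
};
```

After remove_prefix the object can still be passed to a C API through c_str(), whereas the result of substr is an ordinary string_view whose data() carries no such promise.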
When dealing with string literals, which are known to be null-terminated, I usually use something like this to make sure the null is included in the counted chars.
template <std::size_t L>
std::string_view string_viewz(const char (&t)[L])
{
    // L is the array size, so the view includes the trailing '\0'.
    return std::string_view(t, L);
}
The aim here is not to fix the compatibility issue; there are too many cases for that. But if you know what you are doing and want the string_view span to include the null (for serialization, say), then it is a nice trick.
auto view = string_viewz("Surrogate String");
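A short sketch of what this buys in the serialization case. The helper is repeated so the snippet compiles on its own, and append_record is a hypothetical function added only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Same helper as above, repeated so this sketch is self-contained.
template <std::size_t L>
std::string_view string_viewz(const char (&t)[L])
{
    return std::string_view(t, L); // L counts the trailing '\0'
}

// Hypothetical serializer: appends the view's bytes, terminator included,
// so that records in the buffer can later be split on '\0'.
void append_record(std::string& out, std::string_view record)
{
    out.append(record.data(), record.size());
}
```

Note the trade-off: for "Surrogate String" the view has size 17 rather than 16, its last character is '\0', and it does not compare equal to an ordinary string_view of the same literal.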