A continuation on C++ and UTF8 - Why not just replace ASCII?
Why is there no std::ustring which could replace both std::string and std::wstring in new applications?
With corresponding support in the standard library, of course, similar to how boost::filesystem3::path doesn't care about the string representation and works with both std::string and std::wstring.
Why would you replace anything?
string and wstring are the string classes corresponding to char and wchar_t, which in the context of interfacing with the environment are meant to carry data encoded in, respectively, "the system's narrow-multibyte representation" and a fixed-width representation in "the system's encoding".
On the other hand, the u8/u/U literal prefixes, char16_t and char32_t, and the corresponding string classes are intended for the storage of Unicode codepoint sequences encoded in UTF-8/16/32.
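As a rough illustration of that second domain, here is a minimal sketch using the u and U literal prefixes with std::u16string and std::u32string. The function names utf16_units and utf32_units are made up for this example, and the u8 prefix is left out because its literal type differs between C++11 and C++20.

```cpp
#include <cstddef>
#include <string>

// Both functions store the same two code points: U+00E9 and U+1F600.
// The helper names are hypothetical, chosen only for this illustration.

// UTF-16: U+00E9 takes one code unit, U+1F600 takes a surrogate pair.
std::size_t utf16_units() {
    std::u16string s = u"\u00E9\U0001F600";
    return s.size();  // counts char16_t code units, not code points
}

// UTF-32: every code point takes exactly one code unit.
std::size_t utf32_units() {
    std::u32string s = U"\u00E9\U0001F600";
    return s.size();  // counts char32_t code units
}
```

Note that size() counts code units in both cases, so the UTF-16 string reports 3 while the UTF-32 string reports 2 for the same text.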
The latter is a separate problem domain from the former. The standard doesn't contain a mechanism to bridge the two domains (and a library such as iconv() is typically required to make this bridge portable, e.g. by transcoding WCHAR_T/UTF-32).
Here's my standard list of related questions: #1, #2, #3
There are std::u16string and std::u32string. Standard library facilities where you might want to use these, e.g. to name a file to open with fstream, aren't going to be changed to accept them because they really can't be. For example, some platforms take an almost arbitrary byte string to name a file to open, with no specified encoding. Having to run that through a string with a specific encoding would break things and be incompatible.