ustring - a drop-in replacement for std::string/std::wstring?

A follow-up to "C++ and UTF8 - Why not just replace ASCII?"

Why is there no std::ustring that could replace both std::string and std::wstring in new applications?

It would of course need corresponding support in the standard library, similar to how boost::filesystem3::path doesn't care about the string representation and works with both std::string and std::wstring.

ronag asked Dec 06 '11 13:12


2 Answers

Why would you replace anything?

std::string and std::wstring are the string classes corresponding to char and wchar_t. In the context of interfacing with the environment, they are meant to carry data encoded in, respectively, the system's narrow multibyte encoding and the system's fixed-width wide encoding.
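To make that first domain concrete, here is a minimal sketch of the locale-dependent conversion the standard library offers between the two system encodings. The helper name to_wide is made up for illustration. It assumes the input has no embedded null characters and that the current C locale can decode it.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a standard function): converts from the system's
// narrow multibyte encoding to its wide encoding using the current C locale.
// Assumes `narrow` contains no embedded '\0'.
std::wstring to_wide(const std::string& narrow) {
    std::mbstate_t state = std::mbstate_t();
    const char* src = narrow.c_str();

    // With a null destination, mbsrtowcs only counts the wide characters needed.
    const std::size_t needed = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1))
        throw std::runtime_error("input is not valid in the current locale's encoding");

    std::wstring wide(needed, L'\0');
    state = std::mbstate_t();  // restart from the initial shift state
    src = narrow.c_str();
    const std::size_t written = std::mbsrtowcs(&wide[0], &src, needed, &state);
    assert(written == needed);
    return wide;
}
```

Note that nothing here says which encodings are involved. That is decided by the platform and the locale, which is exactly why these types aren't "Unicode strings".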

On the other hand, the u8/u/U string literal prefixes, the char16_t and char32_t types, and the corresponding string classes (std::u16string and std::u32string) are intended for storing sequences of Unicode code points encoded in UTF-8/16/32.
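As a small illustration of the second domain, the sketch below stores the same two code points (U+20AC and U+1F600) with each literal prefix and counts the code units. The function names are invented for this example. It avoids assigning a u8 literal to std::string, since the element type of u8 literals changed from char to char8_t in C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Universal character names keep the example independent of the source
// file's own encoding. All names below are illustrative only.
std::u16string sample_utf16() { return u"\u20AC\U0001F600"; }
std::u32string sample_utf32() { return U"\u20AC\U0001F600"; }

// sizeof counts the terminating null code unit, so subtract one.
std::size_t sample_utf8_units() { return sizeof(u8"\u20AC\U0001F600") - 1; }

// The same two code points need a different number of code units per encoding.
void check_code_unit_counts() {
    assert(sample_utf8_units() == 7);      // 3 bytes + 4 bytes
    assert(sample_utf16().size() == 3);    // 1 unit + a surrogate pair
    assert(sample_utf32().size() == 2);    // 1 unit per code point
}
```

Here the encoding is fixed by the language rather than by the environment, which is the difference from std::string and std::wstring.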

The latter is a separate problem domain from the former. The standard doesn't contain a mechanism to bridge the two domains, so a library such as iconv is typically required to make the bridge portable, e.g. by transcoding between iconv's WCHAR_T encoding (the system's wide encoding) and UTF-32.
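A rough sketch of such a bridge with POSIX iconv follows. It is not part of the C++ standard. The helper name wide_to_utf8 is hypothetical, and UTF-8 is used as the target here instead of UTF-32 only to sidestep byte-order questions. It assumes an iconv that knows the "WCHAR_T" encoding name (glibc and GNU libiconv do) and whose iconv() takes a non-const char** input, which varies between platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <iconv.h>  // POSIX, not standard C++

// Hypothetical helper: transcodes from the system's wide encoding to UTF-8.
std::string wide_to_utf8(const std::wstring& wide) {
    iconv_t cd = iconv_open("UTF-8", "WCHAR_T");  // arguments are (to, from)
    if (cd == (iconv_t)-1)
        throw std::runtime_error("iconv_open failed");

    // iconv works on raw bytes and does not modify the input buffer.
    char* in = const_cast<char*>(reinterpret_cast<const char*>(wide.data()));
    std::size_t in_left = wide.size() * sizeof(wchar_t);

    // Assumption: at most 4 UTF-8 bytes per wchar_t; anything else is an error.
    std::string out(wide.size() * 4, '\0');
    char* out_ptr = &out[0];
    std::size_t out_left = out.size();

    const std::size_t rc = iconv(cd, &in, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == static_cast<std::size_t>(-1))
        throw std::runtime_error("iconv failed");

    assert(in_left == 0);  // success means the whole input was consumed
    out.resize(out.size() - out_left);
    return out;
}
```

Going the other way (UTF-8 or UTF-32 into WCHAR_T, or into the narrow encoding) is the same call with the two encoding names swapped and the buffers sized accordingly.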

Here's my standard list of related questions: #1, #2, #3

Kerrek SB answered Sep 27 '22 19:09


There are std::u16string and std::u32string. The standard library facilities where you might want to use them, e.g. naming a file to open with fstream, aren't going to be changed to accept them, because they really can't be. For example, some platforms accept an almost arbitrary byte string as a file name, with no specified encoding. Forcing that name through a string type with a specific encoding would break things and be incompatible.
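To illustrate the file-name point, here is a minimal sketch (the helper name round_trip_by_name is made up) that hands a std::string to fstream as an opaque sequence of bytes. Nothing in it decodes or validates the name. What the bytes mean is left entirely to the platform, so on a system with byte-string file names this can work even for names that are not valid in any Unicode encoding.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: creates a file whose name is the given bytes, reopens
// it by the same bytes, then deletes it. Returns true if the round trip worked.
bool round_trip_by_name(const std::string& name_bytes) {
    assert(!name_bytes.empty());
    {
        std::ofstream out(name_bytes.c_str());  // bytes are passed through as-is
        if (!out) return false;
        out << 'x';
    }  // closing the stream flushes the file

    std::ifstream in(name_bytes.c_str());
    char c = 0;
    const bool ok = in.get(c) && c == 'x';
    in.close();
    std::remove(name_bytes.c_str());
    return ok;
}
```

If fstream insisted on, say, std::u16string, every such name would first have to be decoded from an encoding the platform never specified.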

bames53 answered Sep 27 '22 18:09