
What does `<cuchar>` provide, and where is it documented?

Tags:

c++

c++11

unicode

The new C++11 standard mentions a header <cuchar>, presumably in analogy to C99's <uchar.h>.

Now, we know that C++11 brings new character types and literals that are specifically designed for UTF-16 and UTF-32, but I didn't think the language would actually contain functions to convert the (system-dependent) narrow multibyte encoding to one of the Unicode encodings. However, I just came across the header synopsis for <cuchar>, which mentions the functions mbrtoc16/c16rtomb and mbrtoc32/c32rtomb that seem to do just that.

Unfortunately, the standard says nothing about those functions beyond the header synopsis. Where are those functions defined, what do they really do and where can I read more about them? Does this mean that one can use proper Unicode entirely with standard C++ now, without the need for any extra libraries?

Kerrek SB asked Sep 26 '11 23:09


2 Answers

These were described in a WG21 paper from 2005 but the description is not present in the final standard. They are documented in ISO/IEC 19769:2004 (Extensions for the programming language C to support new character data types) (draft), which the C++11 standard refers to.

The text is too long to post here, but these are the signatures:

size_t mbrtoc16(char16_t * pc16, const char * s, size_t n, mbstate_t * ps);
size_t c16rtomb(char * s, char16_t c16, mbstate_t * ps);
size_t mbrtoc32(char32_t * pc32, const char * s, size_t n, mbstate_t * ps);
size_t c32rtomb(char * s, char32_t c32, mbstate_t * ps);

The functions convert between multibyte characters and UTF-16 or UTF-32 characters, respectively, similar to mbrtowc. There are no non-reentrant versions, and honestly, who needs them?
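As a rough illustration of how these functions are meant to be called, here is a minimal sketch that decodes a narrow multibyte string into UTF-32 with mbrtoc32. The helper name narrow_to_utf32 is hypothetical (it is not part of any standard), the input is interpreted in whatever encoding the current C locale specifies, and the sketch assumes an implementation that actually ships <cuchar>, which not every standard library does.

```cpp
#include <cstddef>
#include <cuchar>   // std::mbrtoc32
#include <cwchar>   // std::mbstate_t
#include <string>

// Hypothetical helper, for illustration only: decode a narrow multibyte
// string (in the current C locale's encoding) into UTF-32 code points.
// Returns false if the input contains an invalid or truncated sequence.
bool narrow_to_utf32(const std::string& in, std::u32string& out)
{
    std::mbstate_t state{};              // zero-initialised: initial conversion state
    const char* p = in.data();
    std::size_t remaining = in.size();
    out.clear();

    while (remaining > 0) {
        char32_t c32 = 0;
        std::size_t rc = std::mbrtoc32(&c32, p, remaining, &state);

        if (rc == static_cast<std::size_t>(-1) ||   // encoding error
            rc == static_cast<std::size_t>(-2)) {   // incomplete sequence
            return false;
        }
        if (rc == static_cast<std::size_t>(-3)) {
            // A character left over from earlier input; no bytes consumed.
            out.push_back(c32);
            continue;
        }
        if (rc == 0) {
            // A null character was decoded; assumed here to occupy one byte.
            rc = 1;
        }
        out.push_back(c32);
        p += rc;
        remaining -= rc;
    }
    return true;
}
```

Calling narrow_to_utf32("abc", out) in the default "C" locale should yield the three code points for 'a', 'b' and 'c'. For non-ASCII input the result depends on the locale selected via std::setlocale, which is the system-dependent part the question refers to.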

R. Martinho Fernandes answered Nov 09 '22 09:11


Probably the best documentation I'm aware of is N1326, the proposal to add TR 19769 to the C standard library [Edit: though looking at it, the N1010 that R. Martinho Fernandes cited seems to contain pretty much the same text].

Jerry Coffin answered Nov 09 '22 11:11