If you feed a wchar_t, char16_t, or char32_t value to a narrow ostream, it will print the numeric value of the code point.
#include <iostream>
using std::cout;
int main()
{
    cout << 'x' << L'x' << u'x' << U'x' << '\n';
}
prints x120120120. This is because there is an operator<< for the specific combination of basic_ostream with its charT, but there aren't analogous operators for the other character types, so they get silently converted to int and printed that way. Similarly, non-narrow string literals (L"x", u"x", U"X") will be silently converted to void* and printed as the pointer value, and non-narrow string objects (wstring, u16string, u32string) won't even compile.
So, the question: What is the least awful way to print a wchar_t, char16_t, or char32_t value on a narrow ostream, as the character, rather than as the numeric value of the codepoint? It should correctly convert all codepoints that are representable in the encoding of the ostream, to that encoding, and should report an error when the codepoint is not representable. (For instance, given u'…' and a UTF-8 ostream, the three-byte sequence 0xE2 0x80 0xA6 should be written to the stream; but given u'â' and a KOI8-R ostream, an error should be reported.)
Similarly, how can one print a non-narrow C-string or string object on a narrow ostream, converting to the output encoding?
If this can't be done within ISO C++11, I'll take platform-specific answers.
(Inspired by this question.)
As you noted, there is no operator<<(std::ostream&, wchar_t) for a narrow ostream. If you want to keep the << syntax, however, you can teach ostream how to deal with wchar_t values, so that your routine is picked as a better overload than the one requiring a conversion to an integer first.
If you're feeling adventurous:
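#include <ostream>
#include <stdexcept>
namespace std { // caution: adding overloads to namespace std is formally undefined behaviour
    ostream& operator<<(ostream& os, wchar_t wc) {
        if (static_cast<unsigned long>(wc) < 256) // or another upper bound
            return os << static_cast<unsigned char>(wc);
        // or handle the error in some other way
        throw std::range_error("wchar_t value not representable on this stream");
    }
}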
Otherwise, make a simple struct that transparently wraps a wchar_t and has a custom friend operator<<, and convert your wide characters to that before outputting them.
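Edit: To convert on the fly to the encoding of the current locale, you can use the functions from <cwchar>, like:
#include <cstdlib>  // MB_CUR_MAX
#include <cwchar>   // std::wcrtomb, std::mbstate_t
#include <ostream>
#include <string>
std::ostream& operator<<(std::ostream& os, wchar_t wc) {
    std::mbstate_t state{};
    std::string mb(MB_CUR_MAX, '\0');
    std::size_t ret = std::wcrtomb(&mb[0], wc, &state);
    if (ret == static_cast<std::size_t>(-1)) {
        os.setstate(std::ios_base::failbit); // or deal with the error in some other way
        return os;
    }
    mb.resize(ret); // keep only the bytes wcrtomb actually wrote
    return os << mb;
}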
Don't forget to set your locale to the system default:
std::locale::global(std::locale(""));
std::cout << L'ŭ';