Hopefully a simple question: cout seems to die when handling strings that end with a multibyte UTF-8 char. Am I doing something wrong? This is with GCC (MinGW) on Win7 x64.
Edit: Sorry if I wasn't clear enough. I'm not concerned about the missing glyphs or how the bytes are interpreted, merely that nothing shows at all right after the call to cout << s4 (the BAR is missing). Any further couts after the first display no text whatsoever!
#include <cstdio>
#include <iostream>
#include <string>
int main() {
    std::string s1("abc");
    std::string s2("…"); // … = 0xE2 80 A6
    std::string s3("…abc");
    std::string s4("abc…");

    // In C
    fwrite(s1.c_str(), s1.size(), 1, stdout);
    printf(" FOO ");
    fwrite(s2.c_str(), s2.size(), 1, stdout);
    printf(" BAR ");
    fwrite(s3.c_str(), s3.size(), 1, stdout);
    printf(" FOO ");
    fwrite(s4.c_str(), s4.size(), 1, stdout);
    printf(" BAR\n\n");

    // C++
    std::cout << s1 << " FOO " << s2 << " BAR " << s3 << " FOO " << s4 << " BAR ";
}
// results:
// abc FOO ��� BAR ���abc FOO abc… BAR
// abc FOO ��� BAR ���abc FOO abc…
If you want your program to use your current locale, call setlocale(LC_ALL, "") as the first thing in your program. Otherwise the program's locale is C, and what it will do to non-ASCII characters is not knowable by us mere humans.
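A minimal sketch of that advice, for illustration only. The helper names current_locale_name and adopt_environment_locale are made up here; the only library call involved is std::setlocale from <clocale>. Passing a null pointer queries the current locale without changing it, and passing "" asks for the locale configured in the environment.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: returns the name of the C library locale
// currently in effect, without changing it.
std::string current_locale_name() {
    const char* name = std::setlocale(LC_ALL, nullptr); // query only
    return name ? name : "";
}

// Hypothetical helper: switches to the locale configured in the user's
// environment. Returns false if the request could not be honored, in
// which case the previous locale stays in effect.
bool adopt_environment_locale() {
    return std::setlocale(LC_ALL, "") != nullptr;
}
```

In main, adopt_environment_locale() would be called before the first write to stdout or std::cout. Before that call, current_locale_name() reports "C". Whether this changes the MinGW console behavior from the question is not something this sketch can confirm.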
This is really no surprise. Unless your terminal is set to UTF-8, how does it know that s2 isn't supposed to be "(Latin small letter a with circumflex)(Euro sign)(Broken bar)", supposing that your terminal is set to Windows-1252? (In strict ISO-8859-1, 0x80 is a control character rather than the Euro sign.) See the table at http://www.ascii-code.com/
By the way, cout is not "dying" as it clearly continues to produce output after your test string.
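To see what the terminal actually receives, it can help to dump the raw bytes of a string instead of relying on how the glyphs are rendered. The following is a small sketch; hex_bytes is a hypothetical helper, and the ellipsis is spelled with explicit escapes so the result does not depend on the encoding of the source file.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats each byte of s as two uppercase hex
// digits, separated by single spaces.
std::string hex_bytes(const std::string& s) {
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        unsigned value = static_cast<unsigned char>(s[i]);
        std::snprintf(buf, sizeof buf, "%02X", value);
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

For example, hex_bytes("\xE2\x80\xA6") yields "E2 80 A6", the three bytes the question lists for s2. A terminal that is not decoding UTF-8 sees those as three separate characters.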