My code is basically this:
wstring japan = L"日本";
wstring message = L"Welcome! Japan is ";
message += japan;
wprintf(message.c_str());
I want to use wide strings, but I don't know how they should be output, so I used wprintf. When I run something like:
./widestr | hexdump
the hexadecimal byte values come out like this:
65 57 63 6c 6d 6f 21 65 4a 20 70 61 6e 61 69 20 20 73 3f 3f
e W c l m o ! e J p a n a i s ? ?
Why are they all jumbled? Even if wprintf is the wrong call, I still don't get why it would output such a specific jumbled order!
Edit: Is it endianness or something? Each pair of characters seems to be swapped. Huh.
Edit 2: I tried using wcout, but it outputs exactly the same hexadecimal bytes. Weird!
A wide character is a computer character datatype that generally has a size greater than the traditional 8-bit character. The increased datatype size allows for the use of larger coded character sets. UTF-16 is one of the most commonly used wide character encodings.
The wchar_t type is an implementation-defined wide character type. In the Microsoft compiler, it represents a 16-bit wide character used to store Unicode encoded as UTF-16LE, the native character type on Windows operating systems.
Wide characters are similar to the char datatype. The main difference is that char takes 1 byte, while a wide character takes 2 bytes (or 4 bytes, depending on the compiler and platform). A 2-byte wide character can hold 65,536 (64K) distinct values, which covers the Unicode Basic Multilingual Plane; code points beyond it need a surrogate pair in UTF-16 or a 4-byte wchar_t.
A wide character is a 2-byte or 4-byte multilingual character code. Any character in use in modern computing worldwide, including technical symbols and special publishing characters, can be represented according to the Unicode specification as one or more wide characters.
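Since the width of wchar_t is implementation-defined, it can help to check it rather than assume it. The following is a minimal sketch; the helper names wchar_bits and units_for_code_point are made up for illustration, and the second one assumes that a 2-byte wchar_t holds UTF-16, which is true on Windows but not guaranteed by the standard.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Width of wchar_t in bits on the current platform (implementation-defined).
constexpr std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// Hypothetical helper: how many wchar_t units one Unicode code point needs.
// Assumption: a 2-byte wchar_t stores UTF-16, so code points above U+FFFF
// take a surrogate pair; a 4-byte wchar_t stores any code point in one unit.
constexpr std::size_t units_for_code_point(char32_t cp) {
    return (sizeof(wchar_t) == 2 && cp > 0xFFFF) ? 2 : 1;
}
```

Both characters of 日本 (U+65E5 and U+672C) lie in the Basic Multilingual Plane, so each fits in a single wchar_t either way.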
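You need to set the locale:
#include <cstdio>
#include <cwchar>
#include <string>
#include <locale>
#include <iostream>
using namespace std;
int main()
{
    // Adopt the locale from the environment (e.g. LANG=en_US.UTF-8).
    std::locale::global(std::locale(""));
    wstring japan = L"日本";
    wstring message = L"Welcome! Japan is ";
    message += japan;
    // Pass the text as an argument, never as the format string.
    wprintf(L"%ls\n", message.c_str());
    wcout << message << endl;
}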
This works as expected: the wide string is converted to narrow UTF-8 and printed.
Setting the global locale to "" selects the system locale from the environment. If that is a UTF-8 locale, the wstring is converted and printed as UTF-8.
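To make the conversion step visible, here is a rough sketch that does by hand what the wide output functions do internally: it converts a wide string to the multibyte encoding of the current C locale with std::wcstombs. The helper name to_narrow is hypothetical, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: convert a wide string to the multibyte encoding of
// the current C locale. Returns false if some character has no
// representation in that locale.
bool to_narrow(const std::wstring& wide, std::string& out) {
    // Worst case: MB_CUR_MAX bytes per wide character, plus the terminator.
    std::vector<char> buf(wide.size() * MB_CUR_MAX + 1);
    std::size_t n = std::wcstombs(buf.data(), wide.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) {
        return false;  // conversion failed under the current locale
    }
    out.assign(buf.data(), n);
    return true;
}
```

In the default "C" locale only plain ASCII is guaranteed to convert. After std::setlocale(LC_ALL, "") or std::locale::global(std::locale("")) under a UTF-8 locale, L"日本" should come back as six bytes, three per character.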
Edit: Forget what I said about sync_with_stdio. It was not correct: the streams are synchronized by default, so it is not needed.
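As for the jumbled order in the question: that is most likely a display artifact rather than something wprintf did. By default, hexdump groups its input into two-byte words and prints each one as a number, so on a little-endian machine the two bytes of every pair appear swapped; hexdump -C prints the bytes one at a time in file order. A small sketch of the effect, with a made-up helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: reorder bytes the way they read when a dump shows
// them as little-endian 16-bit words (the two bytes of each pair swapped).
std::string as_le_word_dump(std::string bytes) {
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        std::swap(bytes[i], bytes[i + 1]);
    }
    return bytes;
}
```

Feeding it "Welcome! Japan is ??" gives "eWclmo!eJ panai  s??", the same order as in the dump above. The trailing ?? are the two Japanese characters, which could not be converted without a suitable locale.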