I know that UTF-16 comes in two byte orders: big endian and little endian.
Does the C++ standard define the endianness of std::wstring, or is it implementation-defined?
If it is defined by the standard, which page of the C++ standard provides the rules on this issue?
If it is implementation-defined, how do I determine it, e.g. under VC++? Does the compiler guarantee that the endianness of std::wstring depends strictly on the processor?
I need to know this because I want to send a UTF-16 string to others, and I must add the correct BOM at the beginning of the string to indicate its endianness.
In short: given a std::wstring, how can I reliably determine its endianness?
Endianness is MACHINE dependent, not language dependent. It is defined by the processor and by how it arranges data going in and out of memory. When dealing with wchar_t (which is wider than a single byte), the processor itself arranges the multiple bytes as it needs to on every read from or write to RAM. Code simply sees the 16-bit (or larger) word as it is represented in a processor register. So the C++ standard does not fix a byte order for std::wstring; the bytes in memory follow the host.
To determine the endianness yourself (if that is really what you want to do), you could write a KNOWN 32-bit (unsigned int) value to RAM, then read it back through a char pointer and look at which byte comes first.
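Since the in-memory byte order follows the processor, one way to avoid depending on it when sending text is to write each code unit out in a fixed order and emit the matching BOM. The following is only a rough sketch: to_utf16be_with_bom is a hypothetical helper name, it assumes the text is already held as UTF-16 code units in a std::u16string, and big endian is an arbitrary choice for the wire format.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not from any library): serialize UTF-16 code units
// as big-endian bytes, preceded by the BOM U+FEFF. The shifts operate on
// values, not on memory, so the result is the same on any host.
std::vector<std::uint8_t> to_utf16be_with_bom(const std::u16string& text) {
    std::vector<std::uint8_t> out;
    out.reserve(2 * (text.size() + 1));

    auto put = [&out](char16_t unit) {
        out.push_back(static_cast<std::uint8_t>((unit >> 8) & 0xFF)); // high byte first
        out.push_back(static_cast<std::uint8_t>(unit & 0xFF));        // then low byte
    };

    put(0xFEFF);                       // BOM: bytes FE FF mean big endian
    for (char16_t unit : text) put(unit);
    return out;
}
```

Under VC++, where wchar_t is 16 bits wide, the code units of a std::wstring ws could be copied across with std::u16string(ws.begin(), ws.end()) before calling this; on platforms where wchar_t is 32 bits that copy would not produce valid UTF-16.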
It would look something like this:
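#include <stdio.h>
int main(void) {
    unsigned int aVal = 0x11223344;
    /* Look at the byte stored at the lowest address of aVal. */
    unsigned char *myValReadBack = (unsigned char *)&aVal;
    if (*myValReadBack == 0x11) printf("Big endian\n");    /* most significant byte first */
    else printf("Little endian\n");
}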
I'm sure there are other ways, but something like the above should work. Do double-check my little versus big, though :-)
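One of those other ways, sketched here as an illustration rather than a definitive implementation, copies the bytes out with std::memcpy instead of casting pointers, and then picks the BOM bytes that match the host order. Both function names are hypothetical, and the sketch assumes the UTF-16 data will be sent as a raw dump of memory.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the least significant byte of a
// multi-byte integer is stored at the lowest address.
bool host_is_little_endian() {
    const std::uint16_t probe = 0x0102;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe); // inspect the in-memory layout
    return bytes[0] == 0x02;
}

// Hypothetical helper: the two BOM bytes to put in front of UTF-16 data
// that is written out exactly as it sits in memory on this machine.
std::array<unsigned char, 2> host_utf16_bom() {
    std::array<unsigned char, 2> bom{};
    if (host_is_little_endian()) {
        bom[0] = 0xFF; bom[1] = 0xFE; // UTF-16LE
    } else {
        bom[0] = 0xFE; bom[1] = 0xFF; // UTF-16BE
    }
    return bom;
}
```

The result is the same as storing the code unit 0xFEFF in memory and sending its bytes unchanged, which is why prepending that single character to the string before a raw dump also works.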
Further, until Windows RT, VC++ really only compiled for Intel-type processors, which have only ever had one endianness: little endian.