
wchar_t and encoding

If I want to convert a string to UTF-16, say char * xmlbuffer, do I have to convert it to wchar_t * before encoding it as UTF-16? And is the char * type required before encoding to UTF-8?

How are wchar_t and char related to UTF-8, UTF-16, UTF-32, or other transformation formats?

Thanks in advance for help!

Hunter asked May 03 '12 21:05



1 Answer

No, you don't have to change data types.

About wchar_t: the standard says that

Type wchar_t is a distinct type whose values can represent distinct codes for all members of the largest extended character set specified among the supported locales.

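Unfortunately, it does not say what encoding wchar_t is supposed to have; this is implementation-defined. In practice, wchar_t is typically 16 bits wide and holds UTF-16 code units on Windows, while it is 32 bits wide and holds UTF-32 on most Linux systems. So, for example, given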

auto s = L"foo";

you can make absolutely no assumption about what the value of the expression *s is.
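To make the implementation dependence concrete, here is a minimal sketch that inspects how a wide literal is stored on the current platform. The helper names wchar_bits, unit_count, and check_wide_literal are hypothetical and exist only for illustration; the sketch assumes the compiler accepts a \U escape for a character outside the Basic Multilingual Plane in a wide literal.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// Hypothetical helper: width of wchar_t in bits on this implementation.
std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// Hypothetical helper: number of wchar_t units stored for a wide string.
std::size_t unit_count(const std::wstring& w) {
    return w.size();
}

// U+1F600 lies outside the Basic Multilingual Plane. With a 32-bit wchar_t
// it fits in one unit; with a 16-bit wchar_t it is typically stored as a
// surrogate pair, that is, two units.
void check_wide_literal() {
    const std::size_t n = unit_count(L"\U0001F600");
    if (wchar_bits() >= 32) {
        assert(n == 1);
    } else {
        assert(n >= 1);  // usually 2, but this is up to the implementation
    }
}
```

Calling check_wide_literal() from main on two different platforms is one way to see that the same source line can yield a different number of units, which is why portable code should not assume anything about the contents of a wchar_t string.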

However, you can use a std::string as an opaque sequence of bytes that represents text in any transformation format of your choice without issue. Just don't rely on standard library string operations that treat each char as one character: size(), indexing, and substr() all work on bytes, not on code points.
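As an illustration of treating std::string as raw bytes, the following sketch stores UTF-8 in a std::string and re-encodes it as UTF-16 in a std::u16string, without going through wchar_t at all. The functions utf8_code_points and utf8_to_utf16 are hypothetical helpers written for this example, not standard library calls, and they assume the input is already valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a UTF-8 byte string by skipping
// continuation bytes (those of the form 10xxxxxx). Assumes valid UTF-8.
std::size_t utf8_code_points(const std::string& bytes) {
    std::size_t n = 0;
    for (unsigned char c : bytes) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Hypothetical helper: re-encodes valid UTF-8 as UTF-16 code units.
// No validation is performed; malformed input gives meaningless output.
std::u16string utf8_to_utf16(const std::string& bytes) {
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (c < 0x80)      { cp = c;        extra = 0; }
        else if (c < 0xE0) { cp = c & 0x1F; extra = 1; }
        else if (c < 0xF0) { cp = c & 0x0F; extra = 2; }
        else               { cp = c & 0x07; extra = 3; }
        ++i;
        for (std::size_t k = 0; k < extra && i < bytes.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above U+FFFF become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

For the two-byte UTF-8 sequence "\xC3\xA9" (the letter é), size() reports 2 while utf8_code_points reports 1, which is the kind of mismatch the warning above is about. Production code would more likely use a tested library such as ICU or iconv than a hand-written decoder like this one.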

Jon answered Sep 28 '22 07:09
