I know that there are already several questions on StackOverflow about std::string
versus std::wstring
or similar but none of them proposed a full solution.
In order to obtain a good answer I should define the requirements:
CFStringRef
, wchar_t *
, char*
as UTF-8 or other types as they are required by OS API. Remark: I don't need code-page convertion support because I expect to use only Unicode compatible functions on all operating systems supported.I would really appreciate only one proposed solution per answer, by doing this people may vote for their prefered alternative. If you have more than one alternative just add another answer.
Please indicate something that did worked for you.
Related questions:
Most C string library routines still work with UTF-8, since they only scan for terminating NUL characters.
Unicode text can be encoded in various formats: The two most important ones are UTF-8 and UTF-16. In C++ Windows code there's often a need to convert between UTF-8 and UTF-16, because Unicode-enabled Win32 APIs use UTF-16 as their native Unicode encoding.
I would strongly recommend using UTF-8 internally in your application, using regular old char*
or std::string
for data storage. For interfacing with APIs that use a different encoding (ASCII, UTF-16, etc.), I'd recommend using libiconv, which is licensed under the LGPL.
Example usage:
class TempWstring
{
public:
TempWstring(const char *str)
{
assert(sUTF8toUTF16 != (iconv_t)-1);
size_t inBytesLeft = strlen(str);
size_t outBytesLeft = 2 * (inBytesLeft + 1); // worst case
mStr = new char[outBytesLeft];
char *outBuf = mStr;
int result = iconv(sUTF8toUTF16, &str, &inBytesLeft, &outBuf, &outBytesLeft);
assert(result == 0 && inBytesLeft == 0);
}
~TempWstring()
{
delete [] mStr;
}
const wchar_t *Str() const { return (wchar_t *)mStr; }
static void Init()
{
sUTF8toUTF16 = iconv_open("UTF-16LE", "UTF-8");
assert(sUTF8toUTF16 != (iconv_t)-1);
}
static void Shutdown()
{
int err = iconv_close(sUTF8toUTF16);
assert(err == 0);
}
private:
char *mStr;
static iconv_t sUTF8toUTF16;
};
iconv_t TempWstring::sUTF8toUTF16 = (iconv_t)-1;
// At program startup:
TempWstring::Init();
// At program termination:
TempWstring::Shutdown();
// Now, to convert a UTF-8 string to a UTF-16 string, just do this:
TempWstring x("Entr\xc3\xa9""e"); // "Entrée"
const wchar_t *ws = x.Str(); // valid until x goes out of scope
// A less contrived example:
HWND hwnd = CreateWindowW(L"class name",
TempWstring("UTF-8 window title").Str(),
dwStyle, x, y, width, height, parent, menu, hInstance, lpParam);
Same as Adam Rosenfield answer (+1), but I use UTFCPP instead.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With