
C++ builder - convert UnicodeString to UTF-8 encoded string

I am trying to convert a UnicodeString to a UTF-8 encoded string in C++Builder, using the UnicodeToUtf8() function:

char * dest;
UnicodeString src;
UnicodeToUtf8(dest,256,src.w_str(),src.Length());

But when I run the program I get an access violation. What am I doing wrong?

Paramore asked Feb 01 '13 12:02



2 Answers

Assuming you are using C++Builder 2009 or later (you did not say), and are using the RTL's System::UnicodeString class (and not some third-party UnicodeString class), there is a much simpler way to handle this. C++Builder also has a System::UTF8String class (it has been available since C++Builder 6, but did not become a true RTL-implemented UTF-8 string type until C++Builder 2009). Simply assign your UnicodeString to a UTF8String and let the RTL handle the memory allocation and data conversion for you, e.g.:

UnicodeString src = ...;
UTF8String dest = src; // <-- automatic UTF16-to-UTF8 conversion
// use dest.c_str() and dest.Length() as needed...
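
For readers without the RTL at hand, here is a rough portable sketch of what a UTF-16 to UTF-8 conversion involves. It is not the RTL's implementation: the function name utf16_to_utf8 is hypothetical, it uses std::u16string and std::string instead of UnicodeString and UTF8String, and replacing unpaired surrogates with U+FFFD is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the C++Builder RTL.
// Converts UTF-16 code units to UTF-8 bytes; the std::string owns the buffer,
// so the caller never has to guess a size.
std::string utf16_to_utf8(const std::u16string& src)
{
    std::string out;
    out.reserve(src.size() * 3);  // worst case: 3 bytes per UTF-16 code unit

    for (std::size_t i = 0; i < src.size(); ++i) {
        char32_t cp = src[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate, if any.
            if (i + 1 < src.size() && src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;  // assumption: unpaired surrogate becomes U+FFFD
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;      // lone low surrogate
        }

        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

As with the UTF8String assignment, the result grows to whatever size the text needs, which is the main reason to prefer a string class over a hand-sized char buffer.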
Remy Lebeau answered Oct 05 '22 23:10



This fixes the problem in the question, but the proper way to do a UTF-16 to UTF-8 conversion is the one shown in Remy's answer.

dest is an uninitialized pointer, so it points to an arbitrary location in memory. In debug builds it probably points to 0, but in release builds it could point anywhere. By passing 256 you are telling UnicodeToUtf8 that dest is a buffer with room for 256 bytes, so the function writes to that invalid address and you get the access violation.

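Try this:

char dest[256];  // room for 256 bytes of UTF-8 (a character may need more than one byte)
UnicodeString src = L"Test this";
UnicodeToUtf8( dest, 256, src.w_str(), src.Length() );  // pass the character pointer, as in the question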

But in practice you can use the simpler overload:

char dest[256]; // room for 256 bytes of UTF-8
UnicodeString src = L"Test this";
UnicodeToUtf8( dest, src.w_str(), 256 );
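
A fixed 256-byte buffer only works if the converted text actually fits, and UTF-8 can need up to three bytes per UTF-16 code unit. As a rough sketch in portable C++ (not using the RTL), the required size can be computed up front. The names utf8_length and fits_in_buffer are hypothetical, and reserving one extra byte for a terminating null is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of UTF-8 bytes needed to encode the UTF-16
// input, not counting a terminating null.
std::size_t utf8_length(const std::u16string& src)
{
    std::size_t bytes = 0;
    for (std::size_t i = 0; i < src.size(); ++i) {
        char16_t u = src[i];
        if (u < 0x80) {
            bytes += 1;                       // ASCII
        } else if (u < 0x800) {
            bytes += 2;                       // e.g. most accented Latin letters
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < src.size() &&
                   src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
            bytes += 4;                       // surrogate pair: one 4-byte sequence
            ++i;
        } else {
            bytes += 3;                       // rest of the BMP; lone surrogates counted as U+FFFD
        }
    }
    return bytes;
}

// Hypothetical helper: true if the converted text plus a terminating null
// fits into a buffer of buffer_bytes bytes.
bool fits_in_buffer(const std::u16string& src, std::size_t buffer_bytes)
{
    return utf8_length(src) + 1 <= buffer_bytes;
}
```

For example, "Test this" needs 9 bytes and fits easily, while 100 euro signs need 300 bytes and would not fit in char dest[256].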
Gregor Brandt answered Oct 06 '22 00:10
