UnicodeString compatibility issue

I am porting an older project from C++ Builder 2009 to XE5. In the old project, the compiler option for Unicode strings was set as "_TCHAR maps to: char". This worked fine in the old project.

When porting it, I set the same compiler option in XE5. But I still get compiler errors for code like this:

std::string str = String(some_component.Text).t_str();

This gives the following errors:

[bcc32 Warning] file.cpp(89): W8111 Accessing deprecated entity 'UnicodeString::t_str() const'

[bcc32 Error] file.cpp(89): E2285 Could not find a match for 'operator string::=(wchar_t *)'

So apparently XE5 has decided that String::t_str() should give me a wchar_t* rather than a char*, even though I have set the compiler option as described above.

How do I solve this?

I am well aware that C++ Builder has moved to Unicode internally (even in the 2009 version), but this is an old project with 200k LOC. Updating it to Unicode would be a steep task with very low priority.

EDIT

I can get it to work by changing the code to

std::string str = AnsiString(some_component.Text).c_str();

But this means I have to change the code in numerous places. Is there a better way that doesn't involve rewriting code?

Lundin asked Jan 11 '23 09:01


1 Answer

When UnicodeString::t_str() was first introduced in CB2009, it returned either a char* or a wchar_t*, depending on what TCHAR mapped to. In order to return a char*, it ALTERED the internal data of the UnicodeString to make it Ansi (thus breaking the contract that UnicodeString is a Unicode string). This was a temporary measure for migration purposes, while people were still rewriting their code to support Unicode. The breakage was acceptable because the RTL had special logic to handle Ansi-encoded UnicodeString (and Unicode-encoded AnsiString) values. However, this was dangerous code.

After a few versions, when people had had adequate time to migrate, this RTL logic was removed and UnicodeString::t_str() was locked in to wchar_t* only, to match UnicodeString::c_str(). DO NOT USE t_str() anymore! That is why it is marked as deprecated now.

If you need to pass a UnicodeString to something that expects Ansi data, converting to an intermediate AnsiString is the correct and safe approach. That is just the way it is now.
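If the conversion has to appear in many places, one option is to put it behind a single helper so that each call site stays short. The sketch below is only an illustration: ToStdString is a hypothetical name, not part of the RTL, and the System.hpp include and the __BORLANDC__ guard are assumptions about the build setup. The non-Borland branch is a stand-in built on the standard wcsrtombs so the snippet also compiles elsewhere; it uses the current C locale rather than the system ANSI code page, so it is not an exact equivalent of AnsiString.

```cpp
#include <cwchar>   // std::wcsrtombs, std::mbstate_t
#include <string>
#include <vector>

#ifdef __BORLANDC__
#include <System.hpp>  // UnicodeString, AnsiString (assumed header for the RTL string types)

// Hypothetical helper: the single place that performs the
// UnicodeString -> AnsiString -> std::string conversion.
inline std::string ToStdString(const UnicodeString& text)
{
    AnsiString ansi(text);  // explicit conversion to Ansi data
    return std::string(ansi.c_str(), ansi.Length());
}

#else

// Portable stand-in (assumption: conversion via the current C locale).
inline std::string ToStdString(const wchar_t* wide)
{
    if (!wide)
        return std::string();

    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = wide;

    // First pass: ask how many bytes the narrow form needs.
    std::size_t len = std::wcsrtombs(0, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1))
        return std::string();  // some character has no narrow form in this locale

    // Second pass: convert into a buffer with room for the terminator.
    std::vector<char> buf(len + 1);
    src = wide;
    state = std::mbstate_t();
    std::wcsrtombs(&buf[0], &src, buf.size(), &state);
    return std::string(&buf[0], len);
}

#endif
```

With a helper like this, the line from the question would read std::string str = ToStdString(some_component.Text);. The call sites still have to be touched once, but the actual conversion lives in one function and no longer depends on t_str().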

Remy Lebeau answered Jan 25 '23 05:01