
C++ WCHAR manipulations

I'm developing a tiny Win32 app in C++. I studied C++ fundamentals a long time ago, so now I'm completely confused by character strings in C++. Back then there was no WCHAR or TCHAR, only char and String. After a little investigation I've decided not to use TCHAR.

My issue is very simple, I think, but I can't find a clear guide on how to manipulate strings in C++. After coding in PHP for the last few years, I expected string manipulation to be simple, and I was wrong!

Simply put, all I need is to put new data into a character string.

    WCHAR* cs = L"\0";
    swprintf( cs, "NEW DATA" );

This was my first attempt. While debugging my app I found that swprintf puts only the first 2 chars into my cs variable. I resolved my problem this way:

    WCHAR cs[1000];
    swprintf( cs, "NEW DATA" );

But in general this trick could fail, because in my case the new data is not a constant value but another variable, which could potentially be longer than 1000 chars. My code looks like this:

    WCHAR cs[1000];
    WCHAR* nd1;
    WCHAR* nd2;
    wcscpy(nd1, L"Some value");
    wcscpy(nd2, L"Another value"); // Actually these vars stores the path for user selected folder
    swprintf( cs, "The paths are %s and %s", nd1, nd2);

In this case there is a possibility that the total character count of nd1 and nd2 could be greater than 1000 chars, so critical data will be lost.

The question is: how can I copy all the data I need into a WCHAR string declared as WCHAR* wchar_var; without losing anything?

P.S. Since I'm Russian, the question may be unclear. Let me know if it is, and I'll try to explain my issue more clearly and in more detail.

Geradlus_RU asked Nov 13 '12 11:11


2 Answers

In modern Windows programming, it's OK to just ignore TCHAR and instead use wchar_t (WCHAR) and Unicode UTF-16.

(TCHAR is a model of the past, from when you wanted a single code base that could produce both ANSI/MBCS and Unicode builds by changing preprocessor switches like _UNICODE and UNICODE.)

In any case, you should use C++ and convenient string classes to simplify your code. You can use ATL::CString (which corresponds to CStringW in Unicode builds, which are the default since VS2005), or STL's std::wstring.

Using CString, you can do:

    CString str1 = L"Some value";
    CString str2 = L"Another value";
    CString cs;
    cs.Format(L"The paths are %s and %s", str1.GetString(), str2.GetString());

CString also provides proper overloads of operator+ to concatenate strings, so you don't have to calculate the total length of the resulting string, dynamically allocate a destination buffer (or check the size of an existing one), call wcscpy and wcscat, remember to release the buffer, and so on.

And you can simply pass instances of CString to Win32 APIs expecting const wchar_t* (LPCWSTR/PCWSTR) parameters, since CString offers an implicit conversion operator to const wchar_t*.
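The same idea can be sketched with std::wstring, the other class mentioned above. Unlike CString, std::wstring has no implicit conversion to const wchar_t*, so you call c_str() explicitly. In this sketch show_path is a hypothetical stand-in for a Win32 API taking LPCWSTR, and the folder names are made up for illustration:

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical stand-in for a Win32 API that expects const wchar_t* (LPCWSTR).
// It only measures the string so the sketch stays portable.
std::size_t show_path(const wchar_t* path)
{
    return std::wcslen(path);
}

std::size_t demo()
{
    std::wstring folder = L"C:\\Users\\Example"; // made-up path
    folder += L"\\Documents";                    // the string grows as needed
    return show_path(folder.c_str());            // explicit conversion via c_str()
}
```

The pointer returned by c_str() is only valid while the std::wstring is alive and unmodified, so pass it straight to the call rather than storing it.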

Mr.C64 answered Sep 24 '22 05:09


When you write through a WCHAR* like that, you are invoking undefined behavior, because the pointer does not point to writable storage large enough for the data: it is either uninitialized or points at a string literal. You need to find out how long the resulting string will be and dynamically allocate space for it. For example:

    WCHAR* cs;
    WCHAR* nd1;
    WCHAR* nd2;

    nd1 = new WCHAR[wcslen(L"Some value") + 1]; // +1 for the null terminator
    nd2 = new WCHAR[wcslen(L"Another value") + 1];

    wcscpy(nd1, L"Some value");
    wcscpy(nd2, L"Another value"); // Actually these variables store the paths of user-selected folders

    // Measure nd1 and nd2 only after they have been filled in
    size_t len = wcslen(L"The paths are  and ") + wcslen(nd1) + wcslen(nd2) + 1;
    cs = new WCHAR[len];

    // Pass the buffer size; %ls means a wide string in both MSVC and standard C
    swprintf(cs, len, L"The paths are %ls and %ls", nd1, nd2);

    delete[] nd1;
    delete[] nd2;
    delete[] cs;

But this is very ugly and error-prone. As noted, you should be using std::wstring instead, something like this:

    std::wstring cs;
    std::wstring nd1;
    std::wstring nd2;

    nd1 = L"Some value";
    nd2 = L"Another value";
    cs = std::wstring(L"The paths are ") + nd1 + L" and " + nd2;
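If you would rather keep printf-style formatting, one possible approach is to let a buffer grow until the formatted text fits and then hand the result back as a std::wstring. The following is only a sketch: format_wide is a hypothetical helper, not a library function, and both the initial size and the upper limit are arbitrary choices. It relies on vswprintf returning a negative value when the buffer is too small.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cwchar>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: printf-style formatting into a std::wstring.
// The buffer is doubled until vswprintf reports success.
std::wstring format_wide(const wchar_t* fmt, ...)
{
    const std::size_t max_chars = 1u << 20; // arbitrary safety limit
    std::vector<wchar_t> buf(64);           // arbitrary initial size

    for (;;) {
        va_list args;
        va_start(args, fmt);
        // Returns the number of characters written, or a negative value
        // if the output did not fit (or another error occurred).
        int n = std::vswprintf(buf.data(), buf.size(), fmt, args);
        va_end(args);

        if (n >= 0)
            return std::wstring(buf.data(), static_cast<std::size_t>(n));

        if (buf.size() >= max_chars)
            throw std::runtime_error("format_wide: formatting failed");

        buf.resize(buf.size() * 2); // try again with more room
    }
}
```

With this helper, a call such as format_wide(L"The paths are %ls and %ls", nd1.c_str(), nd2.c_str()) is no longer limited to a fixed 1000-character buffer, and there is nothing to delete[] afterwards.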
Dark Falcon answered Sep 21 '22 05:09