
Difference between char* and wchar_t*

Tags: c++, c, string

I am new to MFC. I am trying to build a simple MFC application and I'm getting confused in some places. For example, SetWindowText has two variants, SetWindowTextA and SetWindowTextW: one takes a char * and the other takes a wchar_t *.

What is the use of char * and wchar_t *?

bala asked Oct 23 '13 04:10


People also ask

What does wchar_t mean in C?

The wchar_t type is an implementation-defined wide character type. In the Microsoft compiler, it represents a 16-bit wide character used to store Unicode encoded as UTF-16LE, the native character type on Windows operating systems.

What is difference between Tchar and char?

TCHAR is simply an alias, selected by the preprocessor, for char in ANSI builds (i.e. _UNICODE is not defined) and for wchar_t in Unicode builds (_UNICODE is defined). There are various string types based on TCHAR, such as LPCTSTR (long pointer to a constant TCHAR string).

Should I use wchar_t?

No, you should not! The Unicode 4.0 standard (ISO 10646:2003) notes that: The width of wchar_t is compiler-specific and can be as small as 8 bits. Consequently, programs that need to be portable across any C or C++ compiler should not use wchar_t for storing Unicode text.

What is the size of wchar_t in C++?

Just as char is the type for ordinary character constants, wchar_t is the type for wide characters. It occupies 2 or 4 bytes depending on the compiler being used.


3 Answers

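char is used for the so-called ANSI family of functions (the function name typically ends with A). These functions interpret strings in the current Windows code page; they are often loosely described as using the ASCII character set, but ASCII is only the part that most code pages have in common.

wchar_t is used for the newer so-called Unicode (or Wide) family of functions (the function name typically ends with W), which use the UTF-16 encoding. It is very similar to UCS-2, but not quite the same. If a character does not fit into a single 16-bit code unit (that is, its code point is above U+FFFF), it is encoded as a surrogate pair of two 16-bit code units, and this can be very confusing.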
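To make the surrogate-pair case concrete, here is a minimal sketch of UTF-16 encoding for a single code point. The function name encode_utf16 is made up for illustration, and the code assumes its input is a valid Unicode scalar value (no range or surrogate checks).

```cpp
#include <vector>

// Encode one Unicode code point as UTF-16 code units.
// Assumption: cp is a valid scalar value (<= 0x10FFFF and not a surrogate).
std::vector<char16_t> encode_utf16(char32_t cp) {
    if (cp < 0x10000) {
        // Fits in one 16-bit code unit (the UCS-2 compatible range).
        return { static_cast<char16_t>(cp) };
    }
    const char32_t v = cp - 0x10000;  // 20 bits left to distribute
    return {
        static_cast<char16_t>(0xD800 + (v >> 10)),   // high surrogate: top 10 bits
        static_cast<char16_t>(0xDC00 + (v & 0x3FF))  // low surrogate: bottom 10 bits
    };
}
```

For example, U+0041 ('A') comes back as the single unit 0x0041, while U+1F600 comes back as the pair 0xD83D 0xDE00. A wchar_t string on Windows holding that character therefore has length 2, not 1.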

Converting one to the other is not a simple task. You will need to use something like MultiByteToWideChar, which requires you to know and supply the code page of the input ANSI string.
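As a rough, Windows-only sketch of that conversion, the usual pattern is to call MultiByteToWideChar twice: once to ask for the required length and once to convert. The helper name ansi_to_wide and the CP_ACP default (the system's current ANSI code page) are assumptions made for illustration; pass the real code page of your input if you know it.

```cpp
#ifdef _WIN32  // Windows-only; the guard keeps this file compilable elsewhere
#include <windows.h>
#include <string>

// Hypothetical helper: convert a code-page-encoded string to UTF-16.
// Returns an empty string if the input is empty or the conversion fails.
std::wstring ansi_to_wide(const std::string& in, UINT code_page = CP_ACP) {
    if (in.empty()) return std::wstring();
    const int in_len = static_cast<int>(in.size());

    // First call: a null output buffer of size 0 returns the required length.
    const int out_len =
        MultiByteToWideChar(code_page, 0, in.data(), in_len, nullptr, 0);
    if (out_len <= 0) return std::wstring();

    // Second call: perform the actual conversion into the sized buffer.
    std::wstring out(static_cast<std::size_t>(out_len), L'\0');
    if (MultiByteToWideChar(code_page, 0, in.data(), in_len,
                            &out[0], out_len) == 0) {
        return std::wstring();
    }
    return out;
}
#endif
```

With something like this in place, a char string could be passed to a W function as SetWindowTextW(hwnd, ansi_to_wide(text).c_str()), where hwnd and text are whatever window handle and string you already have.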

mvp answered Oct 31 '22 06:10


On Windows, APIs that take char * use the current code page whereas wchar_t * APIs use UTF-16. As a result, you should always use wchar_t on Windows. A recommended way to do this is to:

// Be sure to define this BEFORE including <windows.h>
#define UNICODE 1
#include <windows.h>

When UNICODE is defined, APIs like SetWindowText will be aliased to SetWindowTextW and can therefore be used safely. Without UNICODE, SetWindowText will be aliased to SetWindowTextA and therefore cannot be used without first converting to the current code page.
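The aliasing itself is plain preprocessor work. The sketch below imitates it with made-up names (DemoTitleA, DemoTitleW, DemoTitle and DEMO_TEXT are stand-ins, not real Windows APIs) so that it compiles without <windows.h>; the real headers follow the same idea for SetWindowText.

```cpp
#include <string>

// Hypothetical A/W pair standing in for SetWindowTextA / SetWindowTextW.
std::string  DemoTitleA(const char* s)    { return std::string("A:") + s; }
std::wstring DemoTitleW(const wchar_t* s) { return std::wstring(L"W:") + s; }

// One generic name, resolved at compile time depending on UNICODE.
#ifdef UNICODE
#define DemoTitle DemoTitleW
#define DEMO_TEXT(s) L##s   // turn "text" into L"text"
#else
#define DemoTitle DemoTitleA
#define DEMO_TEXT(s) s
#endif
```

A call written as DemoTitle(DEMO_TEXT("hi")) compiles to DemoTitleW(L"hi") when UNICODE is defined and to DemoTitleA("hi") otherwise, which is why the define has to come before the header that sets up the aliases.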

However, there's no good reason to use wchar_t when you are not calling Windows APIs, since its portable functionality is not useful, and its useful functionality is not portable (wchar_t is UTF-16 only on Windows, on most other platforms it is UTF-32, what a total mess.)
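If you want to see the portability problem for yourself, a small check like the following reports how wide wchar_t is on whatever compiler builds it. The function name wide_char_bits is made up for illustration.

```cpp
#include <climits>
#include <cstddef>

// Number of bits in one wchar_t on the compiler building this file.
constexpr std::size_t wide_char_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}
```

Typically this gives 16 with the Microsoft compiler and 32 with GCC or Clang on Linux, so the same wchar_t string can hold UTF-16 code units on one platform and UTF-32 code points on another.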

Dietrich Epp answered Oct 31 '22 07:10


SetWindowTextA takes a char*, which points to an ANSI string, and SetWindowTextW takes a wchar_t*, which points to a wide (Unicode) string.

SetWindowText is #defined to one of these in the Windows.h headers, depending on the type of application you are building. If you are building with UNICODE defined, your code will automatically use SetWindowTextW.

SetWindowTextA is there primarily to support legacy code that needs to be built as SBCS (single-byte character set).

Tushar Jadhav answered Oct 31 '22 07:10