I am in the process of learning C++ and came across an article on the MSDN here:
http://msdn.microsoft.com/en-us/magazine/dd861344.aspx
In the first code example the one line of code which my question relates to is the following:
VERIFY(SetWindowText(L"Direct2D Sample"));
More specifically, that L prefix. I had a little read up, and correct me if I am wrong :-), but this is to allow for Unicode strings, i.e. it marks the literal as a wide-character string. During my read up on this I came across another article on Advanced String Techniques in C here: http://www.flipcode.com/archives/Advanced_String_Techniques_in_C-Part_I_Unicode.shtml
It says there are a few options, including defining one of the following:
#define UNICODE
OR
#define _UNICODE
in C; again, point out if I am wrong, I appreciate your feedback. Further, it shows the datatype suitable for these Unicode strings being:
wchar_t
It throws into the mix a macro and a kind of hybrid datatype, the macro being:
_TEXT(t)
which simply prefixes the string literal with the L, and the hybrid data type being:
TCHAR
which it points out will allow for Unicode if the define is there and ASCII if not. Now my question is, or more of an assumption which I would like to confirm: would Microsoft use this TCHAR data type, which is more flexible, or is there any benefit to committing to using wchar_t?
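To check that I am reading that right, here is my rough sketch of how those expand (my own example, not from the article):

#include <tchar.h>   // defines TCHAR and the _TEXT()/_T() macros

// With _UNICODE defined:   TCHAR is wchar_t and _TEXT("Sample") becomes L"Sample"
// Without it:              TCHAR is char    and _TEXT("Sample") stays   "Sample"
const TCHAR* title = _TEXT("Direct2D Sample");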
Also, when I say "does Microsoft use this", I mean more specifically in, for example, the ATL and WTL libraries. Does anyone of you have a preference or some advice regarding this?
Cheers,
Andrew
Microsoft was one of the first companies to implement Unicode in their products.
Unicode is a standard encoding system that is used to represent characters from almost all languages. Every Unicode character is encoded using a unique integer code point between 0 and 0x10FFFF. A Unicode string is a sequence of zero or more code points.
The character set most commonly used in computers today is Unicode, a global standard for character encoding. Internally, Windows applications use the UTF-16 implementation of Unicode. In UTF-16, most characters are identified by two-byte codes.
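A small sketch (my own example) of what "two-byte codes" means in practice, assuming Windows' 16-bit wchar_t:

#include <stdio.h>
#include <wchar.h>

int main(void)
{
    // Most characters occupy a single 16-bit code unit in UTF-16...
    const wchar_t* latin = L"A";
    // ...but characters outside the Basic Multilingual Plane, such as
    // U+1F600, take a surrogate pair, i.e. two 16-bit code units.
    const wchar_t* emoji = L"\U0001F600";

    // On Windows, where wchar_t is 16 bits, this prints 1 and 2.
    wprintf(L"%u %u\n", (unsigned)wcslen(latin), (unsigned)wcslen(emoji));
    return 0;
}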
For all new software you should define UNICODE and use wchar_t directly. Using ANSI strings will come back to haunt you.
You should just use wchar_t and the wide versions of all the CRT functions (e.g. wcscmp instead of strcmp). The TEXT macros, TCHAR, etc. exist only for code that needs to work in both ANSI and UNICODE environments, which I feel code rarely needs to do.
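A minimal sketch of the narrow vs. wide CRT usage being described (my own example):

#include <stdio.h>
#include <string.h>
#include <wchar.h>

int main(void)
{
    // Narrow (ANSI) strings use char, "" literals and the str* functions.
    const char* narrow = "Direct2D Sample";

    // Wide strings use wchar_t, L"" literals and the wcs* functions.
    const wchar_t* wide = L"Direct2D Sample";

    if (strcmp(narrow, "Direct2D Sample") == 0 &&
        wcscmp(wide, L"Direct2D Sample") == 0)
    {
        wprintf(L"both comparisons matched\n");
    }
    return 0;
}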
When you create a new Windows application using Visual Studio, UNICODE is automatically defined and wchar_t will work like a built-in type.
Short answer: the hybrid infrastructure with the TCHAR type, the _TEXT() macro and the various _t* functions (_tcscpy comes to mind) is a throwback to the times when Microsoft had two platforms coexisting: the Windows NT line, which used Unicode strings internally, and the Windows 9x line, which used ANSI strings.
String representation here means that all the Windows APIs that expected or returned strings to your app used one or the other representation for those strings. COM added even more confusion, as it was available on both platforms -- and expected Unicode strings on both!
In those old times you were encouraged to write "portable" code: you were instructed to use the hybrid infrastructure for your strings so that you could compile for both models just by defining/undefining UNICODE and/or _UNICODE for your app.
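For illustration, a sketch (mine, not from the original article) of what that "portable" style looked like; the same source compiles as ANSI or Unicode depending on whether UNICODE/_UNICODE are defined:

#include <windows.h>
#include <tchar.h>

int main(void)
{
    // TCHAR, _T()/_TEXT() and the _tcs* functions expand to the wide forms
    // (wchar_t, L"", wcs*) when UNICODE/_UNICODE are defined, and to the
    // narrow forms (char, "", str*) when they are not.
    TCHAR title[32];
    _tcscpy_s(title, 32, _T("Direct2D Sample"));

    // Win32 calls such as MessageBox resolve to MessageBoxW or MessageBoxA
    // through the same mechanism.
    MessageBox(NULL, title, _T("Demo"), MB_OK);
    return 0;
}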
As the Windows 9x line is no longer relevant (for the vast majority of apps, anyway), you can safely ignore the ANSI world and use Unicode strings directly.
Beware, though, that Unicode has multiple representations today: as pointed out above, the convention implied by wchar_t on Windows is UTF-16 (16-bit code units; the older assumption of exactly one 16-bit word per character is UCS-2). There are other, widely used representations, such as UTF-8, where a character does not map to a single 16-bit unit.
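If you do need to move between representations (say, UTF-16 in memory and UTF-8 in files or on the wire), the Win32 conversion functions handle it. A sketch of the UTF-16 to UTF-8 direction; ToUtf8 is just my own helper name:

#include <windows.h>
#include <string>

// Convert a UTF-16 (wchar_t) string to UTF-8 using the Win32 API.
std::string ToUtf8(const std::wstring& wide)
{
    if (wide.empty()) return std::string();
    int bytes = WideCharToMultiByte(CP_UTF8, 0, wide.c_str(), (int)wide.size(),
                                    NULL, 0, NULL, NULL);
    std::string utf8(bytes, '\0');
    WideCharToMultiByte(CP_UTF8, 0, wide.c_str(), (int)wide.size(),
                        &utf8[0], bytes, NULL, NULL);
    return utf8;
}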