Why, in C++ (MSVS), are datatypes defined in all caps (and most of them the same as the built-in types)?
These are exactly the same. Why are the all-caps versions defined?
`double` and `typedef double DOUBLE`
`char` and `typedef char CHAR`
`bool` and `BOOL` (`typedef int BOOL`); here both the lowercase and all-caps names represent Boolean states, so why is `int` used in the latter?
What extra ability was gained through such additional datatypes?
The ALLCAPS typedefs started in the very first days of Windows programming (1.0 and before). Back then, for example, there was no such thing as a `bool` type. The Windows APIs and headers were defined for old-school C; C++ didn't even exist back when they were being developed.
So to help document the APIs better, typedefs and macros like `BOOL` were introduced. Even though `BOOL` and `INT` were both names for the same underlying type (`int`), this let you look at a function's type signature to see whether an argument or return value was intended as a boolean value (defined as "0 for false, any nonzero value for true") or an arbitrary integer.
As another example, consider `LPCSTR`. In 16-bit Windows, there were two kinds of pointers: `near` pointers were 16-bit pointers, and `far` pointers used both a 16-bit "segment" value and a 16-bit offset into that segment. The actual memory address was calculated in the hardware as `(segment << 4) + offset`.
There were macros or typedefs for each of these kinds of pointers. `NPSTR` was a `near` pointer to a character string, and `LPSTR` was a `far` pointer to a character string. If it was a `const` string, then a `C` would get added in: `NPCSTR` or `LPCSTR`.
You could compile your code in either "small" model (using `near` pointers by default) or "large" model (using `far` pointers by default). The various `NPxxx` and `LPxxx` "types" would explicitly specify the pointer size, but you could also omit the `L` or `N` and just use `PSTR` or `PCSTR` to declare a writable or const pointer that matched your current compilation mode.
Most Windows API functions used `far` pointers, so you would generally see `LPxxx` pointers there.
`BOOL` vs. `INT` was not the only case where two names were synonyms for the same underlying type. Consider a case where you had a pointer to a single character, not a zero-terminated string of characters. There was a name for that too. You would use `PCH` for a pointer to a character, to distinguish it from `PSTR`, which pointed to a zero-terminated string.
Even though the underlying pointer type was exactly the same, this helped document the intent of your code. Of course there were all the same variations: `PCCH` for a pointer to a constant character, `NPCH` and `LPCH` for the explicit near and far, and of course `NPCCH` and `LPCCH` for near and far pointers to a constant character. Yes, the use of `C` in these names to represent both "const" and "char" was confusing!
When Windows moved to 32 bits with a "flat" memory model, there were no more `near` or `far` pointers, just flat 32-bit pointers for everything. But all of these type names were preserved to make it possible for old code to continue compiling; they were just all collapsed into one. So `NPSTR`, `LPSTR`, plain `PSTR`, and all the other variations mentioned above became synonyms for the same pointer type (with or without a `const` modifier).
Unicode came along around that same time, and most unfortunately, UTF-8 had not been invented yet. So Unicode support in Windows took the form of 8-bit characters for ANSI and 16-bit characters (UCS-2, later UTF-16) for Unicode. Yes, at that time, people thought 16-bit characters ought to be enough for anyone. How could there possibly be more than 65,536 different characters in the world?! (Famous last words...)
You can guess what happened here. Windows applications could be compiled in either ANSI or Unicode ("wide character") mode, meaning that their default character pointers would be either 8-bit or 16-bit. You could use all of the type names above and they would match the mode your app was compiled in. Almost all Windows APIs that took string or character pointers came in both ANSI and Unicode versions, with an `A` or `W` suffix on the actual function name. For example, `SetWindowText(HWND hwnd, LPCTSTR lpString)` became two functions: `SetWindowTextA(HWND hwnd, LPCSTR lpString)` and `SetWindowTextW(HWND hwnd, LPCWSTR lpString)`. And `SetWindowText` itself became a macro defined as one or the other of those, depending on whether you compiled for ANSI or Unicode.
Back then, you might have actually wanted to write your code so that it could be compiled in either ANSI or Unicode mode. So in addition to the macro-ized function name, there was also the question of whether to use `"Howdy"` or `L"Howdy"` for your window title. The `TEXT()` macro (more commonly known as `_T()` today) fixed this. You could write:
```c
SetWindowText( hwnd, TEXT("Howdy") );
```
and it would compile to either of these depending on your compilation mode:
```c
SetWindowTextA( hwnd, "Howdy" );
SetWindowTextW( hwnd, L"Howdy" );
```
Of course, most of this is moot today. Nearly everyone compiles their Windows apps in Unicode mode. That is the native mode on all modern versions of Windows, and the `...A` versions of the API functions are shims/wrappers around the native Unicode `...W` versions. By compiling for Unicode you avoid going through all those shim calls. But you can still compile your app in ANSI (or "multi-byte character set") mode if you want, so all of these macros still exist.