I've just started learning the WinAPI. In the MSDN, the following explanation is provided for the WORD data type.
WORD
A 16-bit unsigned integer. The range is 0 through 65535 decimal.
This type is declared in WinDef.h as follows:
typedef unsigned short WORD;
Simple enough, and it matches the other resources I've been using for learning, but how can it be definitively said that it is 16 bits? The C data types page on Wikipedia specifies
short / short int / signed short / signed short int
Short signed integer type.
Capable of containing at least the [−32767, +32767] range; thus, it is at least 16 bits in size.
So the size of a short could very well be 32 bits according to the C Standard. But who decides what bit sizes are going to be used anyway? I found a practical explanation here. Specifically, the line:
...it depends on both processors (more specifically, ISA, instruction set architecture, e.g., x86 and x86-64) and compilers including programming model.
So it's the ISA then, which makes sense I suppose. This is where I get lost. Taking a look at the Windows page on Wikipedia, I see this in the side bar:
Platforms: ARM, IA-32, Itanium, x86-64, DEC Alpha, MIPS, PowerPC
I don't really know what these are, but I think they are processors, each of which would have an ISA. Maybe Windows supports these platforms because all of them are guaranteed to use 16 bits for an unsigned short? This doesn't sound quite right, but I don't really know enough about this stuff to research any further.
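(As a quick sanity check, and assuming nothing beyond a hosted C compiler, a small program like the one below prints the widths the toolchain at hand actually uses. It only answers the question for that one compiler, of course.)

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* CHAR_BIT is the number of bits per byte (8 on every Windows platform). */
    printf("short: %u bits\n", (unsigned)(sizeof(short) * CHAR_BIT));
    printf("int:   %u bits\n", (unsigned)(sizeof(int) * CHAR_BIT));
    printf("long:  %u bits\n", (unsigned)(sizeof(long) * CHAR_BIT));
    return 0;
}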
Back to my question: how is it that the Windows API can typedef unsigned short WORD; and then say WORD is a 16-bit unsigned integer when the C Standard itself does not guarantee that a short is always 16 bits?
2.2. A DWORD is a 32-bit unsigned integer (range: 0 through 4294967295 decimal). Because a DWORD is unsigned, its first bit (Most Significant Bit (MSB)) is not reserved for signing.
The following table contains the following types: character, integer, Boolean, pointer, and handle. The character, integer, and Boolean types are common to most C compilers.
Simply put, a WORD is always 16 bits.
Since a WORD is always 16 bits, but an unsigned short is not, a WORD is not always an unsigned short.
For every platform that the Windows SDK supports, the Windows header files contain #ifdef-style macros that detect the compiler and its platform, and associate the Windows SDK-defined types (WORD, DWORD, etc.) with appropriately sized platform types.
This is WHY the Windows SDK uses internally defined types such as WORD rather than language types: so that it can ensure its definitions are always correct.
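As a rough illustration only (not the contents of the real windows.h), a header can use the standard <limits.h> macros to confirm at compile time that the chosen compiler really gives unsigned short 16 bits before defining WORD in terms of it:

/* Hypothetical sketch; the actual Windows headers are organized differently. */
#include <limits.h>

#if USHRT_MAX == 0xFFFFU   /* unsigned short is exactly 16 bits wide */
typedef unsigned short WORD;
#else
#error "WORD requires a compiler whose unsigned short is 16 bits"
#endif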
The Windows SDK that ships with Microsoft toolchains is possibly lazy about this, as Microsoft C++ toolchains always use a 16-bit unsigned short.
I would not expect the windows.h that ships with Visual Studio C++ to work correctly if dropped into GCC, Clang, etc., as so many details, including the mechanism for importing DLLs using the .lib files that the Platform SDK distributes, are Microsoft-specific implementations.
A different interpretation is that:
Microsoft says a WORD is 16 bits. If "someone" wants to call a Windows API, they must pass a 16-bit value where the API defines the field as a WORD.
Microsoft also possibly says that, in order to build a valid Windows program using the Windows header files present in their Windows SDK, the user MUST choose a compiler that has a 16-bit short.
The C++ spec does not say that compilers must implement shorts as 16 bits - Microsoft says the compiler you choose to build Windows executables must.
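A caller can make that requirement explicit in their own code. As a minimal sketch, assuming a C11 compiler and the usual Windows definitions of WORD and DWORD, static assertions will simply refuse to compile if the documented widths do not hold:

#include <limits.h>   /* CHAR_BIT */
#include <windows.h>  /* WORD, DWORD */

/* Compilation fails if the toolchain does not match the documented widths. */
_Static_assert(sizeof(WORD) * CHAR_BIT == 16, "WORD must be 16 bits");
_Static_assert(sizeof(DWORD) * CHAR_BIT == 32, "DWORD must be 32 bits");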
There was originally an assumption that all code intended to run on Windows would be compiled with Microsoft's own compiler - or a fully compatible compiler. And that's the way it worked. Borland C: Matched Microsoft C. Zortech's C: Matched Microsoft C. gcc: not so much, so you didn't even try (not to mention there were no runtimes, etc.).
Over time this concept got codified and extended to other operating systems (or perhaps the other operating systems got it first) and now it is known as an ABI - Application Binary Interface - for a platform, and all compilers for that platform are assumed (in practice, required) to match the ABI. And that means matching expectations for the sizes of integral types (among other things).
An interesting related question you didn't ask is: so why is 16 bits called a word? Why is 32 bits a dword (double word) on our 32- and now 64-bit architectures, where the native machine "word" size is 32 or 64 bits, not 16? Because: 80286. The names were fixed back when Windows targeted 16-bit x86 processors, and changing them later would have broken every existing API definition, so they stayed.
In the Windows headers there are a lot of #defines that, based on the platform, can ensure a WORD is 16 bits, a DWORD is 32, and so on. In some cases in the past, I know they distributed a separate SDK for each platform. In any case, nothing magic, just a mixture of proper #defines and headers.
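For a concrete sense of why those per-platform definitions matter: the documented Microsoft definitions are typedef unsigned short WORD; and typedef unsigned long DWORD;, which are 16 and 32 bits with Microsoft's compilers (long stays 32 bits even on 64-bit Windows). On a typical 64-bit Linux/GCC target, unsigned long is 64 bits, so the same DWORD typedef would silently be the wrong width. A hedged sketch of how a header could guard against that with nothing but standard headers:

#include <limits.h>

#if ULONG_MAX == 0xFFFFFFFFUL
/* unsigned long is 32 bits here, so the MSVC-style definition is fine. */
typedef unsigned long DWORD;
#else
/* unsigned long is wider (e.g. LP64 Linux): fall back to a fixed-width type. */
#include <stdint.h>
typedef uint32_t DWORD;
#endif

Microsoft's own headers never need that fallback, because they only have to work with compilers that follow the Windows ABI.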