I'm currently looking for an answer to the above question. So far I have found people saying that the word size refers to the size of a processor register, which would suggest that on a 64-bit machine the word size is 64 bits and thus a QWORD (4 * word) is 256 bits in size.
But on the other hand I found sources like this saying the size would be 128 bits (64 bits for 32-bit and double that for 64-bit), while still others suggest the size would be 64 bits. The last one seems to be related to Microsoft, which makes matters worse by defining the size of a word as 16 bits.
Maybe someone can resolve my confusion and enlighten me on this topic.
The fundamental data types of the Intel Architecture are bytes, words, doublewords, and quadwords (see Figure 29-1). A byte is eight bits, a word is 2 bytes (16 bits), a doubleword is 4 bytes (32 bits), and a quadword is 8 bytes (64 bits).
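As a quick sanity check of the sizes quoted above, here is a minimal Python sketch. The names X86_TYPES, STRUCT_CODES, and size_in_bits are made up for illustration; only the byte counts come from the Intel text, and the struct format codes are the standard fixed-size ones.

```python
import struct

# x86 data-type names and their sizes in bytes, as quoted from the Intel manual.
# (Hypothetical table for illustration only.)
X86_TYPES = {"byte": 1, "word": 2, "doubleword": 4, "quadword": 8}

# Fixed-size struct format codes; "<" selects standard sizes, little-endian.
STRUCT_CODES = {"byte": "<B", "word": "<H", "doubleword": "<I", "quadword": "<Q"}


def size_in_bits(name: str) -> int:
    """Return the width in bits of an x86 data type, looked up by name."""
    return X86_TYPES[name] * 8


if __name__ == "__main__":
    for name, code in STRUCT_CODES.items():
        # struct.calcsize confirms the byte count of each fixed-width code.
        assert struct.calcsize(code) == X86_TYPES[name]
        print(f"{name:<10} {size_in_bits(name):>3} bits")
```

So with the x86 definition of a word as 16 bits, a quadword (4 * word) comes out at 64 bits, not 256.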
Traditionally the term "word" refers to the size of the processor's registers and main data path. By that definition a "word" would be 32 bits on your 32-bit system and 64 bits on your 64-bit system.
long, ptr, and off_t are all 64 bits (8 bytes) in size.
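The C type sizes depend on the platform's data model, not on what the ISA documentation calls a "word". The following sketch, assuming a standard CPython with ctypes available, reports what the local C ABI uses. The function name c_type_bits is hypothetical, and off_t is left out because ctypes has no type for it.

```python
import ctypes


def c_type_bits() -> dict:
    """Widths in bits of a few C types, as seen by this Python build's C ABI."""
    return {
        "int": ctypes.sizeof(ctypes.c_int) * 8,
        "long": ctypes.sizeof(ctypes.c_long) * 8,
        "long long": ctypes.sizeof(ctypes.c_longlong) * 8,
        "void *": ctypes.sizeof(ctypes.c_void_p) * 8,
    }


if __name__ == "__main__":
    for name, bits in c_type_bits().items():
        print(f"{name:<10} {bits} bits")
```

On a typical 64-bit Linux system (LP64) long and pointers should both report 64 bits, while 64-bit Windows (LLP64) keeps long at 32 bits, so the output differs by platform.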
In x86 terminology/documentation, a "word" is 16 bits because x86 evolved out of 16-bit 8086. Changing the meaning of the term as extensions were added would have just been confusing, because Intel still had to document 16-bit mode and everything, and instruction mnemonics like cwd (sign-extend word to dword) bake the terminology into the ISA.
A 128-bit operand is a "double quadword" (dqword) in SSE mnemonics, e.g. movdqa xmm0, [rdi]. There is also the cqo mnemonic, oct-word. (Sign-extend RAX into RDX:RAX, e.g. before idiv.)
And then we have fun instructions like punpcklqdq: shuffle together two qwords into a dqword, or pclmulqdq for carry-less multiplication of qwords, producing a dq full result. But beyond that, SIMD mnemonics tend to use explicit bit widths, as in AVX2 vextracti128 or AVX-512 vextractf64x4 (with optional per-element masking), which can extract the high 256 bits of a ZMM register.
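To make the word / dword / qword / oct-word naming in cwd and cqo concrete, here is a small Python model of the sign extension they perform. It is only a sketch of the arithmetic, not of the instructions' full behavior, and the helper names sign_extend, cwd, and cqo are Python functions invented for illustration.

```python
from typing import Tuple


def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the low from_bits of value to to_bits (unsigned bit pattern)."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):      # top bit set: the value is negative
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def cwd(ax: int) -> Tuple[int, int]:
    """Model of cwd: word -> dword, 16-bit AX into DX:AX. Returns (dx, ax)."""
    wide = sign_extend(ax, 16, 32)
    return wide >> 16, wide & 0xFFFF


def cqo(rax: int) -> Tuple[int, int]:
    """Model of cqo: qword -> oct-word, 64-bit RAX into RDX:RAX. Returns (rdx, rax)."""
    wide = sign_extend(rax, 64, 128)
    return wide >> 64, wide & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    dx, ax = cwd(0x8000)              # 0x8000 is -32768 as a signed word
    print(f"cwd: dx={dx:#06x} ax={ax:#06x}")
    rdx, rax = cqo(1)
    print(f"cqo: rdx={rdx:#018x} rax={rax:#018x}")
```

In both cases the high half ends up as all-ones for a negative input and all-zeros otherwise, which is what idiv expects in the upper half of its dividend.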
Not to mention stuff like "tbyte" = 10 byte x87 extended-precision float; x86 is weird and not everything is a power of 2. Also 48-bit seg:off 16:32 far pointers in Protected mode. (Basically never used, just the 32-bit offset part.)
Most other 64-bit ISAs evolved out of 32-bit ISAs (AArch64, MIPS64, PowerPC64, etc.), or were 64-bit from the start (Alpha), so "word" means 32 bits in that context.
For example, on MIPS64, daddu (doubleword add unsigned) is the 64-bit integer add.
The whole concept of "machine word" doesn't really apply to x86, with its machine-code format being a byte stream, equal support for multiple operand sizes, and unaligned loads/stores that mostly don't care about natural alignment, only about cache-line boundaries for normal cacheable memory.
Even "word oriented" RISCs can have a different natural size for registers and cache accesses than their instruction width, or what their documentation uses as a "word".
The whole concept of "word size" is over-rated in general, not just on x86. Even 64-bit RISC ISAs can load/store aligned 32-bit or 64-bit memory with equal efficiency, so pick whichever is most useful for what you're doing. Don't base your choice on figuring out which one is the machine's "word size", unless there's only one maximally efficient size (e.g. 32-bit on some 32-bit RISCs), then you can usefully call that the word size.
A "word" doesn't mean 64 bits on any 64-bit machine I've heard of. Even DEC Alpha AXP, which was designed from the ground up to be aggressively 64-bit, uses 32-bit instruction words. IIRC, the manual calls a word 32 bits.
Being able to load 64 bits into an integer register with a single instruction does not make that the "word size". Bitness and word size don't have hard, specific technical meanings; most CPUs have multiple different sizes internally (e.g. 64-byte buses between L2 and L1d cache on Intel since Haswell, along with 32-byte SIMD load/store).
So it's basically up to the CPU vendor's documentation authors to choose what "word" (and thus dword / qword) mean for their ISA.
Fun fact: SPARC64 talks about "short word" (32 bits) vs. "long word" (64 bits), rather than word / double-word. I don't know if just "word" without any qualifier has any meaning in 64-bit SPARC documentation.