While reading an Intel manual, I came across the following:
On processors that support Intel 64 architecture, the IA32_SYSENTER_ESP field and the IA32_SYSENTER_EIP field must each contain a canonical address.
What is a 'canonical address'?
Canonical addresses are the official addresses that are supplied to us by the PSMA or Australia Post. They contain the correct street name and the correct suburb, and are spelled correctly. By default, the AddressFinder widget always selects the canonical addresses when performing autocomplete and verification operations.
You're not going to see a system which needs more than that any time soon. So CPU manufacturers took a shortcut. They use an instruction set which allows a full 64-bit address space, but current CPUs only use the lower 48 bits.
"AMD64" is the name chosen by AMD for their 64-bit extension to the Intel x86 instruction set. Before release, it was called "x86-64" or "x86_64", and some distributions still use these names.
x86-64 is a 64-bit processing technology developed by AMD that debuted with the Opteron and Athlon 64 processors. x86-64 is also known as x64 and AMD64. x86-64 enables 64-bit processing advantages such as increased memory space (up to 256 TB) and processing more data per clock cycle.
In 64-bit mode, an address is considered to be in canonical form if address bits 63 through to the most-significant implemented bit by the microarchitecture are set to either all ones or all zeros. Now, the most significant implemented bit on current operating systems and architectures is bit 47. This leaves us with a 48-bit address space.
@marko: no, canonical or not only applies to virtual addresses. With a 4-level page table (see "Why in 64bit the virtual address are 4 bits short (48bit long) compared with the physical address (52 bit long)?"), there's only enough room to translate 48 bits, and canonical = correctly sign-extended to 64.
I suggest that you download the full software developer's manual. The documentation is available in separate volumes, but that link gives you all seven volumes in a single massive PDF, which makes it easier to search for things.
The answer is in section 3.3.7.1. The first line of that section states
In 64-bit mode, an address is considered to be in canonical form if address bits 63 through to the most-significant implemented bit by the microarchitecture are set to either all ones or all zeros.
It goes on from there...
You can use cpuid to query the supported virtual address width on that CPU (i.e. "implemented by the microarchitecture"). Or you can normally just assume 48-bit.
I.e. a canonical virtual address is 48 bits correctly sign-extended to 64. If the high bits don't match, it's non-canonical and will fault if you attempt to dereference it.
(Or with Intel's upcoming 5-level page table extension, 57 bits sign-extended to 64).
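The rule quoted from the manual can be sketched as a small check. This is only an illustration, not anything taken from the Intel documentation: the function name is_canonical and the virt_bits parameter are made up here, and the default of 48 is the "just assume 48-bit" case mentioned above (pass 57 for 5-level paging).

```python
MASK64 = (1 << 64) - 1  # all 64 bits set


def is_canonical(addr: int, virt_bits: int = 48) -> bool:
    """Return True if bits 63 down to virt_bits-1 of addr are all equal.

    virt_bits is the number of implemented virtual address bits
    (an assumption here: 48 for 4-level paging, 57 for 5-level paging).
    """
    if not 0 <= addr <= MASK64:
        raise ValueError("addr must fit in 64 bits")
    # Keep bit (virt_bits - 1), the most significant implemented bit,
    # together with every unimplemented bit above it.
    top = addr >> (virt_bits - 1)
    all_ones = (1 << (64 - virt_bits + 1)) - 1
    # Canonical means those bits are all zeros or all ones.
    return top == 0 or top == all_ones


if __name__ == "__main__":
    for a in (0x00007FFFFFFFFFFF, 0x0000800000000000, 0xFFFF800000000000):
        print(f"{a:#018x} canonical with 48 bits: {is_canonical(a)}")
```

Note that 0x0000800000000000 is non-canonical with 48 implemented bits but becomes canonical with 57, because bit 47 is then no longer the top implemented bit.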
This answer is less detailed than the previous ones, but IMHO easier to understand:
While 64-bit processors have 64-bit wide registers, systems generally do not implement all 64-bits for addressing (16 exabytes of theoretical physical memory).
Thus most architectures define an unimplemented region of the address space which the processor will consider invalid for use. x86-64 (...) define the most-significant valid bit of an address, which must then be sign-extended (...) to create a valid address. The result of this is that the total address space is effectively divided into two parts, an upper and a lower portion, with the addresses in-between considered invalid. (...) Valid addresses are termed canonical addresses (invalid addresses being non-canonical).
From https://www.bottomupcs.com/virtual_memory_is.xhtml
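To make the "upper and lower portion" concrete, here is a rough sketch that computes the two canonical ranges for a given address width. The helper name canonical_ranges is hypothetical, and the 48-bit width is an assumption that matches current 4-level paging.

```python
def canonical_ranges(virt_bits: int = 48):
    """Return ((lower_start, lower_end), (upper_start, upper_end)).

    Each half covers 2**(virt_bits - 1) addresses; everything between
    the two halves is the non-canonical hole.
    """
    half = 1 << (virt_bits - 1)
    lower = (0, half - 1)                          # top bits all zeros
    upper = ((1 << 64) - half, (1 << 64) - 1)      # top bits all ones
    return lower, upper


if __name__ == "__main__":
    lower, upper = canonical_ranges(48)
    print(f"lower half: {lower[0]:#018x} .. {lower[1]:#018x}")
    print(f"upper half: {upper[0]:#018x} .. {upper[1]:#018x}")
```

With 48 bits the lower half ends at 0x00007fffffffffff and the upper half starts at 0xffff800000000000; both halves together cover 2**48 bytes, i.e. 256 TiB.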
Sign-extended means that the most significant implemented bit is copied into all the upper bits of the address. Addresses in the upper part start with 11111..., addresses in the lower part start with 00000....
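Sign extension itself can be sketched in a few lines. This is a minimal illustration, assuming a 48-bit implemented width by default; the name sign_extend is made up for this example.

```python
def sign_extend(value: int, virt_bits: int = 48) -> int:
    """Sign-extend the low virt_bits of value to a 64-bit address."""
    low_mask = (1 << virt_bits) - 1
    low = value & low_mask              # keep only the implemented bits
    sign_bit = 1 << (virt_bits - 1)     # most significant implemented bit
    if low & sign_bit:
        # Copy the 1 into every bit from virt_bits up to bit 63.
        high_ones = ((1 << 64) - 1) ^ low_mask
        return low | high_ones
    # Sign bit is 0, so the upper bits stay 0.
    return low


if __name__ == "__main__":
    print(f"{sign_extend(0x7FFFFFFFFFFF):#018x}")  # bit 47 clear
    print(f"{sign_extend(0x800000000000):#018x}")  # bit 47 set
```

An address is canonical exactly when sign-extending its low bits gives the address back unchanged.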