
x86-64 canonical address?

While reading an Intel manual, I came across the following:

On processors that support Intel 64 architecture, the IA32_SYSENTER_ESP field and the IA32_SYSENTER_EIP field must each contain a canonical address.

What is a 'canonical address'?

Rouki asked Sep 15 '14 16:09




2 Answers

I suggest that you download the full Intel software developer's manual. The documentation is available in separate volumes, but the combined edition gives you all seven volumes in a single massive PDF, which makes it easier to search for things.

The answer is in section 3.3.7.1. The first line of that section states:

In 64-bit mode, an address is considered to be in canonical form if address bits 63 through to the most-significant implemented bit by the microarchitecture are set to either all ones or all zeros.

It goes on from there...

You can use cpuid to query the supported virtual address width on that CPU. (i.e. "implemented by the microarchitecture".) Or you can normally just assume 48-bit.
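As a rough illustration of the cpuid route: leaf 0x80000008 reports the address widths in EAX, with the physical width in bits 7:0 and the linear (virtual) width in bits 15:8. The sketch below only decodes an EAX value you have already obtained. The helper names are made up for this example, and actually executing cpuid needs a compiler-specific intrinsic or inline assembly on an x86 machine, which is left out here.

```c
#include <stdint.h>

/* Hypothetical helpers: decode EAX as returned by CPUID leaf 0x80000008. */

/* Bits 7:0 hold the number of physical address bits. */
static unsigned physical_address_bits(uint32_t eax)
{
    return eax & 0xFFu;
}

/* Bits 15:8 hold the number of linear (virtual) address bits. */
static unsigned linear_address_bits(uint32_t eax)
{
    return (eax >> 8) & 0xFFu;
}
```

For example, an EAX value of 0x3027 decodes to 39 physical bits and 48 linear bits.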


I.e. a canonical virtual address is 48 bits correctly sign-extended to 64. If the high bits don't match, it's non-canonical and will fault if you attempt to dereference it.

(Or, with Intel's 5-level page table extension, 57 bits sign-extended to 64.)
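To make the rule concrete, here is a minimal sketch of a canonical-address check. It assumes you pass in the implemented virtual address width yourself (48 for 4-level paging, 57 for 5-level). The function name `is_canonical` is just an illustration, not something from the manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if bits 63 down to (va_bits - 1) of addr are all zeros
 * or all ones, i.e. addr is the sign-extension of its low va_bits bits.
 * va_bits is an assumption supplied by the caller (e.g. 48 or 57).
 */
static bool is_canonical(uint64_t addr, unsigned va_bits)
{
    assert(va_bits >= 1 && va_bits <= 64);

    /* Keep only bit (va_bits - 1) and everything above it. */
    uint64_t top = addr >> (va_bits - 1);
    /* The same field with every bit set. */
    uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

    return top == 0 || top == all_ones;
}
```

With a 48-bit width, `0x00007FFFFFFFFFFF` and `0xFFFF800000000000` pass the check, while `0x0000800000000000` does not. That last address becomes canonical if the width is 57.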

user3386109 answered Sep 20 '22 10:09


This answer is less detailed than the previous ones, but IMHO easier to understand:

While 64-bit processors have 64-bit wide registers, systems generally do not implement all 64-bits for addressing (16 exabytes of theoretical physical memory).

Thus most architectures define an unimplemented region of the address space which the processor will consider invalid for use. x86-64 (...) define the most-significant valid bit of an address, which must then be sign-extended (...) to create a valid address. The result of this is that the total address space is effectively divided into two parts, an upper and a lower portion, with the addresses in-between considered invalid. (...) Valid addresses are termed canonical addresses (invalid addresses being non-canonical).

From https://www.bottomupcs.com/virtual_memory_is.xhtml
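The split into an upper and a lower portion described in the quote can be worked out directly. The following is a small sketch, with hypothetical function names, that computes the last address of the lower half and the first address of the upper half for a given implemented width (assumed to be between 1 and 64).

```c
#include <stdint.h>

/* Highest address of the lower canonical half: 2^(va_bits - 1) - 1.
 * Assumes 1 <= va_bits <= 64. */
static uint64_t lower_half_last(unsigned va_bits)
{
    return (UINT64_C(1) << (va_bits - 1)) - 1;
}

/* Lowest address of the upper canonical half: 2^64 - 2^(va_bits - 1).
 * This is the bitwise complement of lower_half_last(). */
static uint64_t upper_half_first(unsigned va_bits)
{
    return ~lower_half_last(va_bits);
}
```

For 48 bits this gives a lower half ending at `0x00007FFFFFFFFFFF` and an upper half starting at `0xFFFF800000000000`. Everything in between is non-canonical.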

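Sign-extended means that the most significant implemented bit is copied into all of the upper bits of the address. In the upper portion those bits are all ones (11111...); in the lower portion they are all zeros (00000...).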
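Here is one possible way to do that sign-extension by hand, as a sketch only. The name `sign_extend_address` and the `va_bits` parameter are assumptions for the example. It uses the common XOR-and-subtract trick so that it stays within unsigned arithmetic.

```c
#include <stdint.h>

/*
 * Build a canonical 64-bit address from the low va_bits bits of `low`
 * by copying bit (va_bits - 1) into every higher bit.
 * Assumes 1 <= va_bits <= 64; any higher bits in `low` are discarded.
 */
static uint64_t sign_extend_address(uint64_t low, unsigned va_bits)
{
    uint64_t mask = (va_bits == 64) ? UINT64_MAX
                                    : (UINT64_C(1) << va_bits) - 1;
    uint64_t sign = UINT64_C(1) << (va_bits - 1);
    uint64_t v = low & mask;

    /* If the sign bit is set, this subtracts 2^va_bits modulo 2^64,
     * which fills the upper bits with ones; otherwise v is unchanged. */
    return (v ^ sign) - sign;
}
```

With a 48-bit width, `0x800000000000` (bit 47 set) extends to `0xFFFF800000000000`, while `0x7FFFFFFFFFFF` (bit 47 clear) stays `0x00007FFFFFFFFFFF`.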

Marisha answered Sep 20 '22 10:09