Quick summary: in x86-64 mode, are far jumps as slow as in x86-32 mode?
On x86 processors, jumps fall into three types: short, near, and far.
Short and near jumps take 1-2 clock cycles, while far jumps take 50-80 clock cycles, depending on the processor. From my reading of the documentation, this is because they "go outside CS, the current code segment."
In x86-64 mode, code segments aren't used; the segment is effectively always 0..infinity. Ergo, there shouldn't be a penalty for going outside a segment.
Thus, the question: Does the number of clock cycles change for a far jump if the processor is in x86-64 mode?
Related bonus question: most *nix-like operating systems running in 32-bit protected mode explicitly set the segment sizes to 0..infinity and manage the linear-to-physical translation entirely through the page tables. Do they get a benefit from this in terms of the time for far calls (fewer clock cycles), or is the penalty really an internal CPU legacy of how segment registers have behaved since the 8086?
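For readers unfamiliar with the flat model mentioned above, here is a minimal sketch of what "segment size 0..infinity" amounts to in 32-bit protected mode (in practice, 0 to 4 GiB). The function names are made up for illustration, and the sketch assumes an ordinary (not expand-down) segment with a 20-bit raw limit and the granularity bit G.

```python
def effective_limit(raw_limit: int, granularity: bool) -> int:
    """Highest valid offset for a 20-bit raw limit (non-expand-down segment)."""
    if granularity:
        # G=1: the limit is counted in 4 KiB units; the low 12 bits read as set.
        return (raw_limit << 12) | 0xFFF
    return raw_limit


def linear_address(base: int, offset: int, raw_limit: int, granularity: bool) -> int:
    """Segment translation: check the offset against the limit, then add the base."""
    if offset > effective_limit(raw_limit, granularity):
        # A real CPU would raise a fault here instead.
        raise ValueError("offset beyond segment limit")
    return (base + offset) & 0xFFFFFFFF


# Flat model: base 0, raw limit 0xFFFFF, G=1.
print(hex(effective_limit(0xFFFFF, True)))
print(hex(linear_address(0, 0xC0100000, 0xFFFFF, True)))
```

With base 0 and the maximum limit, the linear address equals the offset and the limit check can never fail, which is why paging ends up doing all the real translation work in such a setup.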
CS is used not only for base and limit but also for permissions. The CPL is encoded there, as well as other fields such as the default operand size, the 64-bit (long mode) flag, and the conforming bit.
Far jumps can also go through a task gate, and far calls can also go through call gates. All of these have to be handled, regardless of 64-bit mode.
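To make the point about permissions concrete, here is a rough sketch of the fields the CPU has to read and validate when CS is loaded. It decodes a selector and an 8-byte code-segment descriptor according to the standard x86 layout. The function names and the two example descriptor values are chosen for illustration only; they are typical flat descriptors, not taken from any particular OS.

```python
def decode_selector(sel: int) -> dict:
    """Split a 16-bit segment selector into its three fields."""
    return {
        "index": sel >> 3,         # entry number in the GDT or LDT
        "ldt": bool(sel & 0x4),    # table indicator: False = GDT, True = LDT
        "rpl": sel & 0x3,          # requested privilege level
    }


def decode_code_descriptor(desc: int) -> dict:
    """Decode an 8-byte code-segment descriptor given as a 64-bit integer."""
    type_bits = (desc >> 40) & 0xF
    return {
        "base": ((desc >> 16) & 0xFFFFFF) | (((desc >> 56) & 0xFF) << 24),
        "limit": (desc & 0xFFFF) | (((desc >> 48) & 0xF) << 16),
        "executable": bool(type_bits & 0x8),
        "conforming": bool(type_bits & 0x4),
        "readable": bool(type_bits & 0x2),
        "dpl": (desc >> 45) & 0x3,                  # descriptor privilege level
        "present": bool((desc >> 47) & 1),
        "long_mode": bool((desc >> 53) & 1),        # L bit: 64-bit code segment
        "default_32bit": bool((desc >> 54) & 1),    # D bit: default operand size
        "granularity_4k": bool((desc >> 55) & 1),
    }


# Illustrative values: a flat 32-bit and a 64-bit ring-0 code segment.
flat32 = decode_code_descriptor(0x00CF9A000000FFFF)
code64 = decode_code_descriptor(0x00AF9A000000FFFF)
print(decode_selector(0x33))
print(flat32["default_32bit"], flat32["long_mode"])
print(code64["default_32bit"], code64["long_mode"])
```

Even with a flat base and limit, the privilege, presence, type, and mode bits still differ between descriptors, so the load cannot simply be skipped.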
To sum up, a far jump in 64-bit mode is no faster than in 32-bit mode. In fact, when 64-bit mode is enabled, system descriptors such as call gates are twice as large (16 bytes instead of 8), so a far transfer that goes through one has to read more descriptor data, which may lengthen the time of the jump.