Appel explains in "Runtime Tags Aren't Necessary" on page 8 how to distinguish integers from pointers by tagging pointers:
Some implementations use a low-order tag of 0 for integers, then integer addition can then be done with the ordinary machine add instruction, and no shifting or correction will be necessary (since 2x + 2y = 2(x + y)). This requires that pointers have a tag of 1; but pointer-fetches can be done with odd offsets to compensate.
The idea is that an aligned pointer's value is a multiple of 2 or 4, so its lowest 1 or 2 bits are always zero. Those bits can be set to a tag value that distinguishes integers from pointers.
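To make the scheme concrete, here is a minimal C sketch of the tagging described in the quote, assuming 32-bit fields and at least 2-byte alignment. The names `value_t`, `tag_int`, `untag_int`, `tag_ptr`, `is_ptr` and `fetch_field` are made up for illustration and do not come from Appel's paper.

```c
#include <assert.h>
#include <stdint.h>

/* A tagged machine word: low bit 0 = integer, low bit 1 = pointer. */
typedef uintptr_t value_t;

/* Integer n is stored as 2n, so its low bit is 0. */
static inline value_t tag_int(intptr_t n) { return (value_t)n << 1; }

/* Recover n from 2n. The value is even, so the division is exact.
   Assumes the usual two's-complement conversion to intptr_t. */
static inline intptr_t untag_int(value_t v) { return (intptr_t)v / 2; }

/* An aligned pointer has a zero low bit; adding 1 sets the tag. */
static inline value_t tag_ptr(const void *p) {
    assert(((uintptr_t)p & 1u) == 0);
    return (value_t)p + 1;
}

static inline int is_ptr(value_t v) { return (int)(v & 1u); }

/* Fetch the i-th 32-bit field through a tagged pointer. The odd
   offset (4*i - 1) cancels the tag, like [ebx-0x1] for i == 0. */
static inline int32_t fetch_field(value_t v, int i) {
    const char *base = (const char *)v;
    return *(const int32_t *)(base + (4 * i - 1));
}

/* Tagged integers add with a plain machine add: 2x + 2y = 2(x + y). */
static inline value_t add_tagged(value_t a, value_t b) { return a + b; }
```

With an array `int32_t cell[2] = {42, 99}`, `fetch_field(tag_ptr(cell), 0)` reads 42 and `untag_int(add_tagged(tag_int(3), tag_int(4)))` gives 7. A compiler would typically turn `fetch_field` with a constant index into a single load with a displacement, which is the instruction form the question asks about.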
An untagged pointer fetch without offset in Intel syntax is:
mov eax, DWORD PTR [ebx]
And the equivalent tagged pointer fetch with offset is this:
mov eax, DWORD PTR [ebx-0x1]
What is the difference in cycles for the two fetches?
The complexity of the addressing mode generally has no impact on the throughput of load instructions, but it may have an impact of 1 cycle on the latency [1].
In particular, a simple addressing mode, which is [base] or [base + offset] where offset < 2048, usually takes 4 cycles, while complex modes (anything that isn't simple) take 5 cycles. That's for loads into general-purpose registers; for vector loads you usually add 1 or 2 more cycles.
So in your case, you are using only base with a very small offset, so you should get the fastest load latency of 4 cycles.
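The rule of thumb above can be written down as a small C sketch. It only encodes the numbers stated in this answer (4 cycles for simple modes, 5 for complex ones, L1 hits, general-purpose loads on Intel). The function names are hypothetical, and treating a small negative offset such as -1 as "simple" is an assumption that follows the conclusion above rather than a documented guarantee.

```c
#include <stdbool.h>
#include <stdint.h>

/* "Simple" per the rule of thumb: [base] or [base + offset] with
   offset < 2048 and no index register. Assumption: small negative
   offsets (such as the -1 used for tagged pointers) count as simple. */
static inline bool is_simple_address(bool has_index, int32_t offset) {
    return !has_index && offset < 2048;
}

/* Approximate L1-hit load latency in cycles for a general-purpose
   load, using the 4/5 cycle figures quoted in the answer. Actual
   values depend on the microarchitecture. */
static inline int l1_load_latency_cycles(bool has_index, int32_t offset) {
    return is_simple_address(has_index, offset) ? 4 : 5;
}
```

Under these assumptions both `[ebx]` and `[ebx-0x1]` fall into the 4-cycle case, while an indexed mode or a large offset falls into the 5-cycle case.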
This applies to Intel; I'm not sure about AMD.
Details are in the Intel optimization guide, but here's the source I could find most quickly.
As Ross mentions in the comments, there is at least one more minor downside to using the offset: the instruction is one byte longer for the version with an offset (and would be 4 bytes longer if your offset is outside the range -128 to 127), which slightly increases pressure on the icache.
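The size difference comes from the displacement field in the instruction encoding. As a rough illustration, the following C sketch computes the length of `mov eax, DWORD PTR [ebx + disp]` in 32-bit mode, assuming the assembler picks the shortest encoding. The function name is made up, and the sketch covers only the `ebx` base register (other bases such as `ebp` or `esp` have extra encoding rules).

```c
#include <stdint.h>

/* Encoded length in bytes of `mov eax, DWORD PTR [ebx + disp]`
   in 32-bit mode, assuming the shortest form is chosen. */
static inline int mov_eax_ebx_disp_length(int32_t disp) {
    if (disp == 0)
        return 2;   /* 8B 03: opcode + ModRM, no displacement */
    if (disp >= -128 && disp <= 127)
        return 3;   /* 8B 43 ib: opcode + ModRM + 8-bit displacement */
    return 6;       /* 8B 83 id: opcode + ModRM + 32-bit displacement */
}
```

So `[ebx]` takes 2 bytes, `[ebx-0x1]` takes 3 bytes, and an offset outside -128 to 127 takes 6 bytes, which matches the "one byte longer" and "4 bytes longer" figures above.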
[1] It goes without saying that this is for hits in L1. If you miss L1, latency will be longer, perhaps much longer, and the extra cycle probably doesn't matter in that case (though I suppose you still pay it on average, since the miss doesn't get started until the address is calculated).