
Why might one use the xzr register instead of the literal 0 on ARMv8?

I was reading the SVE whitepaper from ARM and came across something that struck me as odd (in a non-SVE example):

mov x8, xzr

I didn't know what this xzr register was, so I looked it up and found some content from ARM stating that it was, in many contexts, synonymous with zero.

So it looks like x8 is being initialised to zero, which makes sense because it's executed just before a loop where x8 is used as the loop counter.

What I don't understand is, why wasn't the literal 0 used instead of xzr? For example:

mov x8, 0

To summarise, my question is: why might one use the xzr register instead of the literal 0 here?

OMGtechy asked Mar 14 '17 14:03



2 Answers

I think the mov x8, xzr vs mov x8, #0 comparison is something of a red herring.

As @old_timer's answer shows, there is no encoding gain to be made, and quite likely (although admittedly I haven't checked) little or no pipeline performance gain.

What xzr gives us, however - in addition to a dummy register as per @InfinitelyManic's answer - is access to a zero-valued operand without having to load and occupy a real register. This has the dual benefit of one fewer instruction and one more register free to hold 'real' data.

I think this is an important characteristic that the original 'some content from ARM' referred to in the OP neglects to point out.

That's what I mean by mov x8, xzr vs mov x8, #0 being a red herring. If we're zeroing x8 with the intention of then modifying it, then using xzr or #0 is pretty arbitrary (although I'd tend to favour #0 as the more obvious). But if we're zeroing x8 purely in order to supply a zero operand to a subsequent instruction, then we'd be better off using - where permitted - xzr instead of x8 as the operand in that instruction, and not zeroing x8 at all.
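As a rough illustration of that last point, here is a toy Python model of the two ways to get a zero to a store. It is only a sketch and not real AArch64 semantics: the `ToyCpu` class, its method names, and passing the address directly instead of through a base register are all assumptions made for the example. Register number 31 stands in for xzr, reads as zero, and ignores writes.

```python
MASK64 = (1 << 64) - 1
XZR = 31  # in this toy model, register number 31 stands in for xzr


class ToyCpu:
    """Hypothetical, heavily simplified register file plus memory."""

    def __init__(self):
        self.x = [0] * 31      # x0..x30
        self.mem = {}          # address -> stored value
        self.executed = 0      # number of instructions executed
        self.written = set()   # real registers that were overwritten

    def read(self, n):
        # xzr always reads as zero
        return 0 if n == XZR else self.x[n]

    def write(self, n, value):
        # writes to xzr are discarded
        if n != XZR:
            self.x[n] = value & MASK64
            self.written.add(n)

    def mov_imm(self, rd, imm):
        # models: mov xd, #imm
        self.write(rd, imm)
        self.executed += 1

    def store(self, rt, addr):
        # models: str xt, [addr]  (address passed directly for simplicity)
        self.mem[addr] = self.read(rt)
        self.executed += 1


def store_zero_via_scratch(cpu, addr):
    cpu.mov_imm(8, 0)      # mov x8, #0   (x8 is clobbered)
    cpu.store(8, addr)     # str x8, [addr]


def store_zero_via_xzr(cpu, addr):
    cpu.store(XZR, addr)   # str xzr, [addr]  (no real register touched)
```

In this model the scratch-register version executes two instructions and overwrites x8, while the xzr version executes one and leaves every real register alone, which is the "one fewer instruction, one more register" benefit described above.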

Jeremy answered Oct 05 '22 13:10


mov x8,xzr
mov x8,#0
mov x8,0

produces

0000000000000000 <.text>:
   0:   aa1f03e8    mov x8, xzr
   4:   d2800008    mov x8, #0x0                    // #0
   8:   d2800008    mov x8, #0x0                    // #0

No real surprise there, other than that the assembler accepted an immediate without the pound sign. It is not an instruction size issue: both forms assemble to a single 32-bit instruction (again no surprise; contrast x86, where for example xor rax,rax is cheaper than mov rax,0). Perhaps there is a pipeline performance gain (despite popular belief, instructions take more than one clock from start to finish).

Most likely it is a personal preference thing: we have this cool MIPS-like always-zero register, so let's use it just for fun.
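For anyone curious what those two 32-bit words actually are, here is a small Python sketch that pulls the fields out of them. It only recognises the two instruction classes in the listing, and the helper names `field` and `decode` are made up for this example. As far as I know, mov x8, xzr is an alias for ORR x8, xzr, xzr and mov x8, #0 is an alias for MOVZ x8, #0, which is what the field layouts in the comments assume.

```python
def field(word, hi, lo):
    """Return bits hi..lo (inclusive) of a 32-bit instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode(word):
    """Recognise only the two instruction classes seen in the listing."""
    # ORR (shifted register), 64-bit:
    #   1 01 01010 shift(2) 0 Rm(5) imm6(6) Rn(5) Rd(5)
    if field(word, 31, 24) == 0b10101010 and field(word, 21, 21) == 0:
        return "ORR", {
            "rd": field(word, 4, 0),
            "rn": field(word, 9, 5),
            "rm": field(word, 20, 16),
            "shift": field(word, 23, 22),
            "imm6": field(word, 15, 10),
        }
    # MOVZ, 64-bit:
    #   1 10 100101 hw(2) imm16(16) Rd(5)
    if field(word, 31, 23) == 0b110100101:
        return "MOVZ", {
            "rd": field(word, 4, 0),
            "hw": field(word, 22, 21),
            "imm16": field(word, 20, 5),
        }
    return "unknown", {}


if __name__ == "__main__":
    for word in (0xAA1F03E8, 0xD2800008):
        print(hex(word), decode(word))
```

The first word comes out as an ORR with rd = 8 and rn = rm = 31 (register number 31 is xzr in this encoding), the second as a MOVZ with rd = 8 and imm16 = 0. So they are two different instructions of the same size that happen to produce the same result.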

old_timer answered Oct 05 '22 13:10