Does it matter which registers you use when writing assembly?

If you're writing assembly, does it matter which registers you allocate values to? Say, you store an accumulated/intermediate value in %ebx instead of %eax, which was traditionally used for that purpose. Is that bad practice? Will it affect performance?

In other words, can you treat them equally as storage space, or should you stick to using them for specific purposes?

heapoverflow asked Jan 22 '20 22:01




4 Answers

First and foremost, you have to use registers that support the instructions you want to use. Many instructions on x86 (and on other architectures, though less so) restrict which registers they can operate on.

Take the double-width multiply and divide instructions, for example: the one-operand forms of mul and div implicitly use eax and edx, with the register pair edx:eax holding the double-width product or dividend.
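As a rough illustration of why these instructions pin down the register choice, here is a small Python model of the 32-bit one-operand `mul` and `div`. The function names `mul32` and `div32` are made up for this sketch, and it only models the register traffic, not flags or real hardware behavior.

```python
MASK32 = 0xFFFFFFFF

def mul32(eax, src):
    """Model one-operand `mul src`: EDX:EAX = EAX * src (unsigned)."""
    product = (eax & MASK32) * (src & MASK32)
    # High half lands in EDX, low half in EAX; the caller has no say in this.
    return (product >> 32) & MASK32, product & MASK32  # (edx, eax)

def div32(edx, eax, src):
    """Model one-operand `div src`: divides EDX:EAX by src.

    Returns (edx, eax) = (remainder, quotient).
    """
    src &= MASK32
    if src == 0:
        raise ZeroDivisionError("real hardware raises a divide error here")
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    quotient, remainder = divmod(dividend, src)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in EAX")
    return remainder, quotient

edx, eax = mul32(0xFFFFFFFF, 2)
print(hex(edx), hex(eax))
```

Whatever value you want multiplied has to be in eax first, and anything you were keeping in edx is overwritten, so the surrounding code has to be arranged around those two registers.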

Next, you want to use registers that are efficient, i.e. registers:

  • for which encodings are shorter (on x64, for example, some registers need an extra prefix byte). Shorter encodings make better use of cache resources, which can allow larger programs to run more efficiently.

  • that are unencumbered by the calling convention, i.e. that don't incur extra (software/calling-convention defined) save and restore overhead for their use, unless that overhead has already been paid!

  • that are the eventual destinations of the values being produced: e.g. if a value will be passed as the second parameter, compute it in the register that the calling convention assigns to the second argument. If we can place the value in the right register (as needed for passing or returning values), then we can forgo a data move (aka copy) instruction.

Erik Eidt answered Oct 21 '22 16:10


Anywhere that only your code is running, you can use whatever registers you want for whatever purposes you want. However, there are two major cases where that premise is false:

  1. You need to use the stack pointer for its intended purpose; otherwise, when things like signal handlers run, they'll clobber a part of memory that was actually important.

  2. Your system has a calling convention. Whenever you're calling other people's library functions (or syscalls for that matter), you need to put the arguments where they want them, and they'll put the return value in the standard location no matter where you want it.

    Your calling convention also lets functions destroy some registers without saving them (volatile vs. non-volatile registers). For example, FLAGS, EAX, ECX, and EDX are normally volatile in 32-bit x86 calling conventions, while the rest of the integer registers are preserved across a call to an ABI-compliant function. See "What are the calling conventions for UNIX & Linux system calls on i386 and x86-64" for system call and user-space function calling conventions.
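To make the volatile/non-volatile split concrete, here is a minimal Python sketch that reports which registers a function would have to save and restore if it modifies them. It assumes the common 32-bit x86 split described above (EAX, ECX, EDX volatile; EBX, ESI, EDI, EBP preserved); the helper name `registers_to_save` is hypothetical, and other ABIs draw the line differently.

```python
# Assumed 32-bit x86 convention; check your platform's ABI before relying on it.
CALLER_SAVED = {"eax", "ecx", "edx"}          # volatile: free to clobber
CALLEE_SAVED = {"ebx", "esi", "edi", "ebp"}   # non-volatile: must be restored

def registers_to_save(used):
    """Return the registers a function must push/pop if it modifies `used`."""
    return sorted(set(used) & CALLEE_SAVED)

# A leaf function that sticks to volatile registers pays nothing extra.
print(registers_to_save(["eax", "ecx"]))
# Reaching for ebx and esi costs a push/pop pair each.
print(registers_to_save(["eax", "ebx", "esi"]))
```

ESP is left out of both sets because it is the stack pointer rather than a general scratch register.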

Joseph Sible-Reinstate Monica answered Oct 21 '22 16:10


If you're writing assembly, does it matter which registers you allocate values to?

For 80x86, cases where the choice of register(s) can matter include:

  • complying with someone else's calling conventions (passing values in the right registers, avoiding stack use for "callee saved" by preferring "caller saved" registers)

  • using an instruction that has implied registers (MUL, DIV, MOVSQ/D/W/B, STOSQ/D/W/B, XLATB, AAA, CWD, ... - there's lots of them)

  • trying to avoid the cost of a segment register prefix when segments aren't all the same (e.g. mov [ds:bp], ... vs. mov [bx],... ).

  • avoiding address calculations that can't be encoded due to restrictions of the "MOD/RM" fields (e.g. mov [di+si], ...). This is mostly irrelevant for 32/64-bit code, where any register can be a base or (except ESP/RSP) an index.

  • avoiding REX prefixes in 64-bit code (e.g. mov ebx,1 vs. mov r8d,1)
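The REX prefix point in the list above can be checked by comparing machine-code bytes. The sketch below hard-codes the usual 64-bit-mode encodings of the two example instructions (as commonly emitted by assemblers; an assembler is free to pick another valid form) and compares their lengths. The `ENCODINGS` table and `has_rex` helper are illustrative names, not part of any tool.

```python
# Typical 64-bit-mode encodings of the two instructions from the list above.
ENCODINGS = {
    "mov ebx, 1": bytes.fromhex("bb01000000"),    # B8+rd id
    "mov r8d, 1": bytes.fromhex("41b801000000"),  # REX.B, then B8+rd id
}

def has_rex(code):
    """In 64-bit mode, a leading byte in 0x40..0x4F is a REX prefix."""
    return 0x40 <= code[0] <= 0x4F

for text, code in ENCODINGS.items():
    print(f"{text:12} {code.hex(' '):20} {len(code)} bytes, REX={has_rex(code)}")
```

The one extra byte per instruction is small, but it adds up in hot loops where code size affects instruction fetch and cache use.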

Say, you store an accumulated/intermediate value in %ebx instead of %eax, which was traditionally used for that purpose. Is that bad practice? Will it affect performance?

In general, it won't matter (it isn't bad practice and won't affect performance). However, this depends on the surrounding code (how the value is used later), and the choice may improve or reduce performance.

More specifically, optimal register allocation is difficult to achieve (an NP-complete problem) even when all registers are the same, and 80x86 (where registers aren't all the same in some cases) makes it much harder. It also ties register allocation to instruction scheduling, e.g. doing operations in a different order to minimize the mov instructions needed to get data into or out of the registers where a certain instruction needs them.

Brendan answered Oct 21 '22 17:10


Say, you store an accumulated/intermediate value in %ebx instead of %eax, which was traditionally used for that purpose. Is that bad practice? Will it affect performance?

In rare cases, it can affect performance. For example, the adc eax, imm32 instruction has a special encoding that is shorter than the encoding for other registers (see https://www.felixcloutier.com/x86/adc); assemblers typically use this shorter encoding.

However, on recent Intel processors, the shorter encoding requires more µops and has a higher latency; see "Which Intel microarchitecture introduced the ADC reg,0 single-uop special case?"
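The size difference can be seen by building the bytes by hand. This is a minimal sketch of the two `adc r32, imm32` forms from the ADC reference page (opcode `15 id` for EAX, `81 /2 id` for any register); the function name `adc_r32_imm32` and the register table are illustrative, and it only covers 32-bit registers without prefixes.

```python
import struct

# Register numbers used in the ModRM r/m field.
REG_NUM = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
           "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def adc_r32_imm32(reg, imm):
    """Encode `adc reg, imm32`, preferring the EAX-only short form."""
    imm_bytes = struct.pack("<I", imm & 0xFFFFFFFF)  # little-endian imm32
    if reg == "eax":
        return b"\x15" + imm_bytes                   # 15 id
    modrm = 0xC0 | (2 << 3) | REG_NUM[reg]           # mod=11, reg=/2, rm=reg
    return b"\x81" + bytes([modrm]) + imm_bytes      # 81 /2 id

for reg in ("eax", "ebx"):
    code = adc_r32_imm32(reg, 0x12345678)
    print(f"adc {reg}, 0x12345678 -> {code.hex(' ')} ({len(code)} bytes)")
```

The sketch deliberately uses an immediate that does not fit in a signed byte. For small immediates, assemblers normally pick the sign-extended 8-bit form (`83 /2 ib`), which is available for every register and is shorter than either encoding shown here.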

Andreas Abel answered Oct 21 '22 17:10