
Why does eax contain the number of vector parameters?

Why does al contain the number of vector parameters in assembly?

Why are vector parameters any different from normal parameters for the callee?

Riolku asked Mar 05 '23 18:03


1 Answer

The value is used for optimization, as stated in the ABI document:

The prologue should use %al to avoid unnecessarily saving XMM registers. This is especially important for integer only programs to prevent the initialization of the XMM unit.

3.5.7 Variable Argument Lists - The Register Save Area. System V Application Binary Interface version 1.0

A function that uses va_start saves all the arguments passed in registers to the register save area in its prologue:

To start, any function that is known to use va_start is required to, at the start of the function, save all registers that may have been used to pass arguments onto the stack, into the “register save area”, for future access by va_start and va_arg. This is an obvious step, and I believe pretty standard on any platform with a register calling convention. The registers are saved as integer registers followed by floating point registers...

https://blog.nelhage.com/2010/10/amd64-and-va_arg/
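As a minimal sketch of the kind of function this applies to, here are two variadic functions in C. The names sum_doubles and sum_longs are hypothetical and chosen only for illustration. On x86-64 System V, a call to sum_doubles passes its first eight double arguments in xmm0 to xmm7, so the prologue may need to save those registers, while a call to sum_longs uses no vector registers at all.

```c
#include <stdarg.h>

/* Hypothetical example: sums `count` doubles passed as variadic arguments.
 * va_arg reads them back from the register save area first, then from the
 * stack once the register slots are exhausted. */
double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, double);
    va_end(ap);
    return total;
}

/* Hypothetical example: same shape, but integer arguments only.
 * A caller of this function passes no arguments in vector registers. */
long sum_longs(int count, ...)
{
    va_list ap;
    long total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, long);
    va_end(ap);
    return total;
}
```

Compiling a call such as sum_doubles(2, 1.0, 2.5) and inspecting the assembly should show the caller loading al before the call instruction; the exact instructions depend on the compiler and its flags.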

But saving all 8 vector registers could be slow, so the compiler may choose to optimize the save using the value passed in al:

... As an optimization, during a function call, %rax is required to hold the number of SSE registers used to hold arguments, to allow a varargs caller to avoid touching the FPU at all if there are no floating point arguments.

https://blog.nelhage.com/2010/10/amd64-and-va_arg/

Since the callee only needs to save at least the registers actually used, the value may be larger than the real number of used registers. That's why the ABI includes this line:

The contents of %al do not need to match exactly the number of registers, but must be an upper bound on the number of vector registers used and is in the range 0–8 inclusive.
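The upper-bound rule can be sketched as a small model in C. This is a simplification under stated assumptions, not how any compiler implements it: the helper names and the arg_kinds string encoding are hypothetical, 'd' stands for a double argument, every other character stands for an integer or pointer argument, and structs, long double and wider vector types are ignored.

```c
#include <stddef.h>

/* Hypothetical model: smallest value of al that is valid for a call whose
 * arguments are described by arg_kinds ('d' = double, anything else =
 * integer or pointer). Only the first 8 doubles travel in xmm registers;
 * the rest go on the stack and do not count. */
int min_al_for_call(const char *arg_kinds)
{
    int vector_regs = 0;

    for (size_t i = 0; arg_kinds[i] != '\0'; i++)
        if (arg_kinds[i] == 'd' && vector_regs < 8)
            vector_regs++;
    return vector_regs;
}

/* al does not have to be exact: any upper bound in the range 0..8 is fine. */
int al_is_acceptable(int al, const char *arg_kinds)
{
    return al >= min_al_for_call(arg_kinds) && al <= 8;
}
```

Under this model, a call with one integer and one double argument accepts any al from 1 to 8, which is why a caller that does not want to count can simply pass 8.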

You can see the effect in the prologue generated by ICC:

    sub       rsp, 216                                      #5.1
    mov       QWORD PTR [8+rsp], rsi                        #5.1
    mov       QWORD PTR [16+rsp], rdx                       #5.1
    mov       QWORD PTR [24+rsp], rcx                       #5.1
    mov       QWORD PTR [32+rsp], r8                        #5.1
    mov       QWORD PTR [40+rsp], r9                        #5.1
    movzx     r11d, al                                      #5.1
    lea       rax, QWORD PTR [r11*4]                        #5.1
    lea       r11, QWORD PTR ..___tag_value_varstrings(int, ...).6[rip] #5.1
    sub       r11, rax                                      #5.1
    lea       rax, QWORD PTR [175+rsp]                      #5.1
    jmp       r11                                           #5.1
    movaps    XMMWORD PTR [-15+rax], xmm7                   #5.1
    movaps    XMMWORD PTR [-31+rax], xmm6                   #5.1
    movaps    XMMWORD PTR [-47+rax], xmm5                   #5.1
    movaps    XMMWORD PTR [-63+rax], xmm4                   #5.1
    movaps    XMMWORD PTR [-79+rax], xmm3                   #5.1
    movaps    XMMWORD PTR [-95+rax], xmm2                   #5.1
    movaps    XMMWORD PTR [-111+rax], xmm1                  #5.1
    movaps    XMMWORD PTR [-127+rax], xmm0                  #5.1
..___tag_value_varstrings(int, ...).6: 

It's essentially a Duff's device. The r11 register is loaded with the address of the label just after the XMM-saving instructions, and then al*4 is subtracted from it (since each movaps XMMWORD PTR [rax-X], xmmN is 4 bytes long). The jump therefore lands on the first movaps that needs to run, and execution falls through the remaining ones, storing xmm(al-1) down to xmm0.
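The arithmetic behind that jump can be modelled in C. This is only a sketch of the listing above, assuming al is in the ABI's range 0 to 8; the function names and constants are hypothetical, and the slot offsets are read off the listing (rax is rsp+175, so xmm0 lands at rsp+48).

```c
#include <stddef.h>

enum {
    MOVAPS_LEN = 4,   /* bytes per movaps [rax+disp8], xmmN in the listing */
    XMM_COUNT  = 8,   /* xmm0..xmm7 can carry arguments */
    GP_AREA    = 48,  /* 6 general purpose registers * 8 bytes */
    XMM_SLOT   = 16   /* bytes per saved xmm register */
};

/* Byte offset from the first movaps at which execution starts.
 * Assumes al <= 8, as the ABI requires. */
size_t entry_offset(unsigned al)
{
    size_t end = (size_t)XMM_COUNT * MOVAPS_LEN;  /* the label after the stores */
    return end - (size_t)al * MOVAPS_LEN;
}

/* Bitmask of the registers that get stored: bit k set means xmmk is saved.
 * The switch falls through on purpose, like the chain of movaps stores. */
unsigned saved_mask(unsigned al)
{
    unsigned mask = 0;

    switch (al) {
    case 8: mask |= 1u << 7; /* fall through */
    case 7: mask |= 1u << 6; /* fall through */
    case 6: mask |= 1u << 5; /* fall through */
    case 5: mask |= 1u << 4; /* fall through */
    case 4: mask |= 1u << 3; /* fall through */
    case 3: mask |= 1u << 2; /* fall through */
    case 2: mask |= 1u << 1; /* fall through */
    case 1: mask |= 1u << 0; /* fall through */
    case 0: break;
    }
    return mask;
}

/* Offset of xmmk's slot from rsp in the listing: after the 48-byte GP area. */
size_t xmm_slot_offset(unsigned k)
{
    return GP_AREA + (size_t)k * XMM_SLOT;
}
```

With al = 3, for example, the jump skips the first five stores (20 bytes) and only xmm2, xmm1 and xmm0 are written.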

As far as I can see, other compilers either save all the vector registers or save none of them, so they ignore the exact value in al and only check whether it is zero.

The general purpose registers are always saved, probably because it's cheaper to just move the 6 registers to memory than to spend time on a condition check, address calculation and jump. As a result, there is no need for a parameter that says how many integers were passed in registers.

The following related questions cover similar ground and have more information:

  • How do vararg functions find out the number of arguments in machine code?
  • Why is %eax zeroed before a call to printf?
  • Identifying variable args function
phuclv answered Apr 02 '23 14:04