
Why can't I get the value of asm registers in C?

I'm trying to read the values of the registers rdi, rsi, rdx, rcx, and r8 into C variables, but I'm getting the wrong values. I don't know whether my code actually reads those registers or instead tells the compiler to write to them. If it's the latter, how can I achieve what I want (putting the values of the registers into C variables)?

When I compile this code (with gcc -S test.c)

#include <stdio.h>

void    beautiful_function(int a, int b, int c, int d, int e) {
    register long   rdi asm("rdi");
    register long   rsi asm("rsi");
    register long   rdx asm("rdx");
    register long   rcx asm("rcx");
    register long   r8 asm("r8");

    const long      save_rdi = rdi;
    const long      save_rsi = rsi;
    const long      save_rdx = rdx;
    const long      save_rcx = rcx;
    const long      save_r8 = r8;
    printf("%ld\n%ld\n%ld\n%ld\n%ld\n", save_rdi, save_rsi, save_rdx, save_rcx, save_r8);
}

int main(void) {
    beautiful_function(1, 2, 3, 4, 5);
}

it outputs the following assembly code (before the function call):

    movl    $1, %edi
    movl    $2, %esi
    movl    $3, %edx
    movl    $4, %ecx
    movl    $5, %r8d
    callq   _beautiful_function

When I compile and run it, it prints this:

0
0
4294967296
140732705630496
140732705630520
(some undefined values)

What did I do wrong, and how can I do this?

Fayeure asked Jun 06 '21 22:06



2 Answers

Your code didn't work because the GCC documentation section "Specifying Registers for Local Variables" explicitly tells you not to do what you did:

The only supported use for this feature is to specify registers for input and output operands when calling Extended asm (see Extended Asm).

Other than when invoking the Extended asm, the contents of the specified register are not guaranteed. For this reason, the following uses are explicitly not supported. If they appear to work, it is only happenstance, and may stop working as intended due to (seemingly) unrelated changes in surrounding code, or even minor changes in the optimization of a future version of gcc:

  • Passing parameters to or from Basic asm
  • Passing parameters to or from Extended asm without using input or output operands.
  • Passing parameters to or from routines written in assembler (or other languages) using non-standard calling conventions.
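As an illustration of the one supported use quoted above, here is a minimal sketch. It is not taken from the GCC manual: read_back_r8 and its variable names are made up for this example, and it assumes x86-64 with GCC or clang and the default AT&T syntax. The register-asm local is used only as an input operand of an Extended asm statement, so the compiler has to have the value in r8 at that point:

```c
// Hypothetical helper: returns whatever is in r8 at the asm statement.
// Because in_r8 is an input operand, the compiler must have placed
// `value` in r8 by the time the asm statement executes.
static long read_back_r8(long value) {
#if defined(__x86_64__)
    register long in_r8 __asm__("r8") = value;  // pins this operand to r8
    long seen;
    __asm__("movq %%r8, %0"   // copy r8 into the output operand
            : "=r"(seen)      // output: any general-purpose register
            : "r"(in_r8));    // input: must live in r8 at this point
    return seen;
#else
    return value;             // fallback so the sketch builds on other targets
#endif
}
```

Under those assumptions, read_back_r8(42) returns 42. Outside the asm statement nothing is promised about r8, which is exactly the limitation the quoted documentation describes.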

To put the value of registers in variables, you can use Extended asm, like this:

long rdi, rsi, rdx, rcx;
register long r8 asm("r8");
asm("" : "=D"(rdi), "=S"(rsi), "=d"(rdx), "=c"(rcx), "=r"(r8));

But note that even this might not do what you want: the compiler is within its rights to copy the function's parameters elsewhere and reuse the registers for something different before your Extended asm runs, or even to not pass the parameters at all if you never read them through the normal C variables. (And indeed, even what I posted doesn't work when optimizations are enabled.) You should strongly consider just writing your whole function in assembly instead of inline assembly inside of a C function if you want to do what you're doing.
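To illustrate that last suggestion, here is a rough sketch of a routine written entirely in assembly, which can read the argument registers directly because no compiler-generated code runs before it. It assumes x86-64, the System V calling convention, and an ELF target such as Linux (on macOS the symbol would need a leading underscore and the .type/.size directives would have to go). capture_args is a hypothetical name, and its sixth parameter, passed in r9, is my own addition so the routine has somewhere to store the values:

```c
// Prototype for the routine implemented in the asm block below.
// Hypothetical name; the sixth argument arrives in r9.
void capture_args(long a, long b, long c, long d, long e, long out[5]);

#if defined(__x86_64__) && defined(__ELF__)
__asm__(
    ".pushsection .text\n"
    ".globl capture_args\n"
    ".type capture_args, @function\n"
    "capture_args:\n"
    "    movq %rdi, 0(%r9)\n"    // a (rdi) -> out[0]
    "    movq %rsi, 8(%r9)\n"    // b (rsi) -> out[1]
    "    movq %rdx, 16(%r9)\n"   // c (rdx) -> out[2]
    "    movq %rcx, 24(%r9)\n"   // d (rcx) -> out[3]
    "    movq %r8, 32(%r9)\n"    // e (r8)  -> out[4]
    "    ret\n"
    ".size capture_args, .-capture_args\n"
    ".popsection\n"
);
#else
// Plain C fallback so the sketch still builds on other targets.
void capture_args(long a, long b, long c, long d, long e, long out[5]) {
    out[0] = a; out[1] = b; out[2] = c; out[3] = d; out[4] = e;
}
#endif
```

With a long out[5] in the caller, capture_args(1, 2, 3, 4, 5, out) fills out with 1, 2, 3, 4, 5. Since the caller only sees a prototype and cannot look inside the asm block, it has no basis for dropping the argument setup.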

Joseph Sible-Reinstate Monica answered Oct 29 '22 03:10


Even if you had a valid way of doing this (which this isn't), it probably only makes sense at the top of a function which isn't inlined. So you'd probably need __attribute__((noinline, noclone)). (noclone is a GCC attribute that clang will warn about not recognizing; it means not to make an alternate version of the function with fewer actual args, to be called in the case where some of them are known constants that can get propagated into the clone.)
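One way to apply those attributes without triggering clang's unknown-attribute warning is a small wrapper macro. This is only a sketch: KEEP_STANDALONE and add_five are names invented here, and the compiler checks assume GCC or clang (clang also defines __GNUC__, so it has to be tested first):

```c
// Hypothetical macro: request a stand-alone, non-cloned function where
// the compiler supports it.
#if defined(__clang__)
#  define KEEP_STANDALONE __attribute__((noinline))           // clang has no noclone
#elif defined(__GNUC__)
#  define KEEP_STANDALONE __attribute__((noinline, noclone))  // GCC
#else
#  define KEEP_STANDALONE                                     // unknown compiler: no-op
#endif

// Hypothetical example function; the attributes affect code generation
// only, not the result.
KEEP_STANDALONE
long add_five(long a, long b, long c, long d, long e) {
    return a + b + c + d + e;
}
```

add_five(1, 2, 3, 4, 5) still returns 15 either way. The attributes only keep the compiler from inlining the function or, on GCC, from creating a specialized clone of it.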

register ... asm local vars aren't guaranteed to do anything except when used as operands to Extended Asm statements. GCC does sometimes still read the named register if you leave it uninitialized, but clang doesn't. (And it looks like you're on a Mac, where the gcc command is actually clang, because so many build scripts use gcc instead of cc.)

So even without optimization, the stand-alone non-inlined version of your beautiful_function is just reading uninitialized stack space when it reads your rdi C variable in const long save_rdi = rdi;. (GCC does happen to do what you wanted here, even at -Os, which optimizes but chooses not to inline your function. See clang and GCC (targeting Linux) on Godbolt, with asm + program output.)


Using an asm statement to make register asm do something

(This does what you say you want, reading the registers, but because of other optimizations it still doesn't produce 1 2 3 4 5 with clang when the caller can see the definition, only with actual GCC. There might be a clang option to disable the relevant IPA / IPO optimization, but I didn't find one.)

You can use an asm volatile() statement with an empty template string to tell the compiler that the values in those registers are now the values of those C variables. (The register ... asm declarations force it to pick the right register for each variable.)

#include <stdlib.h> 
#include <stdio.h>

__attribute__((noinline,noclone))
void    beautiful_function(int a, int b, int c, int d, int e) {
    register long   rdi asm("rdi");
    register long   rsi asm("rsi");
    register long   rdx asm("rdx");
    register long   rcx asm("rcx");
    register long   r8 asm("r8");

    // "Activate" the register-asm locals: associate the current register
    // values with the C variables at this point in the function.
    // The template could be empty; the nop is only there because Godbolt filters asm comments.
    asm volatile("nop  # asm statement here"
        : "=r"(rdi), "=r"(rsi), "=r"(rdx), "=r"(rcx), "=r"(r8));

    const long      save_rdi = rdi;
    const long      save_rsi = rsi;
    const long      save_rdx = rdx;
    const long      save_rcx = rcx;
    const long      save_r8 = r8;
    printf("%ld\n%ld\n%ld\n%ld\n%ld\n", save_rdi, save_rsi, save_rdx, save_rcx, save_r8);
}

int main(void) {
    beautiful_function(1, 2, 3, 4, 5);
}

This produces asm for beautiful_function that does capture the incoming values of those registers. (The function isn't inlined, and the compiler happens not to have emitted any instructions before the asm statement that overwrite any of those registers. The latter is not guaranteed in general.)

On Godbolt with clang -O3 and gcc -O3

gcc -O3 does actually work, printing what you expect. clang still prints garbage, because the caller sees that the args are unused, and decides not to set those registers. (If you'd hidden the definition from the caller, e.g. in another file without LTO, that wouldn't happen.)

(With GCC, the noinline,noclone attributes are enough to disable this inter-procedural optimization, but not with clang, not even when compiling with -fPIC. I guess the idea is that symbol-interposition to provide an alternate definition of beautiful_function that does use its args would violate the one definition rule in C. So if clang can see a definition for a function, it assumes that's how the function works, even if it isn't allowed to actually inline it.)

With clang:

main:
        pushq   %rax          # align the stack
        # arg-passing optimized away
        callq   beautiful_function@PLT
        # indirect through the PLT because I compiled for Linux with -fPIC,
        # and the function isn't "static"
        xorl    %eax, %eax
        popq    %rcx
        retq

But the actual definition for beautiful_function does exactly what you want:

# clang -O3
beautiful_function:
        pushq   %r14
        pushq   %rbx
        nop     # asm statement here
        movq    %rdi, %r9             # copying all 5 register outputs to different regs
        movq    %rsi, %r10
        movq    %rdx, %r11
        movq    %rcx, %rbx
        movq    %r8, %r14
        leaq    .L.str(%rip), %rdi
        xorl    %eax, %eax
        movq    %r9, %rsi                # then copying them to printf args
        movq    %r10, %rdx
        movq    %r11, %rcx
        movq    %rbx, %r8
        movq    %r14, %r9
        popq    %rbx
        popq    %r14
        jmp     printf@PLT              # TAILCALL

GCC wastes fewer instructions: for example, it starts with movq %r8, %r9 to pass your r8 C var as the 6th arg to printf, then uses movq %rcx, %r8 to set up the 5th arg, overwriting one of the output registers before it has read all of them. That's something clang was over-cautious about. However, GCC does still push/pop %r12 around the asm statement; I don't understand why. It ends by tailcalling printf, so it wasn't for alignment.


Related:

  • How to specify a specific register to assign the result of a C expression in inline assembly in GCC? - the opposite problem: materialize a C variable value in a specific register at a certain point.
  • Reading a register value into a C variable - the previous canonical Q&A, which uses the now-unsupported register ... asm("regname") method like you were trying to. Or with a register-asm global variable, which hurts the efficiency of all your code because the compiler must leave that register otherwise untouched.
  • I forgot I'd answered that Q&A, making basically the same points as this. And some other points, e.g. that this doesn't work on registers like the stack pointer.

Peter Cordes answered Oct 29 '22 04:10