
gcc removes inline assembler code

It seems like gcc 4.6.2 removes code it considers unused from functions.

test.c

int main(void) {
  goto exit;
  handler:
    __asm__ __volatile__("jmp 0x0");
  exit:
  return 0;
}

Disassembly of main()

   0x08048404 <+0>:     push   ebp
   0x08048405 <+1>:     mov    ebp,esp
   0x08048407 <+3>:     nop    # <-- This is all that's left of my jmp.
   0x08048408 <+4>:     mov    eax,0x0
   0x0804840d <+9>:     pop    ebp
   0x0804840e <+10>:    ret

Compiler options

No optimizations enabled, just gcc -m32 -o test test.c (-m32 because I'm on a 64 bit machine).

How can I stop this behavior?

Edit: Preferably by using compiler options, not by modifying the code.

iblue asked Jun 13 '12 13:06


4 Answers

It looks like that's just the way it is: when gcc sees that code within a function is unreachable, it removes it. Other compilers might behave differently.
In gcc, an early phase of compilation builds the "control flow graph", a graph of "basic blocks", each free of conditions, connected by branches. When emitting the actual code, parts of the graph that are not reachable from the root are discarded.
This isn't part of the optimization phase, and is therefore unaffected by compilation options.
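
To make the basic-block picture concrete, here is a small sketch. The function classify and the block names A through D are invented for illustration, and the exact blocks GCC forms internally may differ:

```c
/* Hypothetical function, annotated with the basic blocks a compiler
   might form for it. */
int classify(int x) {
    /* Block A (entry): ends in a conditional branch. */
    if (x > 0)
        goto positive;          /* edge A -> C */

    /* Block B: reached by falling through from A. */
    return 0;

    /* Block D: nothing branches or falls through to this point, so it
       is not reachable from the entry block and can be dropped. */
    x = -1;

positive:
    /* Block C: reachable through the edge from A. */
    return 1;
}
```

Only block D has no incoming edge. That is the same situation as the code after handler: in the question, since nothing jumps to that label.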

So any solution would involve making gcc think that the code is reachable.

My suggestion:

Instead of putting your assembly code in an unreachable place (where GCC may remove it), you can put it in a reachable place, and skip over the problematic instruction:

int main(void) {
    goto exit;

exit:
    __asm__ __volatile__ (
        "jmp 1f\n"      /* skip over the next instruction */
        "jmp 0x0\n"     /* 0x0 without a $ prefix, as in the question */
        "1:\n"
    );
    return 0;
}

Also, see this thread about the issue.

ugoren answered Oct 22 '22 10:10


I do not believe there is a reliable way using just compile options to solve this. The preferable mechanism is something that will do the job and work on future versions of the compiler regardless of the options used to compile.


Commentary about Accepted Answer

In the accepted answer there is an edit to the original that suggests this solution:

int main(void) {
  __asm__ ("jmp exit");

  handler:
      __asm__ __volatile__("jmp $0x0");
  exit:
  return 0;
}

First off, jmp $0x0 should be jmp 0x0. Secondly, C labels usually get translated into local labels. jmp exit doesn't actually jump to the label exit in the C function; it jumps to the exit function in the C library, effectively bypassing the return 0 at the bottom of main. Using Godbolt with GCC 4.6.4, we get this non-optimized output (I have trimmed the labels we don't care about):

main:
        pushl   %ebp
        movl    %esp, %ebp
        jmp exit
        jmp 0x0
.L3:
        movl    $0, %eax
        popl    %ebp
        ret

.L3 is actually the local label for exit. You won't find the exit label in the generated assembly. The code may still compile and link if the C library is present, because exit then resolves to the library function. Do not use C local goto labels in inline assembly like this.


Use asm goto as the Solution

As of GCC 4.5 (OP is using 4.6.x) there is support for asm goto extended assembly templates. asm goto allows you to specify jump targets that the inline assembly may use:

6.45.2.7 Goto Labels

asm goto allows assembly code to jump to one or more C labels. The GotoLabels section in an asm goto statement contains a comma-separated list of all C labels to which the assembler code may jump. GCC assumes that asm execution falls through to the next statement (if this is not the case, consider using the __builtin_unreachable intrinsic after the asm statement). Optimization of asm goto may be improved by using the hot and cold label attributes (see Label Attributes).

An asm goto statement cannot have outputs. This is due to an internal restriction of the compiler: control transfer instructions cannot have outputs. If the assembler code does modify anything, use the "memory" clobber to force the optimizers to flush all register values to memory and reload them if necessary after the asm statement.

Also note that an asm goto statement is always implicitly considered volatile.

To reference a label in the assembler template, prefix it with ‘%l’ (lowercase ‘L’) followed by its (zero-based) position in GotoLabels plus the number of input operands. For example, if the asm has three inputs and references two labels, refer to the first label as ‘%l3’ and the second as ‘%l4’.

Alternately, you can reference labels using the actual C label name enclosed in brackets. For example, to reference a label named carry, you can use ‘%l[carry]’. The label must still be listed in the GotoLabels section when using this approach.

The code could be written this way:

int main(void) {
  __asm__ goto ("jmp %l[exit]" :::: exit);
  handler:
      __asm__ __volatile__("jmp 0x0");
  exit:
  return 0;
}

I prefer __asm__ over asm since it will not throw warnings when compiling with -ansi or -std=? options. After the clobbers, you can list the jump targets the inline assembly may use. The compiler doesn't actually know whether we jump or not, as GCC doesn't analyze the actual code in the inline assembly template. It can't remove this jump, nor can it assume that what comes after is dead code. Using Godbolt with GCC 4.6.4, the unoptimized code (trimmed) looks like:

main:
        pushl   %ebp
        movl    %esp, %ebp
        jmp .L2                   # <------ this is the goto exit
        jmp 0x0
.L2:                              # <------ exit label
        movl    $0, %eax
        popl    %ebp
        ret

With optimizations enabled, the Godbolt output with GCC 4.6.4 still looks correct and appears as:

main:
        jmp .L2                   # <------ this is the goto exit
        jmp 0x0
.L2:                              # <------ exit label
        xorl    %eax, %eax
        ret

This mechanism should also work whether you have optimizations on or off, and shouldn't matter whether you are compiling for 64-bit or 32-bit x86 targets.
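
The operand numbering rule quoted above can be sketched like this, assuming an x86 target and the default AT&T syntax. The function names is_nonzero_positional and is_nonzero_named are hypothetical. Both have one input operand (%0), so the single label is %l1 by position or %l[nonzero] by name:

```c
/* Assumes x86 (32-bit or 64-bit) and GCC 4.5 or later. */

/* Label referenced by position: one input, so the first label is %l1. */
int is_nonzero_positional(int x) {
    __asm__ goto ("testl %0, %0\n\t"
                  "jnz %l1"
                  : /* no outputs allowed */
                  : "r" (x)
                  : "cc"
                  : nonzero);
    return 0;                   /* fell through: x was zero */
nonzero:
    return 1;                   /* the asm jumped here */
}

/* Same thing, with the label referenced by its C name. */
int is_nonzero_named(int x) {
    __asm__ goto ("testl %0, %0\n\t"
                  "jnz %l[nonzero]"
                  : /* no outputs allowed */
                  : "r" (x)
                  : "cc"
                  : nonzero);
    return 0;
nonzero:
    return 1;
}
```

In both functions GCC has to keep the code after nonzero:, because the label is listed as a possible jump target.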


Other Observations

  • When there are no output constraints in an extended inline assembly template the asm statement is implicitly volatile. The line

    __asm__ __volatile__("jmp 0x0");
    

    can be written as:

    __asm__ ("jmp 0x0");
    
  • asm goto statements are considered implicitly volatile. They don't require a volatile modifier either.

Michael Petch answered Oct 22 '22 12:10


Would this work? It makes it so gcc can't know it's unreachable:

int main(void)  
{ 
    volatile int y = 1;
    if (y) goto exit;
handler:
    __asm__ __volatile__("jmp 0x0");  
exit:   
    return 0; 
}

8bitwide answered Oct 22 '22 11:10


If a compiler thinks it can cheat you, just cheat back: (GCC only)

int main(void) {
    {
        /* Place this code anywhere in the same function, where
         * control flow is known to still be active (such as at the start) */
        extern volatile unsigned int some_undefined_symbol;
        __asm__ __volatile__(".pushsection .discard" : : : "memory");
        if (some_undefined_symbol) goto handler;
        __asm__ __volatile__(".popsection" : : : "memory");
    }
    goto exit;
handler:
    __asm__ __volatile__("jmp 0x0");
exit:
    return 0;
}

This solution adds no overhead from meaningless instructions, though it only works for GCC when used with AS (which is the default).

Explanation: .pushsection switches text output of the compiler to another section, in this case .discard (which is deleted during linking by default). The "memory" clobber prevents GCC from trying to move other text within the section that will be discarded. However, GCC doesn't realize (and never could, because the __asm__s are __volatile__) that anything happening between the two statements will be discarded.

As for some_undefined_symbol, that is literally just any symbol that is never defined (or is actually defined, it shouldn't matter). And since the section of code using it will be discarded during linking, it won't produce any unresolved-reference errors either.

Finally, the conditional jump to the label you want to make appear as though it was reachable does exactly that. Besides the fact that it won't appear in the output binary at all, GCC realizes that it can't know anything about some_undefined_symbol, meaning it has no choice but to assume that both of the if's branches are reachable. As far as it is concerned, control flow can continue both by reaching goto exit and by jumping to handler (even though there won't be any code that could even do this).

However, be careful when enabling garbage collection in your linker with ld --gc-sections (it's disabled by default), because it might then get the idea to get rid of the still-unused label regardless.

EDIT: Forget all that. Just do this:

int main(void) {
    __asm__ __volatile__ goto("" : : : : handler);
    goto exit;
handler:
    __asm__ __volatile__("jmp 0x0");
exit:
    return 0;
}
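
Roughly why the empty asm goto works: it emits no instructions, but because handler is listed as a possible jump target, GCC has to treat the label as reachable. A minimal, architecture-neutral sketch (the function name keep_handler and the return values are made up, and return -1 stands in for the jmp 0x0):

```c
/* Hypothetical example; needs a compiler with asm goto (GCC 4.5+). */
int keep_handler(void) {
    /* Empty template: nothing is emitted, but GCC must assume that
       control may transfer to `handler` from here. */
    __asm__ goto ("" : : : : handler);
    goto done;

handler:
    /* Stand-in for the code that would otherwise be discarded. */
    return -1;

done:
    return 0;
}
```

At run time the empty template just falls through, so keep_handler() returns 0, but the handler block should still be present in the generated assembly.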

user3296587 answered Oct 22 '22 12:10