It seems like gcc 4.6.2 removes code it considers unused from functions.
int main(void) {
goto exit;
handler:
__asm__ __volatile__("jmp 0x0");
exit:
return 0;
}
main()
0x08048404 <+0>: push ebp
0x08048405 <+1>: mov ebp,esp
0x08048407 <+3>: nop # <-- This is all that's left of my jmp.
0x08048408 <+4>: mov eax,0x0
0x0804840d <+9>: pop ebp
0x0804840e <+10>: ret
No optimizations enabled, just gcc -m32 -o test test.c (-m32 because I'm on a 64-bit machine).
How can I stop this behavior?
Edit: Preferably by using compiler options, not by modifying the code.
Looks like that's just the way it is: when gcc sees that code within a function is unreachable, it removes it. Other compilers might be different.
In gcc, an early phase of compilation builds the "control flow graph": a graph of "basic blocks", each free of conditions, connected by branches. When emitting the actual code, the parts of the graph that are not reachable from the root are discarded.
This isn't part of the optimization phase, and is therefore unaffected by compilation options.
So any solution would involve making gcc think that the code is reachable.
My suggestion:
Instead of putting your assembly code in an unreachable place (where GCC may remove it), you can put it in a reachable place, and skip over the problematic instruction:
int main(void) {
goto exit;
exit:
__asm__ __volatile__ (
"jmp 1f\n"
"jmp 0x0\n"
"1:\n"
);
return 0;
}
Also, see this thread about the issue.
I do not believe there is a reliable way using just compile options to solve this. The preferable mechanism is something that will do the job and work on future versions of the compiler regardless of the options used to compile.
In the accepted answer there is an edit to the original that suggests this solution:
int main(void) {
__asm__ ("jmp exit");
handler:
__asm__ __volatile__("jmp $0x0");
exit:
return 0;
}
First off, jmp $0x0 should be jmp 0x0. Secondly, C labels usually get translated into local labels. jmp exit doesn't actually jump to the label exit in the C function; it jumps to the exit function in the C library, effectively bypassing the return 0 at the bottom of main. Using Godbolt with GCC 4.6.4 we get this non-optimized output (I have trimmed the labels we don't care about):
main:
pushl %ebp
movl %esp, %ebp
jmp exit
jmp 0x0
.L3:
movl $0, %eax
popl %ebp
ret
.L3 is actually the local label for exit. You won't find the exit label in the generated assembly. It may compile and link if the C library is present. Do not use C local goto labels in inline assembly like this.
As of GCC 4.5 (OP is using 4.6.x) there is support for asm goto extended assembly templates. asm goto allows you to specify jump targets that the inline assembly may use:
6.45.2.7 Goto Labels
asm goto allows assembly code to jump to one or more C labels. The GotoLabels section in an asm goto statement contains a comma-separated list of all C labels to which the assembler code may jump. GCC assumes that asm execution falls through to the next statement (if this is not the case, consider using the __builtin_unreachable intrinsic after the asm statement). Optimization of asm goto may be improved by using the hot and cold label attributes (see Label Attributes).
An asm goto statement cannot have outputs. This is due to an internal restriction of the compiler: control transfer instructions cannot have outputs. If the assembler code does modify anything, use the "memory" clobber to force the optimizers to flush all register values to memory and reload them if necessary after the asm statement.
Also note that an asm goto statement is always implicitly considered volatile.
To reference a label in the assembler template, prefix it with ‘%l’ (lowercase ‘L’) followed by its (zero-based) position in GotoLabels plus the number of input operands. For example, if the asm has three inputs and references two labels, refer to the first label as ‘%l3’ and the second as ‘%l4’).
Alternately, you can reference labels using the actual C label name enclosed in brackets. For example, to reference a label named carry, you can use ‘%l[carry]’. The label must still be listed in the GotoLabels section when using this approach.
The code could be written this way:
int main(void) {
__asm__ goto ("jmp %l[exit]" :::: exit);
handler:
__asm__ __volatile__("jmp 0x0");
exit:
return 0;
}
We can use asm goto. I prefer __asm__ over asm since it will not throw warnings if compiling with -ansi or -std=? options.
After the clobbers you can list the jump targets the inline assembly may use. The compiler doesn't actually know whether we jump or not, as GCC doesn't analyze the actual code in the inline assembly template. It can't remove this jump, nor can it assume that what comes after is dead code. Using Godbolt with GCC 4.6.4 the unoptimized code (trimmed) looks like:
main:
pushl %ebp
movl %esp, %ebp
jmp .L2 # <------ this is the goto exit
jmp 0x0
.L2: # <------ exit label
movl $0, %eax
popl %ebp
ret
With optimizations enabled, the Godbolt output with GCC 4.6.4 still looks correct and appears as:
main:
jmp .L2 # <------ this is the goto exit
jmp 0x0
.L2: # <------ exit label
xorl %eax, %eax
ret
This mechanism should also work whether you have optimizations on or off, and shouldn't matter whether you are compiling for 64-bit or 32-bit x86 targets.
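The disassembly above shows that the jump survives. A small run-time check can show the same thing without reading assembly. The following is only a sketch: the function name jump_over and the variable visited are made up for illustration, and the plain goto fallback for non-x86 targets is an assumption added so the snippet builds elsewhere.

```c
/* Sketch: confirm at run time that an asm goto jump to a C label is taken.
 * jump_over and visited are hypothetical names, not from the answer above. */
int jump_over(void)
{
    /* volatile so the store below cannot be folded away by the optimizer */
    volatile int visited = 0;

#if defined(__i386__) || defined(__x86_64__)
    /* GCC cannot see inside the template, so it must assume that control
     * may either fall through or land on the listed label. */
    __asm__ goto ("jmp %l[done]" : : : : done);
#else
    /* Assumed fallback for other architectures: an ordinary C goto. */
    goto done;
#endif

    visited = 1;    /* kept in the output on x86, but skipped by the jmp */
done:
    return visited;
}
```

Calling jump_over() should return 0 in both branches of the #if, since the assignment to visited is always jumped over.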
When there are no output constraints in an extended inline assembly template the asm statement is implicitly volatile. The line
__asm__ __volatile__("jmp 0x0");
can be written as:
__asm__ ("jmp 0x0");
asm goto statements are considered implicitly volatile. They don't require a volatile modifier either.
Would this work? It makes it so gcc can't know the code is unreachable:
int main(void)
{
volatile int y = 1;
if (y) goto exit;
handler:
__asm__ __volatile__("jmp 0x0");
exit:
return 0;
}
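As a rough sketch of why this should work: y is volatile, so the compiler has to load it at run time and cannot prove that the branch is always taken, which keeps both paths in the control flow graph. The price is an extra load and a conditional branch in the output. The version below swaps the jmp for a plain assignment so that it builds on any target; the names guarded and reached are hypothetical.

```c
/* Sketch of the volatile-guard idea with a portable stand-in for the jmp.
 * guarded and reached are illustrative names only. */
int guarded(void)
{
    volatile int y = 1;   /* must be re-read at run time */
    int reached = 0;

    if (y)
        goto out;         /* always taken in practice, but not provably so */

    reached = 1;          /* stands in for the handler's jmp 0x0 */
out:
    return reached;
}
```

guarded() returns 0, because y is nonzero and the stand-in handler line is skipped at run time even though it stays in the generated code.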
If a compiler thinks it can cheat you, just cheat back: (GCC only)
int main(void) {
{
/* Place this code anywhere in the same function, where
* control flow is known to still be active (such as at the start) */
extern volatile unsigned int some_undefined_symbol;
__asm__ __volatile__(".pushsection .discard" : : : "memory");
if (some_undefined_symbol) goto handler;
__asm__ __volatile__(".popsection" : : : "memory");
}
goto exit;
handler:
__asm__ __volatile__("jmp 0x0");
exit:
return 0;
}
This solution will not add any overhead in the form of meaningless instructions, though it only works for GCC when used with AS (as is the default).
Explanation: .pushsection switches the text output of the compiler to another section, in this case .discard (which is deleted during linking by default). The "memory" clobber prevents GCC from trying to move other text into the section that will be discarded. However, GCC doesn't realize (and never could, because the __asm__s are __volatile__) that anything happening between the two statements will be discarded.
As for some_undefined_symbol, that is literally just any symbol that is never defined (or is actually defined; it shouldn't matter). And since the section of code using it will be discarded during linking, it won't produce any unresolved-reference errors either.
Finally, the conditional jump to the label you want to make appear reachable does exactly that. Besides the fact that it won't appear in the output binary at all, GCC realizes that it can't know anything about some_undefined_symbol, meaning it has no choice but to assume that both of the if's branches are reachable. As far as it is concerned, control flow can continue either by reaching goto exit or by jumping to handler (even though there won't be any code that could actually do this).
However, be careful when enabling garbage collection in your linker (ld --gc-sections; it's disabled by default), because it might then get the idea to get rid of the still unused label regardless.
EDIT: Forget all that. Just do this:
int main(void) {
__asm__ __volatile__ goto("" : : : : handler);
goto exit;
handler:
__asm__ __volatile__("jmp 0x0");
exit:
return 0;
}
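The empty asm goto emits no instructions; it only tells GCC that control might reach handler, so the block after that label has to stay. A portable sketch of the same idea, with an assignment standing in for the jmp so that it builds on non-x86 targets, could look like this (keep_handler and handler_ran are made-up names):

```c
/* Sketch: an empty asm goto that merely claims it may jump to handler.
 * keep_handler and handler_ran are hypothetical names for illustration. */
int keep_handler(void)
{
    int handler_ran = 0;

    /* No instructions are generated here, but GCC must treat handler
     * as a possible jump target and therefore keeps the block below. */
    __asm__ goto ("" : : : : handler);
    goto out;

handler:
    handler_ran = 1;      /* stands in for the original jmp 0x0 */
out:
    return handler_ran;
}
```

At run time keep_handler() returns 0, since nothing ever jumps to handler. Whether the handler block is really kept can be checked by inspecting the assembly produced with gcc -S.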