While trying to make some old code work again (https://github.com/chaos4ever/chaos/blob/master/libraries/system/system_calls.h#L387, FWIW) I discovered that some of the semantics of gcc
seem to have changed in a quite subtle but still dangerous way during the latest 10-15 years... :P
The code used to work well with older versions of gcc
, like 2.95. Anyway, here is the code:
static inline return_type system_call_service_get(const char *protocol_name, service_parameter_type *service_parameter,
tag_type *identification)
{
return_type return_value;
asm volatile("pushl %2\n"
"pushl %3\n"
"pushl %4\n"
"lcall %5, $0"
: "=a" (return_value),
"=g" (*service_parameter)
: "g" (identification),
"g" (service_parameter),
"g" (protocol_name),
"n" (SYSTEM_CALL_SERVICE_GET << 3));
return return_value;
}
The problem with the code above is that gcc
(4.7 in my case) will compile this to the following asm code (AT&T syntax):
# 392 "../system/system_calls.h" 1
pushl 68(%esp) # This pointer (%esp + 0x68) is valid when the inline asm is entered.
pushl %eax
pushl 48(%esp) # ...but this one is not (%esp + 0x48), since two dwords have now been pushed onto the stack, so %esp is not what the compiler expects it to be
lcall $456, $0
# Restoration of %esp at this point is done in the called method (i.e. lret $12)
The problem: The variables (identification
and protocol_name
) are on the stack in the calling context. So gcc
(with optimizations turned out, unsure if it matters) will just get the values from there and hand it over to the inline asm section. But since I'm pushing stuff on the stack, the offsets that gcc
calculate will be off by 8 in the third call (pushl 48(%esp)
). :)
This took me a long time to figure out, it wasn't all obvious to me at first.
The easiest way around this is of course to use the r
input constraint, to ensure that the value is in a register instead. But is there another, better way? One obvious way would of course be to rewrite the whole system call interface to not push stuff on the stack in the first place (and use registers instead, like e.g. Linux), but that's not a refactoring I feel like doing tonight...
Is there any way to tell gcc
inline asm that "the stack is volatile"? How have you guys been handling stuff like this in the past?
Update later the same evening: I did found a relevant gcc
ML thread (https://gcc.gnu.org/ml/gcc-help/2011-06/msg00206.html) but it didn't seem to help. It seems like specifying %esp
in the clobber list should make it do offsets from %ebp
instead, but it doesn't work and I suspect the -O2 -fomit-frame-pointer
has an effect here. I have both of these flags enabled.
GCC provides two forms of inline asm statements. A basic asm statement is one with no operands (see Basic Asm), while an extended asm statement (see Extended Asm) includes one or more operands.
In computer programming, an inline assembler is a feature of some compilers that allows low-level code written in assembly language to be embedded within a program, among code that otherwise has been compiled from a higher-level language such as C or Ada.
Clobbers. A comma-separated list of registers or other values changed by the AssemblerTemplate , beyond those listed as outputs. An empty list is permitted.
asm volatile ("" ::: "memory") AFAIK is the same as the previous. The volatile keyword tells the compiler that it's not allowed to move this assembly block. For example, it may be hoisted out of a loop if the compiler decides that the input values are the same in every invocation.
What works and what doesn't:
I tried omitting -fomit-frame-pointer
. No effect whatsoever. I included %esp
, esp
and sp
in the list of clobbers.
I tried omitting -fomit-frame-pointer
and -O3
. This actually produces code that works, since it relies on %ebp
rather than %esp
.
pushl 16(%ebp)
pushl 12(%ebp)
pushl 8(%ebp)
lcall $456, $0
I tried with just having -O3
and not -fomit-frame-pointer
specified in my command line. Creates bad, broken code (relies on %esp
being constant within the whole assembly block, i.e. no stack frame).
I tried with skipping -fomit-frame-pointer
and just using -O2
. Broken code, no stack frame.
I tried with just using -O1
. Broken code, no stack frame.
I tried adding cc
as clobber. No can do, doesn't make any difference whatsoever.
I tried changing the input constraints to ri
, giving the input & output code below. This of course works but is slightly less elegant than I had hoped. Then again, perfect is the enemy of good so maybe I will have to live with this for now.
Input C code:
static inline return_type system_call_service_get(const char *protocol_name, service_parameter_type *service_parameter,
tag_type *identification)
{
return_type return_value;
asm volatile("pushl %2\n"
"pushl %3\n"
"pushl %4\n"
"lcall %5, $0"
: "=a" (return_value),
"=g" (*service_parameter)
: "ri" (identification),
"ri" (service_parameter),
"ri" (protocol_name),
"n" (SYSTEM_CALL_SERVICE_GET << 3));
return return_value;
}
Output asm code. As can be seen, using registers instead which should always be safe (but maybe somewhat less performant since the compiler has to move stuff around):
#APP
# 392 "../system/system_calls.h" 1
pushl %esi
pushl %eax
pushl %ebx
lcall $456, $0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With