This is basically to perform swap for the buffers while transferring a message buffer. This statement left me puzzled (because of my unfamiliarity with the embedded assembly code in c). This is a power pc instruction
#define ASMSWAP32(dest_addr,data) __asm__ volatile ("stwbrx %0, 0, %1" : : "r" (data), "r" (dest_addr))
Besides being unsafe because of a bug, this macro is also less efficient than what the compiler will generate for you.
stwbrx
= store word byte-reversed. The x
stands for indexed.
You don't need inline asm for this in GNU C, where you can use __builtin_bswap32
and let the compiler emit this instruction for you.
void swapstore_asm(int a, int *p) {
ASMSWAP32(p, a);
}
void swapstore_c(int a, int *p) {
*p = __builtin_bswap32(a);
}
Compiled with gcc4.8.5 -O3 -mregnames
, we get identical code from both functions (Godbolt compiler explorer):
swapstore:
stwbrx %r3, 0, %r4
blr
swapstore_c:
stwbrx %r3,0,%r4
blr
But with a more complicated address (storing to p[off]
, where off
is an integer function arg), the compiler knows how to use both register inputs, while your macro forces the compiler to have the address in a single register:
void swapstore_offset(int a, int *p, int off) {
= __builtin_bswap32(a);
}
swapstore_offset:
slwi %r5,%r5,2 # *4 = sizeof(int)
stwbrx %r3,%r4,%r5 # use an indexed addressing mode, with both registers non-zero
blr
swapstore_offset_asm:
slwi %r5,%r5,2
add %r4,%r4,%r5 # extra instruction forced by using the macro
stwbrx %r3, 0, %r4
blr
BTW, if you're having trouble understanding GNU C inline asm templates, looking at the compiler's asm output can be a useful way to see what gets substituted in. See How to remove "noise" from GCC/clang assembly output? for more about reading compiler asm output.
Also note that this macro is buggy: it's missing a "memory"
clobber for the store. And yes, you still need that with asm volatile
. The compiler doesn't assume that *dest_addr
is modified unless you tell it, so it could hoist a non-volatile load of *dest_addr
ahead of this insn, or more likely to be a real problem, sink a store after it. (e.g. if you zeroed a buffer before storing to it with this, the compiler might actually zero after this instruction.)
Instead of a "memory"
clobber (and also leaving out volatile
), you could tell the compiler which memory location you modify with a =m" (*dest_addr)
operand, either as a dummy operand or with a constraint on the addressing mode so you could use it as reg+reg
. (IDK PPC well enough to know what "=m"
usually expands to.)
In most cases this bug won't bite you, but it's still a bug. Upgrading your compiler version or using link-time optimization could maybe make your program buggy with no source-level changes.
This kind of thing is why https://gcc.gnu.org/wiki/DontUseInlineAsm
See also https://stackoverflow.com/tags/inline-assembly/info.
#define ASMSWAP32(dest_addr,data)
...
This part should be clear
__asm__ volatile (
...: : "r" (data), "r" (dest_addr))
This is the actual inline assembly:
Two values are passed to the assmbly code; no value is returned from the assembly code (this is the colons after the actual assembly code).
Both parameters are passed in registers ("r"
). The expression %0
will be replaced by the register that contains the value of data
while the expression %1
will be replaced by the register that contains the value of dest_addr
(which will be a pointer in this case).
The volatile
here means that the assembly code has to be executed at this point and cannot be moved to somewhere else.
So if you use the following code in the C source:
ASMSWAP(&a, b);
... the following assembler code will be generated:
# write the address of a to register 5 (for example)
...
# write the value of b to register 6
...
stwbrx 6, 0, 5
So the first argument of the stwbrx
instruction is the value of b
and the last argument is the address of a
.
stwbrx x, 0, y
This instruction writes the value in register x
to the address stored in register y
; however it writes the value in "reverse endian" (on a big-endian CPU it writes the value "little endian".
The following code:
uint32 a;
ASMSWAP32(&a, 0x12345678);
... should therefore result in a = 0x78563412
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With