I'm writing code targeting ARM Cortex-A on Android devices (using the GNU assembler and compiler), and I'm trying to interface between assembly and C. In particular, I'm interested in calling functions written in C from assembly. I tried many things, including the .extern directive, declaring C functions with asm and __asm__, and so on, but none of them worked, so I'm looking for a minimal example of doing so. A reference to such an example would be just as welcome.
To call an external function, such as NetRun's "print_int" or a standard C library function like "exit", you need to tell the assembler the function is "extern". "extern" isn't actually an instruction (it doesn't show up in the disassembly); it's just a message to the assembler, often called a directive or pseudoinstruction. With the GNU assembler, .extern is accepted but ignored, because undefined symbols are treated as external anyway.
Going the other way, assembly code can also be written inside a C program. Some compilers require the assembly to be placed inside an asm{} block; GCC uses asm("...") or __asm__("...") statements instead.
To summarize: when calling a C function, the caller must save r0-r3 and r12 (and possibly r9, depending on the platform) if it needs their values after the call, because the callee is free to overwrite them. In my experience, gcc uses r12 as a scratch register inside a function, so it is not callee-saved even when ARM/Thumb interworking is not used.
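To make the register rules concrete from the C side, here is a minimal sketch of a callee. The name cfun_add is hypothetical (it does not come from the answers here). Under the AAPCS, the first four integer arguments arrive in r0-r3 and an int result is returned in r0, so an assembly caller would load r0-r3, execute bl cfun_add, and then read the result from r0.

```c
/* Hypothetical C callee, intended to be called from ARM assembly.
 * AAPCS: a -> r0, b -> r1, c -> r2, d -> r3; the int result comes back in r0.
 * The compiled body is free to overwrite r0-r3 and r12 while it runs. */
int cfun_add(int a, int b, int c, int d)
{
    return a + b + c + d;
}
```

Anything the assembly caller still needs from r0-r3 or r12 after the call has to be saved before the bl and restored afterwards.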
You need to read the ARM ARM (the ARM Architecture Reference Manual) and/or know the instruction set, that is all. Normally you would want to do something like this:
asm:
bl cfun
c:
void cfun ( void )
{
}
You can try this yourself. With GNU as and gcc this works just fine; it should also work if you use clang to compile the C code to an object file and GNU as for the assembly. I'm not sure which you are using.
The problem with the above is that bl has a limited reach (about ±32 MB in ARM state):
if ConditionPassed(cond) then
if L == 1 then
LR = address of the instruction after the branch instruction
PC = PC + (SignExtend_30(signed_immed_24) << 2)
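The pseudocode above can be modelled in C to see where the limit comes from. This is only a sketch: the function names are made up for illustration, and it assumes ARM (not Thumb) state, where the PC used in the calculation is the address of the bl plus 8.

```c
#include <stdint.h>

/* Sign-extend the 24-bit immediate of an ARM B/BL encoding and scale it
 * to a byte offset (the "SignExtend_30(signed_immed_24) << 2" step). */
int32_t bl_offset_bytes(uint32_t insn)
{
    uint32_t imm24 = insn & 0x00FFFFFFu;
    int32_t words = (int32_t)(imm24 ^ 0x800000u) - 0x800000;
    return words * 4;
}

/* Branch target: the PC read by the instruction is its own address + 8. */
uint32_t bl_target(uint32_t insn_addr, uint32_t insn)
{
    return insn_addr + 8u + (uint32_t)bl_offset_bytes(insn);
}

/* Link register value: the address of the instruction after the bl. */
uint32_t bl_link(uint32_t insn_addr)
{
    return insn_addr + 4u;
}
```

The largest forward offset is 0x7FFFFF words (33554428 bytes) and the largest backward offset is 0x800000 words (33554432 bytes), which is why a bl cannot reach a function more than roughly 32 MB away.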
Knowing that the bl instruction sets the link register to the address of the instruction after the bl, consider what the manual says about reading the program counter register:
For an ARM instruction, the value read is the address of the instruction plus 8 bytes. Bits [1:0] of this value are always zero, because ARM instructions are always word-aligned.
So if you make your asm look like this:
mov lr,pc
ldr pc,=cfun
you get
d6008034: e1a0e00f mov lr, pc
d6008038: e51ff000 ldr pc, [pc, #-0] ; d6008040
...
d6008040: d60084c4 strle r8, [r0], -r4, asr #9
The assembler reserves a memory location within reach of the ldr pc instruction (if possible; otherwise it generates an error) and places the full 32-bit address of the function there. The linker later fills in this location with the external address. That way you can reach any address in the address space. (The strle in the listing is just the disassembler decoding the address word 0xd60084c4 as if it were an instruction.)
If you don't want to play assembler games like that and want to be in control, then you create the location that holds the address of the function and load it into the pc yourself:
mov lr,pc
ldr pc,cfun_addr
...
cfun_addr:
.word cfun
compiled:
d6008034: e1a0e00f mov lr, pc
d6008038: e51ff000 ldr pc, [pc, #-0] ; d6008040 <cfun_addr>
...
d6008040 <cfun_addr>:
d6008040: d60084c4 strle r8, [r0], -r4, asr #9
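The address arithmetic behind both listings can be checked with a small C model. This is only a sketch: the helper names are made up for illustration, and it assumes ARM (not Thumb) state, where reading pc yields the instruction's address plus 8.

```c
#include <stdint.h>

/* Value seen when an ARM-state instruction at insn_addr reads pc. */
uint32_t arm_pc_read(uint32_t insn_addr)
{
    return insn_addr + 8u;
}

/* lr produced by "mov lr, pc" located at mov_addr. */
uint32_t lr_after_mov_lr_pc(uint32_t mov_addr)
{
    return arm_pc_read(mov_addr);
}

/* Address accessed by "ldr pc, [pc, #offset]" located at ldr_addr. */
uint32_t literal_address(uint32_t ldr_addr, int32_t offset)
{
    return arm_pc_read(ldr_addr) + (uint32_t)offset;
}
```

With the addresses from the disassembly, lr_after_mov_lr_pc(0xd6008034) gives 0xd600803c, the instruction right after the ldr pc at 0xd6008038, and literal_address(0xd6008038, 0) gives 0xd6008040, the word that holds the address of cfun.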
Lastly, if you want to move into the modern ARM world, where ARM and Thumb code is or can be mixed (for example, using bx lr instead of mov pc,lr to return), then you will want to use bx:
add lr,pc,#4
ldr r1,cfun_addr
bx r1
...
cfun_addr:
.word cfun
Of course, you need another register to do that, and remember to push and pop your link register and that other register before and after your call to C if you want to preserve them.
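A small C model, again a sketch with made-up helper names, shows why add lr,pc,#4 yields the right return address for the three-instruction bx sequence, and how bx interprets bit 0 of the target address. It assumes the calling code runs in ARM state.

```c
#include <stdbool.h>
#include <stdint.h>

/* lr produced by "add lr, pc, #4" at add_addr (ARM state: pc reads as +8).
 * Layout: add at +0, ldr at +4, bx at +8, so +12 is the instruction after bx. */
uint32_t lr_after_add_lr_pc_4(uint32_t add_addr)
{
    return add_addr + 8u + 4u;
}

/* bx switches to Thumb state when bit 0 of the target register is set. */
bool bx_enters_thumb(uint32_t target)
{
    return (target & 1u) != 0u;
}

/* Address actually branched to: bit 0 is a state flag, not part of the address. */
uint32_t bx_branch_address(uint32_t target)
{
    return target & ~1u;
}
```

For a hypothetical add at 0x8000, lr becomes 0x800c, the instruction following the bx. A Thumb function located at 0x9000 is presented to bx as 0x9001; the linker normally sets that low bit in the word stored by .word cfun when cfun is a Thumb function.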
Minimal runnable armv7 example
This question comes down to "what is the ARM calling convention (AAPCS)?". An example, a.S:
/* Make the glibc symbols visible. */
.extern exit, puts
.data
msg: .asciz "hello world"
.text
.global main
main:
/* r0 is the first argument. */
ldr r0, =msg
bl puts
mov r0, #0
bl exit
Then on Ubuntu 16.04:
sudo apt-get install gcc-arm-linux-gnueabihf qemu-user-static
# Using GCC here instead of as + ld without arguments is needed
# because GCC knows where the C standard library is.
arm-linux-gnueabihf-gcc -o a.out a.S
qemu-arm-static -L /usr/arm-linux-gnueabihf a.out
Output:
hello world
The easiest mistake to make in more complex examples is to forget that the stack must be 8-byte aligned when you call a C function. E.g., you want:
push {ip, lr}
instead of:
push {lr}
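The reason is simple arithmetic: each pushed register moves sp down by 4 bytes, so pushing an odd number of registers from an 8-byte-aligned sp leaves it misaligned. A sketch in C, with hypothetical helper names and a made-up starting sp:

```c
#include <stdbool.h>
#include <stdint.h>

/* sp after "push {...}" of nregs 32-bit registers (full-descending stack). */
uint32_t sp_after_push(uint32_t sp, unsigned nregs)
{
    return sp - 4u * nregs;
}

/* The AAPCS requires sp to be a multiple of 8 at a call to a public function. */
bool sp_is_8_byte_aligned(uint32_t sp)
{
    return (sp % 8u) == 0u;
}
```

Starting from an aligned sp such as 0x7fff0000, push {lr} gives 0x7ffefffc (misaligned), while push {ip, lr} gives 0x7ffefff8 (aligned). Here ip is pushed only as padding; its saved value is not needed.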
Example on GitHub with the boilerplate generalized: https://github.com/cirosantilli/arm-assembly-cheat/blob/82e915e1dfaebb80683a4fd7bba57b0aa99fda7f/c_from_arm.S