I'm currently playing with ARM assembly on Linux as a learning exercise. I'm using 'bare' assembly, i.e. no libcrt or libgcc. Can anybody point me to information about what state the stack pointer and other registers will be in at the start of the program, before the first instruction is executed? Obviously pc/r15 points at _start, and the rest appear to be initialised to 0, with two exceptions: sp/r13 points to an address far outside my program, and r1 points to a slightly higher address.
So to some solid questions:
Any pointers would be appreciated.
The processor uses full descending stacks, which means that register R13, the Stack Pointer, holds the address of the last stacked item in memory.
ARM processors have 37 registers. The registers are arranged in partially overlapping banks. There is a different register bank for each processor mode. The banked registers give rapid context switching for dealing with processor exceptions and privileged operations.
The stack is an area of SRAM that is used to temporarily store the contents of general purpose registers. A register is saved to the stack using an operation known as a PUSH operation. A register is restored from the stack using a POP operation.
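To make the "full descending" wording concrete, here is a minimal C sketch of the PUSH and POP behaviour described above. The names fd_stack, fd_push and fd_pop are hypothetical and only model the pointer arithmetic; they are not part of any ARM toolchain or header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a full descending stack: sp always points at the
 * last word that was pushed, and the stack grows toward lower addresses. */
typedef struct {
    uint32_t *sp;   /* plays the role of R13 */
} fd_stack;

/* PUSH: decrement the stack pointer first, then store the value there. */
void fd_push(fd_stack *s, uint32_t value) {
    assert(s != NULL);
    s->sp -= 1;
    *s->sp = value;
}

/* POP: load the word the stack pointer refers to, then increment it. */
uint32_t fd_pop(fd_stack *s) {
    assert(s != NULL);
    uint32_t value = *s->sp;
    s->sp += 1;
    return value;
}
```

An empty stack in this model has sp pointing one word past the top of its memory region, so the first push lands in the highest word of that region.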
Since this is Linux, you can look at how it is implemented by the kernel.
The registers seem to be set by the call to start_thread at the end of load_elf_binary (if you are using a modern Linux system, it will almost always be using the ELF format). For ARM, the registers seem to be set as follows:
r0 = first word in the stack
r1 = second word in the stack
r2 = third word in the stack
sp = address of the stack
pc = binary entry point
cpsr = endianness, thumb mode, and address limit set as needed
Clearly you have a valid stack. I think the values of r0-r2 are junk, and you should instead read everything from the stack (you will see why I think this later). Now, let's look at what is on the stack. What you will read from the stack is filled by create_elf_tables.
One interesting thing to notice here is that this function is architecture-independent, so the same things (mostly) will be put on the stack on every ELF-based Linux architecture. The following is on the stack, in the order you would read it:
The number of parameters (this is argc in main()).
One pointer to a C string for each parameter, followed by a zero (this is the contents of argv in main(); argv would point to the first of these pointers).
One pointer to a C string for each environment variable, followed by a zero (this is the contents of the envp third parameter of main(); envp would point to the first of these pointers).
The "auxiliary vector", a sequence of pairs (a type followed by a value), terminated by a pair with a zero (AT_NULL) in the first element. This auxiliary vector has some interesting and useful information, which you can see (if you are using glibc) by running any dynamically-linked program with the LD_SHOW_AUXV environment variable set to 1 (for instance LD_SHOW_AUXV=1 /bin/true). This is also where things can vary a bit depending on the architecture.
Since this structure is the same for every architecture, you can look for instance at the drawing on page 54 of the SYSV 386 ABI to get a better idea of how things fit together (note, however, that the auxiliary vector type constants in that document are different from what Linux uses, so you should look at the Linux headers for them).
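As a rough illustration of that layout, the following C sketch walks a stack image word by word and locates argc, argv, envp and the auxiliary vector. It assumes every entry is one native word wide (uintptr_t) and that sp holds the value the stack pointer had at _start. The names initial_stack, parse_initial_stack, auxv_lookup and the SKETCH_ constants are made up for this example; the two constant values mirror AT_NULL and AT_PAGESZ from the Linux headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values of two auxiliary vector types as defined in the Linux headers. */
#define SKETCH_AT_NULL   0   /* terminates the auxiliary vector */
#define SKETCH_AT_PAGESZ 6   /* system page size */

/* Hypothetical view of the entry stack; every field points into it. */
typedef struct {
    uintptr_t        argc;
    const uintptr_t *argv;  /* argc pointers followed by a zero word */
    const uintptr_t *envp;  /* pointers followed by a zero word */
    const uintptr_t *auxv;  /* (type, value) pairs ending with AT_NULL */
} initial_stack;

/* Split the words starting at sp into the four regions described above. */
initial_stack parse_initial_stack(const uintptr_t *sp) {
    assert(sp != NULL);
    initial_stack st;
    st.argc = sp[0];
    st.argv = sp + 1;
    st.envp = st.argv + st.argc + 1;   /* skip argv and its zero terminator */

    const uintptr_t *p = st.envp;
    while (*p != 0) {                  /* skip the environment pointers */
        p++;
    }
    st.auxv = p + 1;                   /* first pair after envp's terminator */
    return st;
}

/* Look up one auxiliary vector entry; returns 1 if found, 0 otherwise. */
int auxv_lookup(const uintptr_t *auxv, uintptr_t type, uintptr_t *value) {
    for (const uintptr_t *p = auxv; p[0] != SKETCH_AT_NULL; p += 2) {
        if (p[0] == type) {
            *value = p[1];
            return 1;
        }
    }
    return 0;
}
```

In a bare-assembly program the same walk would be a few loads relative to sp at _start; the C version is only meant to show the order of the regions and where each terminator sits.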
Now you can see why the contents of r0-r2 are garbage. The first word in the stack is argc, the second is a pointer to the program name (argv[0]), and the third probably was zero for you because you called the program with no arguments (it would be argv[1]). I guess they are set up this way for the older a.out binary format, which as you can see at create_aout_tables puts argc, argv, and envp in the stack (so they would end up in r0-r2 in the order expected for a call to main()).
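A small C sketch of that reasoning, assuming r0-r2 really are loaded from the first three stack words as described earlier. The entry_regs type and first_three_words function are hypothetical names for this example, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of r0-r2 at entry; not a kernel structure. */
typedef struct {
    uintptr_t r0, r1, r2;
} entry_regs;

/* Copy the first three stack words into the register snapshot. For an ELF
 * binary these words are argc, argv[0] and argv[1] (or the zero word that
 * terminates argv), not the argc/argv/envp triple main() expects. */
entry_regs first_three_words(const uintptr_t *sp) {
    assert(sp != NULL);
    entry_regs regs = { sp[0], sp[1], sp[2] };
    return regs;
}
```

For a program started with no arguments, the stack begins with 1, a pointer to the program name, and 0. That gives r1 an address a little above sp (the strings live higher up on the same stack) and r2 a value of zero, which matches what the question reports for those two registers.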
Finally, why was r0 zero for you instead of one (argc should be one if you called the program with no arguments)? I am guessing something deep in the syscall machinery overwrote it with the return value of the system call (which would be zero since the exec succeeded). You can see in kernel_execve (which does not use the syscall machinery, since it is what the kernel calls when it wants to exec from kernel mode) that it deliberately overwrites r0 with the return value of do_execve.