Initial state of program registers and stack on Linux ARM
Question
I'm currently playing with ARM assembly on Linux as a learning exercise. I'm using 'bare' assembly, i.e. no libcrt or libgcc. Can anybody point me to information about what state the stack pointer and other registers will be in at the start of the program, before the first instruction is executed? Obviously pc/r15 points at _start, and the rest appear to be initialised to 0, with two exceptions: sp/r13 points to an address far outside my program, and r1 points to a slightly higher address.
So to some solid questions:
- What is the value in r1?
- Is the value in sp a legitimate stack allocated by the kernel?
- If not, what is the preferred method of allocating a stack: using brk, or reserving a static .bss section?
Any pointers would be appreciated.
Solution
Here's what I use to get a Linux/ARM program started with my compiler:
/** The initial entry point.
*/
asm(
" .text\n"
" .globl _start\n"
" .align 2\n"
"_start:\n"
" sub lr, lr, lr\n" // Clear the link register.
" ldr r0, [sp]\n" // Get argc...
" add r1, sp, #4\n" // ... and argv ...
" add r2, r1, r0, LSL #2\n" // ... find argv's NULL terminator ...
" add r2, r2, #4\n" // ... and step past it to environ.
" bl _estart\n" // Let's go!
" b .\n" // Never gets here.
" .size _start, .-_start\n"
);
As you can see, I just get the argc, argv, and environ stuff from the stack at [sp].
A little clarification: The stack pointer points to a valid area in the process's memory. On ARM, r0, r1, and r2 carry the first three arguments of the function being called. I populate them with argc, argv, and environ, respectively.
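The pointer arithmetic in that stub can be easier to follow in C. The following is only a sketch, not part of the original answer: the names start_args and locate_args are hypothetical, and it assumes the stack is an array of pointer-sized words with argc in the first one.

```c
#include <stdint.h>

/* Hypothetical container for the three values the stub hands to _estart. */
struct start_args {
    int    argc;
    char **argv;
    char **envp;
};

/* sp is assumed to point at the word holding argc, as on entry to _start.
 * The stack is treated as an array of pointer-sized words. */
struct start_args locate_args(char **sp)
{
    struct start_args a;
    a.argc = (int)(uintptr_t)sp[0];   /* ldr r0, [sp]                     */
    a.argv = sp + 1;                  /* add r1, sp, #4 (one word on ARM) */
    a.envp = a.argv + a.argc + 1;     /* skip argc pointers plus the NULL */
    return a;
}
```

Note the extra word skipped when computing envp: argv + argc is the address of the NULL that terminates argv, and the environment pointers start one word after it.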
OTHER TIPS
Since this is Linux, you can look at how it is implemented by the kernel.
The registers seem to be set by the call to start_thread at the end of load_elf_binary (if you are using a modern Linux system, it will almost always be using the ELF format). For ARM, the registers seem to be set as follows:
r0 = first word in the stack
r1 = second word in the stack
r2 = third word in the stack
sp = address of the stack
pc = binary entry point
cpsr = endianness, thumb mode, and address limit set as needed
Clearly you have a valid stack. I think the values of r0-r2 are junk, and you should instead read everything from the stack (you will see why I think this later). Now, let's look at what is on the stack. What you will read from the stack is filled by create_elf_tables.
One interesting thing to notice here is that this function is architecture-independent, so the same things (mostly) will be put on the stack on every ELF-based Linux architecture. The following is on the stack, in the order you would read it:
- The number of parameters (this is argc in main()).
- One pointer to a C string for each parameter, followed by a zero (this is the contents of argv in main(); argv would point to the first of these pointers).
- One pointer to a C string for each environment variable, followed by a zero (this is the contents of the rarely-seen envp third parameter of main(); envp would point to the first of these pointers).
- The "auxiliary vector", which is a sequence of pairs (a type followed by a value), terminated by a pair with a zero (AT_NULL) in the first element. This auxiliary vector has some interesting and useful information, which you can see (if you are using glibc) by running any dynamically-linked program with the LD_SHOW_AUXV environment variable set to 1 (for instance LD_SHOW_AUXV=1 /bin/true). This is also where things can vary a bit depending on the architecture.
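To make the last item concrete, here is a rough C sketch of how a bare program might reach the auxiliary vector by walking past the environment pointers. It is an illustration under assumptions, not kernel or libc code: stack_word, auxv_start, and aux_lookup are hypothetical names, pointers and integers are assumed to be the same size, and AUX_NULL and AUX_PAGESZ are local stand-ins for AT_NULL (0) and AT_PAGESZ (6) from the Linux headers.

```c
#include <stddef.h>
#include <stdint.h>

/* One word of the initial stack, viewed as a pointer or as an integer.
 * Assumes sizeof(char *) == sizeof(uintptr_t). */
typedef union {
    char     *ptr;
    uintptr_t num;
} stack_word;

/* Local stand-ins for AT_NULL and AT_PAGESZ from the Linux headers. */
enum { AUX_NULL = 0, AUX_PAGESZ = 6 };

/* Given the first environment pointer, skip to the word after the
 * terminating zero; that is where the auxiliary vector begins. */
const stack_word *auxv_start(const stack_word *envp)
{
    while (envp->ptr != NULL)
        envp++;
    return envp + 1;
}

/* Scan (type, value) pairs until the AUX_NULL terminator.
 * Returns 1 and stores the value if the type is found, otherwise 0. */
int aux_lookup(const stack_word *aux, uintptr_t type, uintptr_t *value_out)
{
    for (; aux[0].num != AUX_NULL; aux += 2) {
        if (aux[0].num == type) {
            *value_out = aux[1].num;
            return 1;
        }
    }
    return 0;
}
```

In a real program the envp argument would be the address of the first environment pointer on the initial stack, cast to const stack_word *.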
Since this structure is the same for every architecture, you can look for instance at the drawing on page 54 of the SYSV 386 ABI to get a better idea of how things fit together (note, however, that the auxiliary vector type constants in that document are different from what Linux uses, so you should look at the Linux headers for them).
Now you can see why the contents of r0-r2 are garbage. The first word in the stack is argc, the second is a pointer to the program name (argv[0]), and the third probably was zero for you because you called the program with no arguments (it would be argv[1]). I guess they are set up this way for the older a.out binary format, which as you can see at create_aout_tables puts argc, argv, and envp in the stack (so they would end up in r0-r2 in the order expected for a call to main()).
Finally, why was r0 zero for you instead of one (argc should be one if you called the program with no arguments)? I am guessing something deep in the syscall machinery overwrote it with the return value of the system call (which would be zero since the exec succeeded). You can see in kernel_execve (which does not use the syscall machinery, since it is what the kernel calls when it wants to exec from kernel mode) that it deliberately overwrites r0 with the return value of do_execve.
Here's the uClibc crt. It seems to suggest that all registers are undefined except r0 (which contains a function pointer to be registered with atexit()) and sp, which contains a valid stack address.
So, the value you see in r1 is probably not something you can rely on.
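For what it's worth, here is roughly what "registered with atexit()" could look like on the C side once a libc is available. This is a guess at the shape, not the uClibc code: register_fini and example_fini are hypothetical names, and treating a null pointer as "nothing to register" is my assumption.

```c
#include <stdlib.h>

/* Stand-in for a cleanup routine that might arrive in r0. */
void example_fini(void)
{
}

/* Hypothetical helper: fini is whatever value was in r0 at _start.
 * Assumes a null pointer means there is nothing to register. */
int register_fini(void (*fini)(void))
{
    if (fini == NULL)
        return 0;
    return atexit(fini);   /* 0 on success, nonzero on failure */
}
```

None of this applies to a program that really has no libc, since atexit() itself comes from the C library.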
Some data are placed on the stack for you.
I've never used ARM Linux, but I suggest you either look at the source for the libcrt and see what they do, or use gdb to step into an existing executable. You shouldn't need the source code; just step through the assembly code.
Everything you need to find out should happen within the very first code executed by any binary executable.
Hope this helps.
Tony