Initial state of program registers and stack on Linux ARM

https://stackoverflow.com/questions/1802783

05-07-2019

Question

I'm currently playing with ARM assembly on Linux as a learning exercise. I'm using 'bare' assembly, i.e. no libcrt or libgcc. Can anybody point me to information about what state the stack pointer and other registers will be in at the start of the program, before the first instruction is executed? Obviously pc/r15 points at _start, and the rest appear to be initialised to 0, with two exceptions: sp/r13 points to an address far outside my program, and r1 points to a slightly higher address.

So to some solid questions:

What is the value in r1?
Is the value in sp a legitimate stack allocated by the kernel?
If not, what is the preferred method of allocating a stack: using brk, or allocating a static .bss section?

Any pointers would be appreciated.

Solution

Here's what I use to get a Linux/ARM program started with my compiler:

/** The initial entry point.
 */
asm(
"       .text\n"
"       .globl  _start\n"
"       .align  2\n"
"_start:\n"
"       sub     lr, lr, lr\n"           // Clear the link register.
"       ldr     r0, [sp]\n"             // Get argc...
"       add     r1, sp, #4\n"           // ... and argv ...
"       add     r2, r1, r0, LSL #2\n"   // ... r2 = &argv[argc], the NULL ending argv ...
"       add     r2, r2, #4\n"           // ... so environ starts one word later.
"       bl      _estart\n"              // Let's go!
"       b       .\n"                    // Never gets here.
"       .size   _start, .-_start\n"
);

As you can see, I just get the argc, argv, and environ stuff from the stack at [sp].

A little clarification: The stack pointer points to a valid area in the process' memory. Under the ARM calling convention, r0 through r3 carry the first four arguments to the function being called. I populate r0, r1, and r2 with argc, argv, and environ, respectively.
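For illustration only, here is a minimal sketch of what the C side of such a hand-off could look like. The names estart_sketch, sketch_main, and sketch_environ are hypothetical and are not taken from the answer's actual _estart; the sketch only assumes that the three register values arrive as ordinary function arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical global, playing the role that libc's `environ` usually plays. */
char **sketch_environ;

/* Hypothetical stand-in for the program's real main(). */
int sketch_main(int argc, char **argv, char **envp)
{
    int nenv = 0;

    (void)argv;
    while (envp[nenv] != NULL)      /* the environment list ends with NULL */
        nenv++;
    /* Arbitrary value, chosen only so the result is easy to observe. */
    return argc * 100 + nenv;
}

/* The values placed in r0, r1 and r2 arrive here as the first three
 * arguments. A real entry function would pass the result to the exit
 * system call instead of returning (assumption: this sketch returns it). */
int estart_sketch(int argc, char **argv, char **envp)
{
    assert(argv != NULL && envp != NULL);
    sketch_environ = envp;
    return sketch_main(argc, argv, envp);
}
```

Returning the value keeps the sketch easy to try on a hosted system; in a bare program there is nothing to return to, which is why the assembly above loops forever after the call.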

OTHER TIPS

Since this is Linux, you can look at how it is implemented by the kernel.

The registers seem to be set by the call to start_thread at the end of load_elf_binary (if you are using a modern Linux system, it will almost always be using the ELF format). For ARM, the registers seem to be set as follows:

r0 = first word in the stack
r1 = second word in the stack
r2 = third word in the stack
sp = address of the stack
pc = binary entry point
cpsr = endianness, thumb mode, and address limit set as needed

Clearly you have a valid stack. I think the values of r0-r2 are junk, and you should instead read everything from the stack (you will see why I think this later). Now, let's look at what is on the stack. What you will read from the stack is filled by create_elf_tables.

One interesting thing to notice here is that this function is architecture-independent, so the same things (mostly) will be put on the stack on every ELF-based Linux architecture. The following is on the stack, in the order you would read it:

The number of parameters (this is argc in main()).
One pointer to a C string for each parameter, followed by a zero (this is the contents of argv in main(); argv would point to the first of these pointers).
One pointer to a C string for each environment variable, followed by a zero (this is the contents of the rarely-seen envp third parameter of main(); envp would point to the first of these pointers).
The "auxiliary vector", which is a sequence of pairs (a type followed by a value), terminated by a pair with a zero (AT_NULL) in the first element. This auxiliary vector has some interesting and useful information, which you can see (if you are using glibc) by running any dynamically-linked program with the LD_SHOW_AUXV environment variable set to 1 (for instance LD_SHOW_AUXV=1 /bin/true). This is also where things can vary a bit depending on the architecture.
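The layout in the list above can be sketched in C as follows. This is a minimal illustration, assuming the stack is viewed as an array of pointer-sized words starting at the initial sp; the struct and function names are hypothetical, not kernel or libc API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the words found at the initial stack pointer. */
struct initial_stack {
    long             argc;
    const uintptr_t *argv;  /* argc string pointers, then a zero word */
    const uintptr_t *envp;  /* environment string pointers, then a zero word */
    const uintptr_t *auxv;  /* (type, value) pairs, ending with type 0 */
};

/* Split the initial stack contents into their four parts. */
struct initial_stack parse_initial_stack(const uintptr_t *sp)
{
    struct initial_stack s;
    const uintptr_t *p;

    assert(sp != NULL);
    s.argc = (long)sp[0];               /* first word: argument count */
    s.argv = &sp[1];                    /* argument pointers follow it */
    s.envp = s.argv + s.argc + 1;       /* skip the zero word ending argv */

    p = s.envp;
    while (*p != 0)                     /* walk to the zero word ending envp */
        p++;
    s.auxv = p + 1;                     /* the auxiliary vector comes next */
    return s;
}
```

Each entry of argv and envp is kept as a raw word here; converting one to a string is a cast such as (const char *)s.argv[0].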

Since this structure is the same for every architecture, you can look, for instance, at the drawing on page 54 of the SYSV 386 ABI to get a better idea of how things fit together (note, however, that the auxiliary vector type constants in that document are different from what Linux uses, so you should look at the Linux headers for them).
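As a rough sketch of how the auxiliary vector can be read once you have a pointer to its first pair, the function below scans for a given type. The SKETCH_ constants repeat the values Linux's <elf.h> gives to AT_NULL, AT_PAGESZ, and AT_ENTRY so that the example is self-contained; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as AT_NULL, AT_PAGESZ and AT_ENTRY in Linux's <elf.h>. */
enum {
    SKETCH_AT_NULL   = 0,
    SKETCH_AT_PAGESZ = 6,
    SKETCH_AT_ENTRY  = 9
};

/* Scan (type, value) pairs until the terminating AT_NULL entry.
 * Returns 1 and stores the value if `type` is found, 0 otherwise. */
int auxv_lookup(const uintptr_t *auxv, uintptr_t type, uintptr_t *value_out)
{
    assert(auxv != NULL && value_out != NULL);
    for (; auxv[0] != (uintptr_t)SKETCH_AT_NULL; auxv += 2) {
        if (auxv[0] == type) {
            *value_out = auxv[1];
            return 1;
        }
    }
    return 0;
}
```

In a program linked against glibc, getauxval() from <sys/auxv.h> does the same job without any manual stack walking.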

Now you can see why the contents of r0-r2 are garbage. The first word in the stack is argc, the second is a pointer to the program name (argv[0]), and the third probably was zero for you because you called the program with no arguments (it would be argv[1]). I guess they are set up this way for the older a.out binary format, which as you can see at create_aout_tables puts argc, argv, and envp in the stack (so they would end up in r0-r2 in the order expected for a call to main()).

Finally, why was r0 zero for you instead of one (argc should be one if you called the program with no arguments)? I am guessing something deep in the syscall machinery overwrote it with the return value of the system call (which would be zero since the exec succeeded). You can see in kernel_execve (which does not use the syscall machinery, since it is what the kernel calls when it wants to exec from kernel mode) that it deliberately overwrites r0 with the return value of do_execve.

Here's the uClibc crt. It seems to suggest that all registers are undefined except r0 (which contains a function pointer to be registered with atexit()) and sp which contains a valid stack address.

So, the value you see in r1 is probably not something you can rely on.

Some data are placed on the stack for you.

I've never used ARM Linux, but I suggest you either look at the source for the libcrt and see what it does, or use gdb to step into an existing executable. You shouldn't need the source code; just step through the assembly code.

Everything you need to find out should happen within the very first code executed by any binary executable.

Hope this helps.

Tony

Licensed under: CC-BY-SA with attribution
