How does machine code access parameters to a subroutine call?

https://stackoverflow.com/questions/8393341

28-10-2019

Question

When running a program you can pass parameters, e.g.

$ myProgram par1 par2 par3

In C you can access these parameters by looking at argv,

int main (int argc, char *argv[]) 
{
     char* aParameter = argv[1];  // Not sure if this is 100% right but you get the idea...
}

How would this translate in assembly / x86 machine code? How would you access the variables given to you? How would the system give you these variables?

I'm very new to assembly, and it seems you can only access registers and absolute addresses. I am puzzled about how you could access parameters. Does the system preload the parameters into a special register for you?

Solution

Function calls

On 32-bit x86, parameters are usually passed on the stack, which is the region of memory that esp points to. The operating system is responsible for reserving some memory for the stack and then setting up esp properly before passing control to your program.

A normal function call could look something like this:

main:
  push 456
  push 123
  call MyFunction
  add esp, 8
  ret

MyFunction:
   ; [esp+0] will hold the return address
   ; [esp+4] will hold the first parameter (123)
   ; [esp+8] will hold the second parameter (456)
   ;
   ; To return from here, we usually execute a 'ret' instruction,
   ; which is actually equivalent to:
   ;
   ; add esp, 4
   ; jmp [esp-4]

   ret

The calling function and the function being called split certain responsibilities between them, such as how parameters are passed, who cleans up the stack, and which registers each side promises to preserve. These rules are referred to as calling conventions.

The example above uses the cdecl calling convention, which means that parameters are pushed onto the stack in reverse order, and the calling function is responsible for restoring esp back to where it pointed before those parameters were pushed to the stack. That's what add esp, 8 does.
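For comparison, here is roughly the C source that a 32-bit cdecl compiler would turn into the push/call/add sequence above. This is only a sketch: the body of MyFunction and the wrapper call_example are made up for illustration (the original MyFunction does nothing), and a 64-bit compiler would normally pass these two arguments in registers rather than on the stack.

```c
/* Hypothetical body: the original MyFunction is empty, so this one
   combines its arguments to make their order visible. */
int MyFunction(int first, int second)
{
    return first * 1000 + second;
}

/* Hypothetical caller, standing in for "main" in the assembly listing. */
int call_example(void)
{
    /* A 32-bit cdecl compiler would typically emit something like:
         push 456          ; second argument is pushed first
         push 123          ; first argument ends up at [esp+4] in the callee
         call MyFunction
         add  esp, 8       ; caller removes the two 4-byte arguments
       The return value comes back in eax. */
    return MyFunction(123, 456);
}
```

Calling call_example() returns 123456, which shows that 123 arrived as the first parameter even though it was pushed last.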

Main function

Typically, you write a main function in assembly and assemble it into an object file. You then pass this object file to a linker to produce an executable.

The linker is responsible for adding startup code (usually supplied by the C runtime library) that sets up the stack properly before control is passed to your main function, so that your function can act as if it were called with two arguments (argc/argv). That is, your main function is not the real entry point; the startup code calls it after it has set up the argc/argv arguments.

Startup code

So how does this "startup code" look? The linker will produce it for us, but it's always interesting to know how stuff works.

This is platform specific, but I'll describe a typical case on Linux. This article, while dated, explains the stack layout on Linux when an i386 program starts. The stack will look like this:

esp+00h: argc
esp+04h: argv[0]
esp+08h: argv[1]
esp+0Ch: argv[2]
...

So the startup code can get the argc/argv values from the stack and then call main(...) with two parameters:

; This is very incomplete startup code, but it illustrates the point

mov eax, [esp]        ; eax = argc
lea edx, [esp+0x04]   ; edx = argv

; push argv, and argc onto the stack (note the reverse order)
push edx
push eax
call main
;
; When main returns, use its return value (eax)
; to set an exit status
;
...

OTHER TIPS

The C runtime is doing some work for you here: it fetches the program arguments from the OS and parses them if necessary before invoking your main function. In assembler, you'll have to fetch the command arguments and parse them yourself. How you get the program arguments is OS specific.
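To give an idea of what "parsing" can mean when the OS hands over the command line as one string, here is a deliberately simplified sketch in C. The function split_args is hypothetical and is not what any real C runtime does: it splits on whitespace only and ignores quoting and escaping.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Split "line" in place on whitespace. Stores at most max_args - 1
   token pointers in argv, followed by a terminating NULL.
   Returns the number of tokens stored (the would-be argc). */
int split_args(char *line, char *argv[], int max_args)
{
    int argc = 0;
    char *p = line;

    assert(max_args >= 1);    /* need room for the NULL terminator */
    while (*p != '\0' && argc < max_args - 1) {
        while (isspace((unsigned char)*p))
            p++;                          /* skip whitespace before a token */
        if (*p == '\0')
            break;
        argv[argc++] = p;                 /* token starts here */
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;                          /* scan to the end of the token */
        if (*p != '\0')
            *p++ = '\0';                  /* terminate the token in place */
    }
    argv[argc] = NULL;
    return argc;
}
```

Given a writable buffer containing "myProgram par1 par2 par3", this fills the array with four pointers into that same buffer, so no extra memory is allocated.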

In the same way your program does; you just have to do it manually.

Arguments to functions are stored in registers or in memory (usually on the stack) before the function is called. When you call a function in assembly you have to set up the stack manually before the call. The calling convention decides where these variables go, how they are ordered, and how they are accessed.

For example, argc and argv would be created and pushed onto the stack. The data they point to would have already been created as well. When the function is called it knows that arguments 1..n will have been placed in some section of memory according to the calling convention.

Here is a quick rundown on calling conventions with some examples as to how the stack would be set up before calling a function.

On a side note, some amount of work has to be done before main is called, and this is hidden from you. This is a good thing; we don't want to write a bunch of bootstrap code every time we begin a new project.

Licensed under: CC-BY-SA with attribution
