Question

I decided it would be fun to learn x86 assembly during the summer break. So I started with a very simple hello world program, borrowing from the free examples that gcc -S could give me. I ended up with this:

HELLO:
    .ascii "Hello, world!\12\0"
    .text

.globl _main
_main:
    pushl   %ebp        # 1. saves the caller's base pointer on the stack
    movl    %esp, %ebp  # 2. copies the stack pointer into the base pointer register
    subl    $20, %esp   # 3. ???
    pushl   $HELLO      # 4. push HELLO's address on the stack
    call    _puts       # 5. call puts
    xorl    %eax, %eax  # 6. zero %eax, probably not necessary since we didn't do anything with it
    leave               # 7. clean up
    ret                 # 8. return
                        # PROFIT!

It compiles and even works! And I think I understand most of it.

The magic happens at step 3, though. If I remove this line, my program dies between the call to puts and the xor with a misaligned stack error. And if I change $20 to another value, it crashes too. So I came to the conclusion that this value is very important.

Problem is, I don't know what it does and why it's needed.

Can anyone explain it to me? (I'm on Mac OS, in case it matters.)


Solution

On x86 OS X, the stack must be 16-byte aligned at the point of every function call (see the ABI doc here). So the bytes pushed have to add up to a multiple of 16, and the explanation is

push base pointer (#1)          -4
reserve stack space (#3)       -20
push argument (#4)              -4
call pushes return address (#5) -4
total                          -32

To check, change the $20 in line #3 to $4; the total is then 16, so that works too.

Also, as Ignacio Vazquez-Abrams points out, #6 is not optional. %eax carries the return value of main, and after the call it still holds whatever puts returned, so it has to be zeroed explicitly.

I recently started learning assembly, too (and am still learning). To save you the shock: the 64-bit calling conventions are MUCH different (parameters are passed in registers). I found this very helpful for 64-bit assembly.

OTHER TIPS

The general form of the comment for step 3 should be "Allocates space for local variables". I'm not sure why changing the value arbitrarily would crash it; I can only see it crashing if you reduce it. And the proper comment for step 6 is "Prepare to return a 0 from this function".

Note that if you compile with -fomit-frame-pointer some of that %ebp pointer boilerplate will disappear. The base pointer is helpful for debugging but isn't actually necessary on x86.

Also I highly recommend using Intel syntax, which is supported by all the GCC/binutils stuff. I used to think that the difference between AT&T and Intel syntax was just a matter of taste, but then one day I came across this example where the AT&T mnemonic is just totally different from the Intel one. And since all the official x86 documentation uses Intel syntax, it seems like a better way to go.

Have fun!

Licensed under: CC-BY-SA with attribution