Explain esp-ebp in this program

Question 1

Okay, this is a crazy guess but let's run it up the flag pole and see what happens.

Perhaps the compiler is optimising for speed rather than space, and the extra bytes are alignment padding so that loading values into registers never straddles a quadword boundary.

Look at the values: 0x58 (88) plus 8 bytes takes you to 0x60 (96), the next quadword boundary. It's much easier to pop ebp from the least significant end (or is it most? endianness again ;) ) of a quadword line in memory; forward thinking and all that.
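
To make that arithmetic concrete, here's a rough sketch in C. The names align_up and frame_bytes are made up for illustration, and I'm assuming the extra 8 bytes are the return address plus the saved ebp (4 bytes each on 32-bit x86), which I can't confirm without seeing the program. The 16-byte figure is GCC's usual default stack alignment on x86; whether it applies to this build is also a guess.

```c
#include <assert.h>
#include <stddef.h>

/* Round `size` up to the next multiple of `alignment`.
   `alignment` must be a non-zero power of two. */
size_t align_up(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Bytes the frame covers once the return address and the saved
   %ebp (assumed 4 bytes each) are counted along with the locals. */
size_t frame_bytes(size_t locals)
{
    return locals + 4 + 4;
}
```

With these, frame_bytes(0x58) comes out as 0x60, and align_up(0x60, 16) leaves it unchanged, which fits the idea that 0x58 was picked so the whole frame ends on an aligned boundary.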

Edit: exactly! (What he said...)

Question 2

This highly depends on your architecture and compiler flags, so it is impossible to point to a single thing and say "this must be it" here. However, I can give you some pointers you may find helpful.

First, consider the stack boundary. You may have heard of GCC's -mpreferred-stack-boundary=X flag. If not, it basically tells the compiler to try to keep the stack pointer aligned to a 2^X-byte boundary, so a function's frame may get padded out to keep that alignment. On the other hand, a GCC attribute such as __packed__ makes the compiler lay out the members of a struct as tightly as possible, wherever that struct lives, including on the stack.
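
As a rough illustration of the packed attribute, here's a small C sketch. It assumes GCC or Clang (the attribute is a compiler extension), and the struct and function names are invented for the example. Note that packing changes the layout of a struct's members; a packed struct declared as a local variable then simply takes fewer bytes of the frame.

```c
#include <assert.h>
#include <stddef.h>

/* Default layout: the compiler may insert padding after `tag`
   so that `value` is naturally aligned. */
struct plain_pair {
    char tag;
    int  value;
};

/* Packed layout (GCC/Clang extension): no padding between members. */
struct __attribute__((__packed__)) packed_pair {
    char tag;
    int  value;
};

/* Number of padding bytes the packed layout saves on this target. */
size_t padding_saved(void)
{
    /* Packing can only remove padding, never add it. */
    assert(sizeof(struct plain_pair) >= sizeof(struct packed_pair));
    return sizeof(struct plain_pair) - sizeof(struct packed_pair);
}
```

On a typical x86 build with a 4-byte int, I'd expect the plain struct to take 8 bytes and the packed one 5, but check sizeof on your own target rather than trusting me.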

There's also the stack protector. Basically, GCC places dummy values on the stack that make sure buffer overflows can't do any harm other than crashing your program (which isn't fun, but better than an attacker taking control of the instruction pointer). You can easily try this out: take any recent version of GCC and let the user overflow a buffer. You'll note that the program exits with a message along the lines of 'stack smashing detected, terminated'. Try compiling your program with -fno-stack-protector, and the allocated local memory on the stack will probably be smaller.
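
To show the idea behind those dummy values, here's a simulation in plain C. This is not how GCC implements it: the real guard value is chosen at run time and sits between the locals and the saved return address, while here I fake the frame with a struct so the overflow stays inside one object and the behaviour is well defined. CANARY, fake_frame and copy_and_check are all hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define CANARY 0xDEADC0DEu  /* arbitrary guard value, made up for this sketch */

/* Stand-in for a stack frame: a buffer followed by a guard word. */
struct fake_frame {
    char     buf[8];
    unsigned canary;
};

/* Copy `n` bytes into the frame starting at buf, the way an unchecked
   copy would, then report whether the guard word survived.
   Returns 1 if the canary is intact, 0 if it was overwritten. */
int copy_and_check(const char *src, size_t n)
{
    struct fake_frame frame;
    memset(&frame, 0, sizeof frame);
    frame.canary = CANARY;

    assert(n <= sizeof frame);   /* stay inside the struct itself */
    memcpy(&frame, src, n);      /* bytes past buf[7] land on the canary */

    return frame.canary == CANARY;
}
```

Copying 8 bytes leaves the guard alone; copying 12 bytes of 'A' tramples it, and that's the moment the real stack protector would abort the program instead of returning.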

Finally, there are some minor details about how the cdecl calling convention works that you're getting wrong. Arguments get pushed on the stack before calling a function, which means they are higher in memory than the function's own frame (remember that the stack grows down in memory). Here's an extremely simplified example of a function that takes 3 arguments and allocates 2 local integer variables:

# First we push the three arguments on the stack, in the reverse of the
# order they appear in C. The values don't matter here.
pushl $0xc
pushl $0xb
pushl $0xa

# A CALL instruction comes in here to get in the function. The return 
# address is placed on the stack.

# Assume we are in the function now. This function first saves the base 
# pointer, then sets the base pointer to the address in the stack pointer.
pushl %ebp
movl %esp, %ebp

# Now we can allocate our local variables. We need 8 bytes of space for 
# those 2 integer variables (note that this is an extremely simplified 
# example that doesn't consider what I just told you above).
subl $0x8, %esp
# Let's just put 1 and 2 in those variables.
movl $0x1, -4(%ebp)
movl $0x2, -8(%ebp)

# We're done. Put a return value in EAX, then restore the stack pointer
# and the base pointer.
movl $0x0, %eax
movl %ebp, %esp
popl %ebp
ret
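
For reference, here's roughly what the C side of that assembly could look like. The function and variable names are made up, and a real compiler won't necessarily emit exactly the instructions above (that depends on flags and optimisation level); the locals are volatile only so the stores aren't optimised away.

```c
#include <assert.h>

/* Hypothetical function matching the assembly above: three int
   arguments, two int locals, returns 0. */
int example(int a, int b, int c)
{
    volatile int first  = 1;    /* movl $0x1, -4(%ebp) */
    volatile int second = 2;    /* movl $0x2, -8(%ebp) */

    (void)a; (void)b; (void)c;  /* arguments are unused, as in the assembly */
    (void)first; (void)second;
    return 0;                   /* movl $0x0, %eax */
}

/* The caller: under cdecl the arguments are pushed right to left,
   so 0xc goes on the stack first and 0xa last. */
int call_example(void)
{
    int result = example(0xa, 0xb, 0xc);
    assert(result == 0);        /* sanity check on the return value */
    return result;
}
```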

So basically, our stack looks somewhat like this:

16(%ebp)     -> Argument 3
12(%ebp)     -> Argument 2
8(%ebp)      -> Argument 1
4(%ebp)      -> Return address
%ebp         -> Old %ebp pushed on the stack by function
-4(%ebp)     -> Local variable 1
-8(%ebp)     -> Local variable 2

In other words, only the local variables are in lower memory than the base pointer. Honestly, there are probably a few more things that can influence the size of local variables on the stack that I've forgotten to include, but I hope this helps you out a bit. Keep hacking around with your program, and you'll figure it out. :)
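
If it helps, the layout above boils down to two little formulas, sketched here in C. It assumes 4-byte stack slots as in the simplified example, and arg_offset and local_offset are names I invented; a real frame with padding or a stack protector won't follow the local-variable formula this neatly.

```c
#include <assert.h>

#define WORD 4  /* bytes per stack slot on 32-bit x86 */

/* Offset from %ebp of the n-th argument (n counts from 1).
   0(%ebp) holds the saved %ebp and 4(%ebp) the return address,
   so the first argument starts at 8(%ebp). */
int arg_offset(int n)
{
    assert(n >= 1);
    return WORD * (n + 1);
}

/* Offset from %ebp of the n-th 4-byte local (n counts from 1). */
int local_offset(int n)
{
    assert(n >= 1);
    return -WORD * n;
}
```

So arg_offset(3) gives 16 and local_offset(2) gives -8, matching the 16(%ebp) and -8(%ebp) entries in the table.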

Question 3

There are many reasons for the compiler to reserve more space. Apart from what others have said, if you're on MSVC, it may be the Edit and Continue feature. Without the compiler name and compile options I can't tell you more.