Stack-based virtual machine function call/return implementation issues

Question 1

What you're describing is called the calling convention. In other words, it defines who builds the stack frame and how (the caller or the callee), and what the stack should look like.

There are many ways to do it and none is inherently better than the others; you just have to keep it consistent.

As it would take too long to describe the different calling conventions here, you should check the Wikipedia article on the subject, which is quite complete.

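Briefly, though: in the x86 C calling convention (cdecl), the caller pushes the arguments onto the stack and cleans them up after the call. The caller also saves any of the scratch registers (EAX, ECX, EDX) it still needs, so the callee is free to use them, whether to return a value (in EAX) or simply to do its work.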

For the specific questions at the end of your post: the best approach is to use the same stack layout as C does, storing the return address (the saved EIP) and the caller's EBP on the stack and leaving the registers free to use. Stack space is far less scarce than registers.
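A C-like frame can be sketched for a stack VM roughly as follows. This is only a minimal illustration, not a definitive design: the `Vm` struct and the `enter_call` and `leave_call` helpers are hypothetical names, and the choice to return the result on top of the stack (rather than in a register, as cdecl does with EAX) is an assumption made to suit a stack machine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-stack VM state; all names are illustrative only.
struct Vm {
    std::vector<std::int64_t> stack;  // arguments, saved state, and locals share one stack
    std::size_t ip = 0;               // instruction pointer (plays the role of EIP)
    std::size_t fp = 0;               // frame pointer (plays the role of EBP)
};

// Call: the caller has already pushed the arguments. Save the return address
// and the caller's frame pointer, then start a new frame at the current top.
void enter_call(Vm& vm, std::size_t target, std::size_t return_ip) {
    vm.stack.push_back(static_cast<std::int64_t>(return_ip));  // saved "EIP"
    vm.stack.push_back(static_cast<std::int64_t>(vm.fp));      // saved "EBP"
    vm.fp = vm.stack.size();  // callee locals live at fp, fp + 1, ...
    vm.ip = target;
}

// Return: drop the callee's locals, restore the caller's frame pointer,
// jump back, and leave the result on top of the stack.
void leave_call(Vm& vm, std::int64_t result) {
    assert(vm.fp >= 2 && vm.stack.size() >= vm.fp);
    vm.stack.resize(vm.fp);  // discard callee locals
    std::size_t saved_fp = static_cast<std::size_t>(vm.stack.back());
    vm.stack.pop_back();
    vm.ip = static_cast<std::size_t>(vm.stack.back());  // return address
    vm.stack.pop_back();
    vm.fp = saved_fp;
    vm.stack.push_back(result);  // arguments are still below; the caller removes them
}
```

With one argument on the stack, `enter_call` followed by `leave_call(vm, 42)` leaves the argument and then 42 on the stack, with `ip` and `fp` restored to the caller's values.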

Question 2

I solved this problem in a stack machine I've been working on, in the following way.

There is a _stack[] (the main stack) and a _cstack[] (the call stack, which holds information about each call, such as the return address and the size of the return value).

When a VCALL (void function call, with no parameters) instruction is encountered, the following is done:

        u64& _next = _peeknext; // next bytecode word, which holds the function address
        AssertAbort(_next < _PROGRAM_SIZE, "Can't call function. Invalid address"); // u64 is unsigned, so comparing it against -1 would never pass
        cstack_push(ip + 2); // address to return to (current address + 2, to skip the call opcode and its address operand)
        cstack_push(fp); // current frame pointer
        cstack_push(_STACK_SIZE); // current stack size
        cstack_push(0); // size of the return value (4 for an int, 8 for a long, etc.); 0 here because the call is void
        ip = _next - 1; // address to jump to (-1 to offset the interpreter loop's increment of ip)

Then, when a RET (return) instruction is encountered, the following is done:

        AssertAbort(cstackhas(4), "Can't return. No address to return to."); // four entries are popped below (assuming cstackhas(n) checks for at least n entries)
        u64 return_size = cstack_pop(); // pop the size of the return value from the call stack
        _STACK_SIZE = cstack_pop(); // restore the stack size from before the function call, not counting the return value
        fp = cstack_pop(); // restore the frame pointer to its value before the function call
        ip = cstack_pop() - 1; // set the program counter to the return address stored on the call stack by the last call

        _stack.resize(_STACK_SIZE + return_size); // shrink the main stack (not the call stack) to its pre-call size plus the return value, discarding the rest
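The two-stack scheme described above can be put together into a small self-contained interpreter. This is only a sketch under several assumptions: the `Machine` struct, the `Op` opcodes, and the `cpop` helper are hypothetical names, the stacks hold 64-bit slots instead of bytes, the callee's frame is assumed to start at the current top of the main stack, and the loop sets `ip` explicitly, so no `-1` adjustment is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using u64 = std::uint64_t;

// Hypothetical opcodes for this sketch; not taken from the original machine.
enum Op : u64 { PUSH, VCALL, RET, HALT };

struct Machine {
    std::vector<u64> program;
    std::vector<u64> stack;   // main stack (operands and locals)
    std::vector<u64> cstack;  // call stack (one 4-entry record per call)
    u64 ip = 0;               // program counter
    u64 fp = 0;               // frame pointer

    u64 cpop() {
        assert(!cstack.empty());
        u64 v = cstack.back();
        cstack.pop_back();
        return v;
    }

    void run() {
        while (ip < program.size()) {
            switch (program[ip]) {
            case PUSH:
                assert(ip + 1 < program.size());
                stack.push_back(program[ip + 1]);
                ip += 2;
                break;
            case VCALL: {
                assert(ip + 1 < program.size());
                u64 target = program[ip + 1];
                assert(target < program.size() && "invalid call address");
                cstack.push_back(ip + 2);        // return address (skips opcode + operand)
                cstack.push_back(fp);            // caller's frame pointer
                cstack.push_back(stack.size());  // main stack size before the call
                cstack.push_back(0);             // return size: 0 slots for a void call
                fp = stack.size();               // assumption: new frame starts at the top
                ip = target;
                break;
            }
            case RET: {
                assert(cstack.size() >= 4 && "no call record to return to");
                u64 return_size = cpop();
                u64 saved_size = cpop();
                fp = cpop();
                ip = cpop();
                stack.resize(saved_size + return_size);  // discard the callee's data
                break;
            }
            case HALT:
            default:
                return;
            }
        }
    }
};
```

For example, running the program `{PUSH, 1, VCALL, 5, HALT, PUSH, 99, PUSH, 98, RET}` leaves only the value 1 on the main stack: the two values pushed by the void function are discarded on return, and the call stack is empty again.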

This is probably useless to you now, as this question is quite old, but you can ask any questions if you wish :)

Question 3

The best solution depends on the machine.

If pushing to and popping from the stack is as fast as using registers (an on-chip stack, or a stack backed by on-chip L1 cache) and you are also very limited in the number of registers, it makes sense to use the stack.

If you have plenty of registers, you can use some of them to hold counters, pointers, or variables.
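To make the trade-off concrete, here is the same expression, (a + b) * c, evaluated in both styles. This is a sketch only: the function names and the four-register file are hypothetical, and real costs depend on the machine, as noted above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Stack style: operands are implicit, so every operation pops and pushes.
std::int64_t eval_on_stack(std::int64_t a, std::int64_t b, std::int64_t c) {
    std::vector<std::int64_t> stack;
    auto pop = [&stack]() {
        assert(!stack.empty());
        std::int64_t v = stack.back();
        stack.pop_back();
        return v;
    };
    stack.push_back(a);  // PUSH a
    stack.push_back(b);  // PUSH b
    std::int64_t rhs = pop();
    std::int64_t lhs = pop();
    stack.push_back(lhs + rhs);  // ADD
    stack.push_back(c);          // PUSH c
    rhs = pop();
    lhs = pop();
    stack.push_back(lhs * rhs);  // MUL
    return pop();
}

// Register style: operands are named, and intermediates stay in registers.
std::int64_t eval_in_registers(std::int64_t a, std::int64_t b, std::int64_t c) {
    std::array<std::int64_t, 4> r{};  // hypothetical register file r0..r3
    r[0] = a;
    r[1] = b;
    r[2] = c;
    r[3] = r[0] + r[1];  // ADD r3, r0, r1
    r[3] = r[3] * r[2];  // MUL r3, r3, r2
    return r[3];
}
```

The stack version needs no register allocation but touches the stack on every step; the register version performs fewer memory operations but needs enough registers to hold all live values.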

In general, to let modules communicate with each other, or to translate (compile) other languages into your assembly, you should specify an Application Binary Interface (ABI).

You should compare different ABIs for different hardware (or virtual machines) to find the techniques that suit your machine. Once you have defined your ABI, programs must comply with it to be binary compatible.