How do called functions return to their caller, after being called?

Question 1

The compiler obeys a particular "calling convention", defined as part of the ABI you're targeting. That calling convention will include a way for the system to know what address to return to. The calling convention usually takes advantage of the hardware's support for procedure calls. On Intel, for example, the return address is pushed to the stack:

...the processor pushes the value of the EIP register (which contains the offset of the instruction following the CALL instruction) on the stack (for use later as a return-instruction pointer).

Returning from a function is done via the ret instruction:

... the processor pops the return instruction pointer (offset) from the top of the stack into the EIP register and begins program execution at the new instruction pointer.
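To make the push/pop behaviour concrete, here is a minimal sketch of a toy machine in C. It is not real x86: the opcode names, struct Machine, and run_program() are all made up for illustration. The only thing it tries to show is that CALL saves the address of the next instruction on a stack and RET pops it back into the instruction pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy instruction set, for illustration only (not real x86). */
enum Op { OP_CALL, OP_RET, OP_ADD, OP_HALT };

struct Insn { enum Op op; int arg; };

struct Machine {
    size_t ip;          /* instruction pointer, playing the role of EIP */
    size_t stack[16];   /* holds return addresses */
    size_t sp;          /* number of entries currently on the stack */
    int acc;            /* accumulator, so the program does visible work */
};

/* Runs the program until OP_HALT and returns the accumulator. */
int run_program(const struct Insn *prog, size_t len)
{
    struct Machine m = { 0 };
    for (;;) {
        assert(m.ip < len);
        struct Insn in = prog[m.ip];
        switch (in.op) {
        case OP_CALL:
            assert(m.sp < 16);
            m.stack[m.sp++] = m.ip + 1;   /* push address of the next instruction */
            m.ip = (size_t)in.arg;        /* jump to the callee */
            break;
        case OP_RET:
            assert(m.sp > 0);
            m.ip = m.stack[--m.sp];       /* pop the return address into ip */
            break;
        case OP_ADD:
            m.acc += in.arg;
            m.ip++;
            break;
        case OP_HALT:
            return m.acc;
        }
    }
}

/* Main code lives at indices 0..2, the subroutine at 3..4. */
int demo_call_ret(void)
{
    static const struct Insn prog[] = {
        { OP_CALL, 3 },   /* 0: call the subroutine at 3, pushes 1 */
        { OP_ADD, 1 },    /* 1: runs only after the subroutine returns */
        { OP_HALT, 0 },   /* 2 */
        { OP_ADD, 10 },   /* 3: subroutine body */
        { OP_RET, 0 },    /* 4: pops 1, execution resumes at 1 */
    };
    return run_program(prog, sizeof prog / sizeof prog[0]);
}
```

Calling demo_call_ret() yields 11: the subroutine adds 10, then the instruction after the CALL adds 1, which only happens because RET found its way back.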

To contrast, on ARM, the return address is put in the link register:

The BL and BLX instructions copy the address of the next instruction into lr (r14, the link register).

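Returns are commonly done by executing bx lr (or mov pc, lr in older code) to copy the address from the link register back into the program counter register. The flag-setting form movs pc, lr is reserved for returning from exceptions.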
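The link-register approach can be sketched with the same kind of toy machine. Again, this is not real ARM: the opcodes and run_with_link_register() are hypothetical. It shows that a leaf function can return without touching memory at all, while a function that makes a call of its own has to save lr first, because the nested call overwrites it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy instruction set, for illustration only (not real ARM). */
enum LrOp { LR_BL, LR_BX_LR, LR_PUSH_LR, LR_POP_LR, LR_ADD, LR_HALT };

struct LrInsn { enum LrOp op; int arg; };

/* Runs the program until LR_HALT and returns the accumulator. */
int run_with_link_register(const struct LrInsn *prog, size_t len)
{
    size_t pc = 0, lr = 0;
    size_t stack[16];
    size_t sp = 0;
    int acc = 0;

    for (;;) {
        assert(pc < len);
        struct LrInsn in = prog[pc];
        switch (in.op) {
        case LR_BL:
            lr = pc + 1;              /* return address goes in lr, not in memory */
            pc = (size_t)in.arg;
            break;
        case LR_BX_LR:
            pc = lr;                  /* return: copy lr back into pc */
            break;
        case LR_PUSH_LR:
            assert(sp < 16);
            stack[sp++] = lr;
            pc++;
            break;
        case LR_POP_LR:
            assert(sp > 0);
            lr = stack[--sp];
            pc++;
            break;
        case LR_ADD:
            acc += in.arg;
            pc++;
            break;
        case LR_HALT:
            return acc;
        }
    }
}

int demo_link_register(void)
{
    static const struct LrInsn prog[] = {
        { LR_BL, 3 },       /* 0: call outer, lr = 1 */
        { LR_ADD, 1 },      /* 1 */
        { LR_HALT, 0 },     /* 2 */
        { LR_PUSH_LR, 0 },  /* 3: outer saves lr (1); the next BL overwrites it */
        { LR_BL, 7 },       /* 4: call leaf, lr = 5 */
        { LR_POP_LR, 0 },   /* 5: restore lr = 1 */
        { LR_BX_LR, 0 },    /* 6: return to 1 */
        { LR_ADD, 10 },     /* 7: leaf never touches the stack */
        { LR_BX_LR, 0 },    /* 8: return to 5 */
    };
    return run_with_link_register(prog, sizeof prog / sizeof prog[0]);
}
```

demo_link_register() returns 11. If the push and pop at indices 3 and 5 were removed, the return at index 6 would jump to 5 instead of 1 and the toy program would loop forever.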

Question 2

The compiler knows how to call a function and which calling convention is in use. For example, under the cdecl convention commonly used for C on 32-bit x86, the arguments are pushed on the stack and the caller is responsible for cleaning up the stack afterwards, so the called function doesn't have to remove the arguments. In other calling conventions the arguments are also pushed on the stack, but the called function has to clean them up; in that case the generated code for the function adjusts the stack before it returns. Still other calling conventions pass the arguments in registers, in which case the called function has nothing to clean up either.
The CPU has a mechanism to call a subroutine. It stores the address of the instruction following the call (typically on the stack) and then transfers execution to the new address. When the function is done it executes a return instruction, which fetches the saved address and resumes execution there.
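The difference between caller cleanup and callee cleanup can be sketched in plain C with an explicit array standing in for the hardware stack. Everything here (arg_stack, push_arg, drop_args, and the two add functions) is hypothetical and only models the bookkeeping; a real compiler does this with stack-pointer adjustments in the generated code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical explicit argument stack standing in for the hardware stack. */
static int arg_stack[16];
static size_t arg_sp = 0;

static void push_arg(int v)     { assert(arg_sp < 16); arg_stack[arg_sp++] = v; }
static void drop_args(size_t n) { assert(arg_sp >= n); arg_sp -= n; }

/* Caller-cleans style: the callee reads its arguments and leaves them in place. */
static int add_caller_cleans(void)
{
    return arg_stack[arg_sp - 1] + arg_stack[arg_sp - 2];
}

/* Callee-cleans style: the callee removes its own arguments before returning. */
static int add_callee_cleans(void)
{
    int sum = arg_stack[arg_sp - 1] + arg_stack[arg_sp - 2];
    drop_args(2);
    return sum;
}

int demo_caller_cleans(void)
{
    push_arg(2);
    push_arg(3);
    int r = add_caller_cleans();
    drop_args(2);                 /* the caller's job in this convention */
    return r;
}

int demo_callee_cleans(void)
{
    push_arg(2);
    push_arg(3);
    return add_callee_cleans();   /* nothing left for the caller to remove */
}

/* Number of arguments still sitting on the simulated stack. */
size_t demo_stack_depth(void) { return arg_sp; }
```

Both demos return 5 and leave the simulated stack empty. What matters is that caller and callee agree on who does the cleanup; if both or neither did it, the stack would be unbalanced.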

If the return address is destroyed, because the stack is not properly cleaned up or the memory is overwritten, then you get undefined behaviour. Of course, the exact implementation details vary depending on the platform.

Question 3

This is made possible by the stack (especially on Intel-like systems). Let's say that we have a method caller() that includes, say, an int that it keeps locally.

When caller() calls target(), that int must be preserved. It is kept on the stack, along with the address at which execution should resume after the call. target() can then perform its logic, create its own local variables, and call other methods. Its local variables are placed on the stack in the same way, along with the return address of any call it makes.

When target() ends, the stack is unwound: the top of the stack containing target()'s local variables is removed, and execution resumes in caller() at the saved address.

When methods recurse too far, the stack may grow too large, and a "stack overflow" may occur.
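One way to see that each call gets its own slice of the stack is to record the address of a local variable at every recursion depth. This is only a sketch: descend(), frame_addr, and MAX_DEPTH are names invented here, and the actual addresses (and whether they go up or down) depend on the platform and compiler.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DEPTH 4

/* Address of the variable `local` as seen at each recursion depth. */
static uintptr_t frame_addr[MAX_DEPTH];

static int descend(int depth)
{
    volatile int local = depth;              /* volatile keeps it in memory */
    frame_addr[depth] = (uintptr_t)&local;
    if (depth + 1 < MAX_DEPTH)
        return local + descend(depth + 1);   /* local stays alive across the call */
    return local;
}

/* Returns the sum of the locals, or -1 if two calls shared an address. */
int demo_frames_distinct(void)
{
    int sum = descend(0);                    /* 0 + 1 + 2 + 3 */
    for (int i = 0; i < MAX_DEPTH; i++)
        for (int j = i + 1; j < MAX_DEPTH; j++)
            if (frame_addr[i] == frame_addr[j])
                return -1;
    return sum;
}
```

demo_frames_distinct() returns 6, meaning all four copies of local lived at different addresses at the same time. Raise the depth far enough and those copies, plus the saved return addresses, are what eventually exhaust the stack.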

Question 4

It requires cooperation between the callee and the caller.

The caller agrees to give the callee the address it should return to (usually by pushing it on the stack or by passing it in a register), and the callee agrees to return to that address when it has finished executing.
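As a loose analogy in C, the agreement can be modelled by handing the callee an explicit function pointer that says where to go next. This is not how a compiler implements returns; return_to_fn and the functions below are hypothetical names, and the pointer merely stands in for the return address.

```c
#include <assert.h>

/* Hypothetical stand-in for a return address: where the callee must go next. */
typedef int (*return_to_fn)(int result);

/* Callee: does its work, then honors the agreement by going where it was told. */
static int square_then_return(int x, return_to_fn return_to)
{
    return return_to(x * x);
}

/* Two different places a caller might want control to resume. */
static int resume_add_one(int result) { return result + 1; }
static int resume_negate(int result)  { return -result; }

int demo_return_to_a(void) { return square_then_return(4, resume_add_one); }
int demo_return_to_b(void) { return square_then_return(4, resume_negate); }
```

The same callee ends up in two different places (giving 17 and -16) purely because of what the caller passed in, which is the whole point: the callee has no built-in knowledge of who called it.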