How is the C# Stack accessed by the CLR?

https://stackoverflow.com/questions/11741415

23-06-2021

Question

This might be a very simple question, but I could not find an answer here on SO, and nobody I asked knew the answer:

I can write a simple C# method like this:

private void foo()
{
   int a = 1;
   int b = 5;
}

When the CIL code (created by the compiler) is executed by the Common Language Runtime, it will create the following fields on top of the stack while execution is inside the method:

b = 5
a = 1

But now I extend the method to access the field called "a", like this:

private void foo()
{
   int a = 1;
   int b = 5;
   Console.WriteLine(a);
}

Now the CLR has to access a field that is not on top of the stack, but according to the FILO (first in, last out) principle, it has to deal with all the fields above the requested field before it can access it.

What happens to the field called "b" which is on the stack above the requested field "a"?

The CLR can't delete it, as it might be used by the executing method afterwards, so what happens to it?

AFAIK, there are only two ways to store a field: on the stack or on the heap. Moving it to the heap wouldn't make much sense, as that would take away all the benefits of using the stack. Does the CLR create something like a second stack?

How does that work exactly?

-edit-

Maybe I didn't explain my intentions clearly enough.

If I write a method like this:

private void foo()
{
   int a = 1;
   int b = 5;
   Console.WriteLine(a);
   Console.WriteLine(b);
}

The CLR first writes two fields to the stack and accesses them afterwards, but in reverse order.

First, it has to access field "a", but to get to it, the CLR has to deal with field "b", which lies above field "a" on the stack. It can't just remove field "b" from the stack, as it has to access it afterwards.

How does that work?

Solution

Please note that, while you're talking about fields, a and b are called local variables.

Maybe the following simplified logical representation can clear up things. Before the call to Console.WriteLine, the top of the stack would look something like this:

|5| // b
|1| // a

Inside Console.WriteLine, an additional stack frame is added for its parameter (called value, which gets a copy of the variable a):

|1| // value = a
|5| // b
|1| // a

Once Console.WriteLine returns, the top frame is popped and the stack becomes again:

|5| // b
|1| // a
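The same picture can be sketched in C#. The names FrameDemo, Caller and Callee are made up for illustration and stand in for foo and Console.WriteLine; the only point is that the parameter receives a copy, so the caller's locals a and b are unchanged once the callee's frame is gone.

```csharp
using System;

public static class FrameDemo
{
    // Hypothetical callee: 'value' is a copy that lives in Callee's own frame.
    public static int Callee(int value)
    {
        value += 100;   // modifies only the copy, not the caller's variable
        return value;
    }

    // Hypothetical caller: a and b sit side by side in Caller's frame.
    public static (int a, int b, int result) Caller()
    {
        int a = 1;
        int b = 5;
        int result = Callee(a);  // a is read directly; b does not have to be moved
        return (a, b, result);
    }

    public static void Main()
    {
        var (a, b, result) = Caller();
        Console.WriteLine($"a={a}, b={b}, result={result}");  // a=1, b=5, result=101
    }
}
```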

Other answers

Variables aren't stacked individually; the stack contains "frames." Each frame contains all variables (locals, parameters, etc.) required by the current method call. So in your example, a and b exist alongside each other in the same frame, and there's no need to remove either of them. When the method foo completes, the entire stack frame is popped from the stack, leaving the calling method's frame at the top.

The Wikipedia article may provide some enlightenment.

The call stack is not strictly a "pure" stack where you can interact only with the top element. In the call stack you're stacking whole function calls and/or whole variable scopes, not variables.

For example, if a new function, say foo(), is called, it places its two variables, a and b, on top of the stack and has full access to them. It is (normally) not aware of anything below those variables on the stack.

Let's take a look at this code:

void foo() { // << Space is allocated on the stack for a and b.
             // << Anything in this scope has full access to a and b.
             // << But you cannot (normally) access anything from the
             // << calling function.
    var a = 1;
    var b = 2;

    if (a == 1) {  // << Another variable scope is placed on the stack.
                   // << From here you can access a, b and c.
        var c = 3;
    } // << c is removed from the stack.
} // << a, b and anything else in foo() is removed from the stack.
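To see that whole calls are stacked rather than single variables, here is a small sketch using recursion. RecursionDemo and Descend are hypothetical names; each call has its own copy of local, and that copy is still intact after the deeper calls have returned.

```csharp
using System;
using System.Collections.Generic;

public static class RecursionDemo
{
    // Each invocation gets its own 'local'; deeper calls cannot disturb it.
    public static void Descend(int depth, List<string> log)
    {
        int local = depth * 10;
        if (depth < 3)
        {
            Descend(depth + 1, log);  // a new call is stacked on top of this one
        }
        // Back in this call: 'local' still holds this call's value.
        log.Add($"depth={depth}, local={local}");
    }

    public static void Main()
    {
        var log = new List<string>();
        Descend(1, log);
        foreach (string line in log)
        {
            Console.WriteLine(line);
        }
        // depth=3, local=30
        // depth=2, local=20
        // depth=1, local=10
    }
}
```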

You've got the wrong mental image of the stack: it only acts like a stack between method calls. Within a method, the stack frame acts like an array of local variables. There's also nothing special about the stack frame of managed code; it operates exactly like the stack frame used in native C or C++ code.

Local variables have a fixed offset from the EBP register, the stack frame pointer. That offset is determined by the JIT compiler.

The specific outcome for the code you posted is that the optimizer built into the just-in-time compiler will simply eliminate local variables that are not used. The a variable in the last example will very likely end up in a CPU register and never on the stack. A standard optimization.
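The "array of local variables" idea can be modelled in a few lines. This is only a toy model, not what the JIT emits: FrameModel, the offsets and the framePointer value are all invented for illustration, and a real native stack usually grows downward rather than upward.

```csharp
using System;

public static class FrameModel
{
    // Hypothetical offsets, standing in for the ones the JIT compiler would pick.
    private const int OffsetA = 0;
    private const int OffsetB = 1;

    public static int[] Run()
    {
        int[] stack = new int[16];   // toy model of the stack memory
        int framePointer = 4;        // pretend slots 0..3 belong to the callers

        stack[framePointer + OffsetA] = 1;   // int a = 1;
        stack[framePointer + OffsetB] = 5;   // int b = 5;

        // Reading a is plain indexed access; b stays where it is.
        int readA = stack[framePointer + OffsetA];
        int readB = stack[framePointer + OffsetB];
        return new[] { readA, readB };
    }

    public static void Main()
    {
        int[] values = Run();
        Console.WriteLine($"a={values[0]}, b={values[1]}");  // a=1, b=5
    }
}
```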

When it comes to the CLR, it's better to think of local variables as numbered 'slots', like mailboxes. Whether the values stored in those 'slots' wind up in the method's stack frame (others here have covered that concept), are stored in CPU registers, or are even optimized out completely is a jitter detail. For more information, see the IL Stloc instruction.

It's better to think about the CLR running an execution stack, with values being popped and pushed based on instructions being executed. The underlying details of how the managed code is jitted and executed on the CPU are a separate matter, which is where traditional stack frames, registers and pointer dereferencing come back into play. From the perspective of the CLR at the IL level, however, these things are (mostly) immaterial.
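A rough sketch of that model: a toy interpreter with numbered local slots and a separate evaluation stack. MiniIl and the instruction names "ldc", "stloc", "ldloc" and "print" are simplified stand-ins invented here (only loosely modelled on the real IL opcodes), and "print" replaces the call to Console.WriteLine.

```csharp
using System;
using System.Collections.Generic;

public static class MiniIl
{
    // Runs a tiny, made-up instruction set and returns what was "printed".
    public static List<int> Run(IEnumerable<(string op, int arg)> program, int localCount)
    {
        var locals = new int[localCount];   // numbered slots, readable at any time
        var evalStack = new Stack<int>();   // only the top value is accessible
        var printed = new List<int>();

        foreach (var (op, arg) in program)
        {
            switch (op)
            {
                case "ldc":   evalStack.Push(arg); break;            // push a constant
                case "stloc": locals[arg] = evalStack.Pop(); break;  // pop into slot 'arg'
                case "ldloc": evalStack.Push(locals[arg]); break;    // push a copy of slot 'arg'
                case "print": printed.Add(evalStack.Pop()); break;   // consume the top value
                default: throw new InvalidOperationException($"unknown op {op}");
            }
        }
        return printed;
    }

    public static void Main()
    {
        var program = new (string, int)[]
        {
            ("ldc", 1), ("stloc", 0),     // int a = 1;
            ("ldc", 5), ("stloc", 1),     // int b = 5;
            ("ldloc", 0), ("print", 0),   // Console.WriteLine(a);
            ("ldloc", 1), ("print", 0),   // Console.WriteLine(b);
        };
        foreach (int value in Run(program, 2))
        {
            Console.WriteLine(value);     // prints 1, then 5
        }
    }
}
```

Slot 1 (b) is never touched while slot 0 (a) is being read, so the question of what happens to b does not come up in this model.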

There are four related, but distinct concepts: local variables in C#, local variables in CIL, the stack in CIL and the native stack.

Note that how C# locals map to CIL, and how CIL locals and the CIL stack map to native memory, is implementation-defined, so you shouldn't rely on any of this.

You know what C# locals are. They can be represented as CIL locals, but they usually don't go onto the CIL stack (there could be some optimizations in the C# compiler that do so). But there are also a few other options: the local can be optimized away completely if it's not needed, or it could be compiled as a field in a class with an unspeakable name (closure variables of lambdas, variables in iterator (yield) methods or async methods). Also, even if some C# locals are compiled as CIL locals, they don't have to map 1:1, because one CIL local can be reused for several C# locals if the compiler knows that doing so is safe.
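The closure case is easy to observe from C# itself. In this sketch (ClosureDemo and MakeCounter are illustrative names), count is written like a local, yet it keeps its value after MakeCounter has returned, which would not be possible if it lived in that method's stack frame.

```csharp
using System;

public static class ClosureDemo
{
    public static Func<int> MakeCounter()
    {
        int count = 0;          // written as a local, but captured by the lambda
        return () => ++count;   // the variable must outlive this method call
    }

    public static void Main()
    {
        Func<int> next = MakeCounter();
        Console.WriteLine(next());  // 1
        Console.WriteLine(next());  // 2
    }
}
```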

In CIL, there are local variables and there is the stack. Local variables are completely separate from the stack and there are different CIL instructions for working with each of them. Local variables are used to keep values that are needed for a longer time, and each local can be accessed at any time. The CIL stack contains mostly values that are being used right now: parameters to instructions and their return values. In the stack, only the top value can be accessed.

Both CIL locals and the CIL stack are actually placed on the native stack, but they are often just in registers, if they fit. And of course the JIT compiler can do any other optimizations. As others said, any value in the current method's stack can be accessed at any time, not just the top.

Licensed under: CC-BY-SA with attribution
