Question

I have three years' experience working full time with .NET (C# and VB). I have a good working knowledge of MSIL and I can use it as a debugging tool.

I don't know much about the next step of the compilation process, i.e. when the JIT compiler (the "jitter") produces the assembly code displayed in the Disassembly window. Hans Passant posted an answer to the question "What is the difference between native code, machine code and assembly code?". My more experienced colleague said it is a brilliant answer, but I still don't understand the following code:

static void Main(string[] args) {
            Console.WriteLine("Hello world");
00000000 55                push        ebp                           ; save stack frame pointer
00000001 8B EC             mov         ebp,esp                       ; setup current frame
00000003 E8 30 BE 03 6F    call        6F03BE38                      ; Console.Out property getter
00000008 8B C8             mov         ecx,eax                       ; setup "this"
0000000a 8B 15 88 20 BD 02 mov         edx,dword ptr ds:[02BD2088h]  ; arg = "Hello world"
00000010 8B 01             mov         eax,dword ptr [ecx]           ; TextWriter reference
00000012 FF 90 D8 00 00 00 call        dword ptr [eax+000000D8h]     ; TextWriter.WriteLine()
00000018 5D                pop         ebp                           ; restore stack frame pointer
        }
00000019 C3                ret                                       ; done, return

Can anyone explain what happens on each line and, in particular, why each register is chosen, e.g. why eax is chosen instead of edx? Alternatively, can anyone recommend a book?

Solution

I'm a bit rusty with this, but I'm also interested in the low level assembly side of things. Here goes:

push ebp; save stack frame pointer

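Push the value stored in EBP onto the stack. EBP still holds the calling method's frame pointer at this point, so saving it lets us restore the caller's frame when we return. (The return address itself was already pushed by the call instruction that got us here.)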

mov ebp,esp; setup current frame

Copy the current stack pointer from ESP into EBP, so that EBP now marks the base of the current method's frame.

The preceding two lines of code are a convention that ensures there's a fixed position (stored in the EBP register) on the stack determining relative location of local variables.
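
To make the prologue and epilogue concrete, here is a minimal sketch in C of a toy stack machine. Everything in it (the Cpu struct, the function names, the 16-slot stack) is made up for illustration, and ESP is modelled as a slot index rather than a byte address, so a push moves it by 1 instead of by 4.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the x86 stack: it grows toward lower addresses. */
enum { STACK_WORDS = 16 };

typedef struct {
    uint32_t mem[STACK_WORDS]; /* stack memory, one slot per 32-bit word */
    uint32_t esp;              /* index of the current top-of-stack slot */
    uint32_t ebp;              /* frame pointer of the running method */
} Cpu;

static void cpu_init(Cpu *c, uint32_t caller_ebp) {
    c->esp = STACK_WORDS;      /* empty stack: ESP sits just past the end */
    c->ebp = caller_ebp;       /* pretend the caller already has a frame */
}

static void push(Cpu *c, uint32_t value) {
    assert(c->esp > 0);        /* guard against overflowing the toy stack */
    c->esp -= 1;               /* pushing moves ESP down */
    c->mem[c->esp] = value;
}

static uint32_t pop(Cpu *c) {
    assert(c->esp < STACK_WORDS); /* guard against popping an empty stack */
    return c->mem[c->esp++];      /* popping moves ESP back up */
}

/* Models: push ebp / mov ebp,esp */
static void prologue(Cpu *c) {
    push(c, c->ebp);           /* save the caller's frame pointer */
    c->ebp = c->esp;           /* the new frame starts at the saved value */
}

/* Models: pop ebp (valid because the method left ESP where the prologue put it) */
static void epilogue(Cpu *c) {
    c->ebp = pop(c);           /* the caller's frame pointer is back in EBP */
}
```

After prologue() the slot at mem[ebp] holds the caller's EBP, and after epilogue() both EBP and ESP are back to the values they had on entry, which is exactly what ret needs in order to find the return address on top of the stack.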

call 6F03BE38; Console.Out property getter

No prizes for guessing that this is a call to the Console.Out property getter.

mov ecx,eax; setup "this"

Values returned from methods are stored in EAX, which is a matter of the calling convention. Thus the value returned from Console.Out (a reference to a TextWriter) is in EAX. Here, that value is copied to ECX, the register in which the this reference is passed to the upcoming WriteLine call. That also frees EAX for other purposes.

mov edx,dword ptr ds:[02BD2088h]; arg = "Hello world"

EDX is loaded with a reference to the string "Hello world". dword ptr ds:[02BD2088h] means: read the 32-bit value stored at the memory location ds:[02BD2088h], where ds is the data segment register. Windows uses a flat memory model in which ds has a base of zero, so 02BD2088h is effectively an absolute address. The value stored there is the reference to the string object, which becomes the argument to WriteLine.
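
The difference between reading the value stored at an address and copying the address itself can be sketched in C. This is only an analogy: string_slot is a hypothetical stand-in for the memory location 02BD2088h, and a C string literal is not a .NET string object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the fixed memory slot at 02BD2088h:
   the slot does not hold the characters, it holds a reference to them. */
static const char *const string_slot = "Hello world";

/* Models: mov edx, dword ptr ds:[slot]
   Reads the value stored in the slot, i.e. the reference to the string. */
static const char *load_from_slot(const char *const *slot) {
    assert(slot != NULL);
    return *slot;
}

/* Models: mov edx, slot (an immediate operand)
   Copies the slot's own address; no memory is read. */
static const void *address_of_slot(const char *const *slot) {
    return (const void *)slot;
}
```

Calling load_from_slot(&string_slot) yields a pointer to the characters, whereas address_of_slot(&string_slot) yields the location of the slot. The square brackets in the disassembly correspond to the * in the first function.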

mov eax,dword ptr [ecx]; TextWriter reference

Remember that Console.Out call? We put its return value, a reference to the TextWriter object, into ECX. Here ECX is dereferenced: the 32-bit value stored at the address held in ECX is loaded into EAX. That value is the first field of the object, which is a pointer to the type's method table. Had we written mov eax,ecx instead, EAX would simply hold another copy of the reference to the TextWriter object, not the value stored at that address. (I still get confused with that myself.)

call dword ptr [eax+000000D8h]; TextWriter.WriteLine()

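Here the call to TextWriter.WriteLine() is made indirectly: the address to call is read from memory at offset D8h from the address in EAX. I'm assuming that TextWriter.WriteLine() uses a __fastcall-style calling convention (a good explanation of calling conventions can be found here), in which the first two arguments are passed in ECX and EDX. That is why the this reference was placed in ECX and the string in EDX.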

pop ebp; restore stack frame pointer

Pop the value at the top of the stack into EBP (the top is the lowest address in use, as the stack grows downwards), so that EBP again holds the frame pointer of the calling method.

ret

Return to the location found at the top of the stack, back into the calling method. In this case, as it's Main() being called, control will be returned to system code and the application will exit.

Other tips

The first two lines are used to set up the stack frame. The EBP register is used to store the base address of the frame of the current method.

push        ebp

saves the base address of the calling method's frame on the stack. This is restored just before the function exits with the

pop         ebp

instruction before the final ret instruction.

As the comment suggests, the

call        6F03BE38

instruction calls the Console.Out property getter. This is a static method, so the address of the method could have been inserted directly by the JIT when the current method was compiled.
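
The machine-code bytes in the listing show how that address is encoded. E8 is the opcode for a near call with a 32-bit displacement, stored little-endian and measured from the start of the next instruction. A small sketch in C (the function name is made up for illustration) reproduces the target shown in the listing, using the addresses as the disassembly window displays them:

```c
#include <assert.h>
#include <stdint.h>

/* Decode the target of an x86 near call: opcode E8 followed by a
   little-endian 32-bit displacement. `insn` points at the E8 byte and
   `insn_addr` is the address shown for that instruction. */
static uint32_t call_rel32_target(const uint8_t *insn, uint32_t insn_addr) {
    assert(insn[0] == 0xE8);
    uint32_t rel = (uint32_t)insn[1]
                 | ((uint32_t)insn[2] << 8)
                 | ((uint32_t)insn[3] << 16)
                 | ((uint32_t)insn[4] << 24);
    /* The displacement is relative to the NEXT instruction, 5 bytes on. */
    return insn_addr + 5u + rel;
}
```

For the bytes E8 30 BE 03 6F at offset 00000003, the displacement is 6F03BE30h and the next instruction is at 00000008h, so the target is 6F03BE38h, matching the operand printed next to the call.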

Functions in the Windows API usually use the __stdcall calling convention. A calling convention specifies how arguments are passed to a function (on the stack or in registers), in which order they are pushed onto the stack (left-to-right or right-to-left), and who is responsible for cleaning up the stack after the call (the caller or the callee). Since there are no arguments, it's not clear what the convention is for the getter, but the return value is evidently placed in the EAX register.

The following three lines set up the call to TextWriter.WriteLine.

The line:

mov         ecx,eax

moves the value in EAX into ECX. EAX contains the value returned from the Console.Out getter.

The line

mov         edx,dword ptr ds:[02BD2088h]

loads the reference to the string "Hello world", which is stored at address 02BD2088h, into the EDX register.

The line

mov         eax,dword ptr [ecx]

copies the 32-bit value at the address held in ECX into EAX. ECX contains the value returned from Console.Out. Since TextWriter is a reference type, this value is a pointer to an object stored on the heap. All objects have an object header, consisting of a sync block index and a pointer to the method table. The reference points directly at the method table pointer (the sync block index sits just before it). Therefore, [ECX] is the method table pointer of the TextWriter object the method is to be invoked on, and after this instruction EAX holds the address of that method table.

Finally the method is called with

call        dword ptr [eax+000000D8h]

The 000000D8h is an offset into the method table which corresponds to the TextWriter.WriteLine method.

Since the this pointer and the string argument are passed in ECX and EDX, this method appears to use a __fastcall-style convention.
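
The last two instructions amount to a virtual call through a table of code pointers. Here is a rough model in C. All the names (Writer, MethodTable, count_line, WRITE_LINE_SLOT) are hypothetical, and the real CLR method table has additional fields before the method slots, which is presumably why the real offset is D8h rather than a small number. The sketch only shows the shape of the dispatch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Writer Writer;

/* A virtual method takes `this` first (ECX), then the argument (EDX). */
typedef size_t (*WriteLineFn)(Writer *self, const char *text);

/* Hypothetical method table: one code pointer per virtual method. */
typedef struct {
    WriteLineFn slots[2];
} MethodTable;

/* The object's first field is the pointer to its type's method table,
   so a reference to the object is also the address of that pointer. */
struct Writer {
    const MethodTable *method_table;
    size_t chars_written;
};

/* Stand-in for WriteLine: counts characters instead of printing them. */
static size_t count_line(Writer *self, const char *text) {
    size_t n = 0;
    while (text[n] != '\0') n++;
    self->chars_written += n;
    return n;
}

/* Placeholder for some other virtual method occupying slot 0. */
static size_t other_method(Writer *self, const char *text) {
    (void)self; (void)text;
    return 0;
}

static const MethodTable writer_table = { { other_method, count_line } };

enum { WRITE_LINE_SLOT = 1 };  /* plays the role of the D8h offset */

static size_t call_write_line(Writer *ecx, const char *edx) {
    assert(ecx != NULL);
    const MethodTable *eax = ecx->method_table;    /* mov eax,dword ptr [ecx] */
    return eax->slots[WRITE_LINE_SLOT](ecx, edx);  /* call dword ptr [eax+offset] */
}
```

With a Writer w initialised as { &writer_table, 0 }, call_write_line(&w, "Hello world") goes through the table and lands in count_line. Because the slot is looked up at run time, a derived type can supply a different table and the same two instructions will call its override.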

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow