Why can this piece of code get the function address from the return address?

https://stackoverflow.com/questions/20929834

24-09-2022

Question

return_address is obtained by writing a small piece of assembly code that reads ebp; the return address is then found at ebp + 4. Here return_address is of type int, but we can cast it to int*.

    int extract_function_address(int return_address) {
        int *offset_address_ptr = (int*)(return_address - 5 + 1);
        int offset = *offset_address_ptr;
        int func_address = return_address + offset;

        return func_address;
    }

I used gdb to step through it:

(gdb) disas bar
Dump of assembler code for function bar:
   0x08048304 <+0>: push   %ebp
   0x08048305 <+1>: mov    %esp,%ebp
   0x08048307 <+3>: sub    $0x8,%esp
   0x0804830a <+6>: mov    0xc(%ebp),%eax
   0x0804830d <+9>: mov    0x8(%ebp),%edx
   0x08048310 <+12>:    add    %edx,%eax
   0x08048312 <+14>:    mov    %eax,-0x4(%ebp)
   0x08048315 <+17>:    mov    -0x4(%ebp),%eax
   0x08048318 <+20>:    mov    %eax,0x8(%ebp)
   0x0804831b <+23>:    mov    0x81e2460,%eax
   0x08048320 <+28>:    mov    %eax,(%esp)
   0x08048323 <+31>:    call   0x8048358 <traceback>
   0x08048328 <+36>:    leave  
   0x08048329 <+37>:    ret    
End of assembler dump.


(gdb) disas foo
Dump of assembler code for function foo:
   0x0804832a <+0>: push   %ebp
   0x0804832b <+1>: mov    %esp,%ebp
   0x0804832d <+3>: sub    $0x8,%esp
   0x08048330 <+6>: movl   $0x11,0x4(%esp)
   0x08048338 <+14>:    movl   $0x5,(%esp)
   0x0804833f <+21>:    call   0x8048304 <bar>
   0x08048344 <+26>:    leave  
   0x08048345 <+27>:    ret    
End of assembler dump.

I passed the return address 0x08048344 to the function. The offset is -64 and the return value is 0x8048304, which is the starting address of bar.

Why does this work?

This is the C file where bar and foo are defined:

#include "traceback.h"
#include <stdio.h>

void bar(int x, int y)
{
  int z;
  z = x + y;
  traceback(stdout);
}

void foo() {
  bar (5,17);
}

int main (int argc, char **argv)
{
  foo();
  return 0;
}

I put that piece of code in traceback(FILE *fp).

Solution

A near relative call instruction assembles to E8 AA BB CC DD, where AA BB CC DD is the signed 32-bit offset of the target function from the instruction following the call, i.e. from the return address. Try x/5bx 0x0804833f in gdb to see the encoded instruction. Note that the offset is stored in little-endian byte order, so for the call in foo the bytes are e8 c0 ff ff ff, which encodes 0xffffffc0 = -64.

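Therefore, (return_address - 5 + 1) points to the offset field of the call instruction: subtracting 5 goes back to the start of the 5-byte call, and adding 1 skips the E8 opcode. offset = *offset_address_ptr reads this offset from the call instruction, and return_address + offset is the address of the target function. This only works when the caller used this direct, relative form of call; an indirect call (for example through a function pointer) is encoded differently.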
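The decoding step can be tried without reading live code memory. The following is a minimal sketch, not the original poster's code: decode_call_target is a hypothetical helper that takes the five instruction bytes as an array, checks for the E8 opcode, rebuilds the little-endian displacement byte by byte, and adds it to the return address. It assumes 32-bit addresses, as in the question.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): given the 5 bytes of a
 * near relative call (E8 rel32) and the return address that the call pushes,
 * compute the address of the called function.
 * Returns 0 if the first byte is not the E8 opcode. */
uint32_t decode_call_target(const uint8_t insn[5], uint32_t return_address)
{
    if (insn[0] != 0xE8)
        return 0;  /* not a rel32 call, e.g. an indirect call */

    /* Rebuild the 32-bit displacement from its little-endian bytes. */
    uint32_t disp = (uint32_t)insn[1]
                  | ((uint32_t)insn[2] << 8)
                  | ((uint32_t)insn[3] << 16)
                  | ((uint32_t)insn[4] << 24);

    /* The displacement is relative to the instruction after the call,
     * which is the return address. Unsigned addition wraps modulo 2^32,
     * so a negative displacement such as 0xffffffc0 (-64) moves backwards. */
    return return_address + disp;
}
```

With the bytes of the call in foo (e8 c0 ff ff ff) and the return address 0x08048344, this yields 0x08048304, the start of bar, matching the gdb session in the question. Checking the opcode first is a design choice for the sketch: the original function dereferences the bytes unconditionally and would return a meaningless address for any other call encoding.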

OTHER TIPS

I'm not sure, but it looks like the code fetches the call address from the instruction just before the return location.

Licensed under: CC-BY-SA with attribution
