Using C++ with assembly to allocate and create new functions at runtime

https://stackoverflow.com/questions/10456245

05-06-2021

Question

I've been working on a C++ project that requires functions to be created entirely at run time: allocate a buffer with malloc/new, make it executable with mprotect, and then write machine code into the buffer by hand. This made me wonder what exactly this "buffer" of mine must contain for it to be a replica of any other cdecl function. For example:

int ImAcDeclFunc(int a, int b)
{
     return a + b;
}

If I would like to literally create a duplicate of this function, but completely dynamically, what would that require (and remember it's C++ with inline assembly)? For starters, I guess I would have to do something like this (or a similar solution):

// My main....
byte * ImAcDeclFunc = new byte[memory];
mprotect(Align(ImAcDeclFunc), pageSize, PROT_EXEC | PROT_READ | PROT_WRITE);

After this I would have to find out the assembly code for ImAcDeclFunc(int a, int b). I'm still lousy at assembly, so what would this function look like in AT&T syntax? Here's my bold attempt:

push %ebp
movl %%ebp, %%esp
movl 8(%ebp), %%eax
movl 12(%ebp), %%edx
addl edx, eax
pop ebp
ret

Now if this code is correct (which I highly doubt, please correct me) would I only need to find this code's value in hex (for example, 'jmp' is 0xE9 and 'inc' is 0xFE), and use these values directly in C++? If I continue my previous C++ code:

*ImAcDeclFunc = 'hex value for push'; // This is 'push' from the first line
*(uint)(ImAcDeclFunc + 1) = 'address to push'; // This is %ebp from the first line
*(ImAcDeclFunc + 5) = 'hex value for movl' // This is movl from the second line
// and so on...

After I've done this for the whole code/buffer, would that be enough for a completely dynamic cdecl function (i.e., could I just cast it to a function pointer and do int result = ((int (*)(int, int))ImAcDeclFunc)(firstArg, secondArg)?). I'm not interested in using boost::function or something similar; I need the function to be completely dynamic, hence my interest :)

NOTE: This question is a continuation on my previous one, but with far more specifics.

Solution

If you take this lala.c:

int ImAcDeclFunc(int a, int b)
{
    return a + b;
}

int main(void)
{
    return 0;
}

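You can compile it with gcc -Wall lala.c -o lala. You can then disassemble the executable with objdump -Dslx lala >> lala.txt. On an x86-64 machine, where the two int arguments arrive in %edi and %esi rather than on the stack, you will find ImAcDeclFunc is assembled to: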

00000000004004c4 <ImAcDeclFunc>:
ImAcDeclFunc():
  4004c4:   55                      push   %rbp
  4004c5:   48 89 e5                mov    %rsp,%rbp
  4004c8:   89 7d fc                mov    %edi,-0x4(%rbp)
  4004cb:   89 75 f8                mov    %esi,-0x8(%rbp)
  4004ce:   8b 45 f8                mov    -0x8(%rbp),%eax
  4004d1:   8b 55 fc                mov    -0x4(%rbp),%edx
  4004d4:   8d 04 02                lea    (%rdx,%rax,1),%eax
  4004d7:   c9                      leaveq 
  4004d8:   c3                      retq

This function is relatively easy to copy elsewhere because it only touches registers and its own stack frame. In this case, you are perfectly correct in saying that you can copy the bytes and it would just work.

Problems will occur when you start to use instructions that carry a relative offset as part of their encoding, for example a relative jump or a relative call. In these cases, you need to relocate the instruction properly unless you happen to be able to copy it to the same address as where it was originally.

Briefly, to relocate, you need to find where the code was originally based, calculate the difference to where you are going to base it, and adjust each relative instruction by this offset. Relative branches whose target lies inside the copied block stay valid; only those that point outside it need adjusting. This in itself is feasible. Your real difficulty is handling calls to other functions, particularly calls into libraries. In this case you will need to guarantee that the library is linked and then call it in the way defined by the executable format you are targeting. This is highly non-trivial. If you are still interested I can point you in the direction of where you should be reading up on for this.

In your simple case above, you can do this:

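#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Machine code of ImAcDeclFunc, copied from the objdump listing above.
       unsigned char, because several bytes are larger than 0x7f. */
    unsigned char func[] = {0x55, 0x48, 0x89, 0xe5, 0x89, 0x7d, 0xfc,
    0x89, 0x75, 0xf8, 0x8b, 0x45, 0xf8,
    0x8b, 0x55, 0xfc, 0x8d, 0x04, 0x02,
    0xc9, 0xc3};

    /* Anonymous mapping that is readable, writable and executable.
       The fd argument is -1 for portability with MAP_ANONYMOUS. */
    void *mem = mmap(NULL, sizeof(func),
        PROT_WRITE | PROT_READ | PROT_EXEC,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        perror("mmap");
        return EXIT_FAILURE;
    }

    memcpy(mem, func, sizeof(func));

    /* Treat the copied bytes as a function taking two ints */
    int (*func_copy)(int, int) = (int (*)(int, int))mem;
    printf("1 + 2 = %d\n", func_copy(1,2));

    munmap(mem, sizeof(func));
    return EXIT_SUCCESS;
}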

This works fine on x86-64. It prints:

1 + 2 = 3

OTHER TIPS

You might want to check out GNU lightning: http://www.gnu.org/software/lightning/. It might help you with what you are trying to do.

I think it would be a better idea to embed a scripting language in your project instead of writing a self-modifying program. It will take less time and you will gain greater flexibility.

"If I would like to literally create a duplicate of this function, but completely dynamically, what would that require (and remember it's C++ with inline assembly)?"

It would require a human with a disassembler. Technically, a function should start at one address and end at a return instruction. However, you cannot know what exactly the compiler did with the function during the optimization phase. I wouldn't be surprised if the function's entry point were located in some weird place (like at the end of the function, after the return instruction), or if the function were split into multiple parts that are shared with other functions.

Licensed under: CC-BY-SA with attribution
