Question

I'm trying to write some self-modifying code in C for MIPS.

Since I want to modify the code later on, I'm writing actual machine instructions (as opposed to inline assembly) and trying to execute them. Someone told me it would be possible to just malloc some memory, write the instructions there, point a C function pointer at it and then jump to it. (I include the example below.)

I've tried this with my cross compiler (Sourcery CodeBench toolchain) and it doesn't work (yes, in hindsight it does seem rather naive). How can I do this properly?

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>


void inc(){
    int i = 41;
    uint32_t *addone = malloc(sizeof(*addone) * 2); //we malloc space for our asm function
    *(addone) = 0x20820001; // this is addi $v0 $a0 1, which adds one to our arg (gcc calling con)
    *(addone + 1) = 0x23e00000; //this is jr $ra

    int (*f)(int x) = addone; //our function pointer
    i = (*f)(i);
    printf("%d",i);    
}

int main(){
    inc();
    exit(0);
}

I follow the gcc calling convention here, where arguments are passed in $a0 and function results are expected in $v0. I don't actually know whether the return address will be put into $ra (I can't test that yet, since I can't compile). I use a 32-bit int for my instructions because I'm compiling for MIPS32, so 32 bits should be enough.

Solution

The OP's code as written compiles without errors with Codesourcery mips-linux-gnu-gcc.

As others have mentioned above, self-modifying code on MIPS requires the instruction cache to be synchronized with the data cache after the code is written. Release 2 of the MIPS32 architecture (MIPS32R2) added SYNCI, a user-mode instruction that does what you need here. All modern MIPS CPUs implement MIPS32R2, including SYNCI.
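For comparison, GCC and Clang also offer the builtin `__builtin___clear_cache`, which asks the compiler to emit whatever instruction-cache synchronization the target needs, so the SYNCI sequence does not have to be written by hand. This is only a sketch: the helper name `emit_add_one` is made up for illustration, and the buffer is assumed to be writable now and executable by the time it is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes a tiny MIPS32 routine equivalent to
 * int f(int x) { return x + 1; } into buf (at least 3 words long)
 * and returns the number of words written. */
size_t emit_add_one(uint32_t *buf)
{
    assert(buf != NULL);

    buf[0] = 0x20820001u; /* addi $v0, $a0, 1        */
    buf[1] = 0x03e00008u; /* jr   $ra                */
    buf[2] = 0x00000000u; /* nop (branch delay slot) */

    /* GCC/Clang builtin: synchronize the range [begin, end) for
     * instruction fetch. On targets with coherent caches this may
     * compile to nothing at all. */
    __builtin___clear_cache((char *)buf, (char *)(buf + 3));
    return 3;
}
```

Because the builtin compiles everywhere, the same source can be built on a non-MIPS host, although the words written are of course only meaningful to a MIPS CPU.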

Memory protection is optional on MIPS, and most MIPS CPUs are built without it, so the mprotect system call is probably unnecessary on most real MIPS hardware.

Note that if you use any optimization besides -O0 the compiler can and does optimize away the stores to *addone and the function call, which breaks your code. Using the volatile keyword prevents the compiler from doing this.

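The following code differs from the question in two further ways: jr $ra is encoded as 0x03e00008 (0x23e00000 actually decodes as addi $zero, $ra, 0), and a nop fills the branch delay slot after the jump. It compiles to the expected MIPS assembly, but I don't have MIPS hardware handy to test it on:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int inc() {
    volatile int i = 41;
    // Allocate 32 bytes aligned to 32 bytes (the assumed cache line size)
    // so that a single SYNCI covers the whole function; plain malloc
    // does not guarantee cache-line alignment.
    void *mem;
    if (posix_memalign(&mem, 32, 32) != 0)
        return -1;
    volatile uint32_t *addone = mem;
    addone[0] = 0x20820001; // addi $v0, $a0, 1
    addone[1] = 0x03e00008; // jr $ra
    addone[2] = 0x00000000; // nop in the branch delay slot
    // SYNCI writes the line back from the D cache and invalidates any
    // stale copy in the I cache; SYNC waits for that to complete.
    asm volatile("synci 0(%0)\n\tsync" : : "r" (addone) : "memory");
    int (*f)(int x) = (int (*)(int))addone; // our function pointer
    int j = f(i);
    return j;
}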

int main(){
    int k = 0;
    k = inc();
    printf("%d",k);    
    exit(0);
}
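A quick way to double-check hand-written instruction words like the ones above is to build them from their fields. The sketch below follows the standard MIPS32 I-type and R-type layouts; the function and enum names are my own and not part of any toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* A few register numbers from the MIPS o32 convention. */
enum { REG_ZERO = 0, REG_V0 = 2, REG_A0 = 4, REG_RA = 31 };

/* I-type layout: opcode(6) | rs(5) | rt(5) | immediate(16). */
uint32_t mips_itype(unsigned opcode, unsigned rs, unsigned rt, uint16_t imm)
{
    assert(opcode < 64 && rs < 32 && rt < 32);
    return ((uint32_t)opcode << 26) | ((uint32_t)rs << 21) |
           ((uint32_t)rt << 16) | imm;
}

/* R-type layout: opcode 0 | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6). */
uint32_t mips_rtype(unsigned rs, unsigned rt, unsigned rd,
                    unsigned shamt, unsigned funct)
{
    assert(rs < 32 && rt < 32 && rd < 32 && shamt < 32 && funct < 64);
    return ((uint32_t)rs << 21) | ((uint32_t)rt << 16) |
           ((uint32_t)rd << 11) | ((uint32_t)shamt << 6) | funct;
}

/* addi rt, rs, imm uses opcode 0x08. */
uint32_t mips_addi(unsigned rt, unsigned rs, int16_t imm)
{
    return mips_itype(0x08, rs, rt, (uint16_t)imm);
}

/* jr rs is R-type with funct 0x08. */
uint32_t mips_jr(unsigned rs)
{
    return mips_rtype(rs, 0, 0, 0, 0x08);
}
```

With these helpers, `mips_addi(REG_V0, REG_A0, 1)` gives 0x20820001 and `mips_jr(REG_RA)` gives 0x03e00008, whereas 0x23e00000 is what `mips_addi(REG_ZERO, REG_RA, 0)` produces.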

Other tips

You are using pointers inappropriately. Or, to be more accurate, you aren't using pointers where you should be.

Try this on for size:

uint32_t *addone = malloc(sizeof(*addone) * 3);
addone[0] = 0x20820001; // addi $v0, $a0, 1
addone[1] = 0x03e00008; // jr $ra
addone[2] = 0x00000000; // nop (branch delay slot)

int (*f)(int x) = (int (*)(int))addone; // our function pointer
i = (*f)(i);
printf("%d\n",i);

You may also need to set the memory as executable after writing to it, but before calling it:

mprotect(addone, sizeof(int) * 2, PROT_READ | PROT_EXEC);

To make this work, you will likely need to allocate a considerably larger block of memory (4k or so) that is page-aligned, because mprotect requires a page-aligned address and changes the protection of whole pages.

You also need to make sure that the memory in question is executable, and that it gets flushed properly from the dcache after writing it and loaded into the icache before executing it. How to do that depends on the OS running on your MIPS machine.

On Linux, you would use the mprotect system call to make the memory executable, and the cacheflush system call to do the cache flushing.
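As a rough sketch of that sequence, the helper below maps a fresh page, copies the code in, and then switches the page from read/write to read/execute. The name `make_executable_copy` is hypothetical. I've used the GCC/Clang builtin `__builtin___clear_cache` in place of the cacheflush system call so that the snippet also compiles on non-MIPS hosts; on Linux/MIPS you could call cacheflush at the same point instead.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on strict -std= settings */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: copies len bytes of machine code into a freshly
 * mapped page and makes that page read/execute. Returns the page address,
 * or NULL on failure. Code longer than one page is rejected. */
void *make_executable_copy(const void *code, size_t len)
{
    assert(code != NULL);

    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0 || len == 0 || len > (size_t)pagesize)
        return NULL;

    /* mmap returns page-aligned memory that malloc does not share. */
    void *page = mmap(NULL, (size_t)pagesize, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return NULL;

    memcpy(page, code, len);

    /* Synchronize the caches before the code is ever executed. */
    __builtin___clear_cache((char *)page, (char *)page + len);

    /* Drop write permission and add execute permission. */
    if (mprotect(page, (size_t)pagesize, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, (size_t)pagesize);
        return NULL;
    }
    return page;
}
```

Since the page comes from mmap rather than malloc, removing write permission cannot interfere with the allocator's own bookkeeping.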

Example:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/cachectl.h>   // cacheflush(), ICACHE, DCACHE (Linux/MIPS)

// round P down to the start of the page that contains it
#define PALIGN(P)  ((char *)((uintptr_t)(P) & ~(pagesize-1)))
uintptr_t  pagesize;

void inc(){
    int i = 41;
    uint32_t *addone = malloc(sizeof(*addone) * 3); //we malloc space for our asm function
    addone[0] = 0x20820001; // addi $v0, $a0, 1: adds one to our arg (gcc calling convention)
    addone[1] = 0x03e00008; // jr $ra
    addone[2] = 0x00000000; // nop in the branch delay slot

    pagesize = sysconf(_SC_PAGESIZE);  // only needs to be done once
    // cover every page from the one holding the first word to the one holding the last
    mprotect(PALIGN(addone), PALIGN(addone+2)-PALIGN(addone)+pagesize,
             PROT_READ | PROT_WRITE | PROT_EXEC);
    cacheflush(addone, 3*sizeof(*addone), ICACHE|DCACHE);

    int (*f)(int x) = (int (*)(int))addone; //our function pointer
    i = (*f)(i);
    printf("%d",i);    
}

Note that we make the entire page(s) containing the code both writable and executable. That's because memory protection works per page, and we want malloc to be able to continue to use the rest of the page(s) for other things. You could instead use valloc or memalign to allocate entire pages, in which case you could make the code read-only executable safely.
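The page arithmetic behind an mprotect call like this can be sketched as two small helpers. The names `page_floor` and `page_span` are mine, and the page size is assumed to be a power of two, which holds for the values sysconf(_SC_PAGESIZE) returns in practice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round an address down to the start of the page that contains it.
 * pagesize must be a non-zero power of two. */
uintptr_t page_floor(uintptr_t addr, uintptr_t pagesize)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    return addr & ~(pagesize - 1);
}

/* Number of bytes mprotect has to cover, starting at page_floor(addr),
 * so that every page touched by [addr, addr + len) is included.
 * len must be at least 1. */
size_t page_span(uintptr_t addr, size_t len, uintptr_t pagesize)
{
    assert(len > 0);
    uintptr_t first = page_floor(addr, pagesize);
    uintptr_t last  = page_floor(addr + len - 1, pagesize);
    return (size_t)(last - first + pagesize);
}
```

For example, with 4096-byte pages an 8-byte block at 0x12ffc straddles two pages, so the span is 8192 bytes starting at 0x12000.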

Calling a function is much more complicated than just jumping to an instruction.

  • How are arguments passed? Are they stored in registers, or pushed to the call stack?

  • How is a value returned?

  • Where is the return address placed for the return jump? If you have a recursive function, $ra doesn't cut it.

  • Is the caller or the callee responsible for popping the stack frame when the called function completes?

Different calling conventions have different answers to these questions. Though I've never tried anything like what you're doing, I would assume you'd have to write your machine code to match a convention, then tell the compiler that your function pointer uses that convention (different compilers have different ways of doing this - gcc does it with function attributes).
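In the simple case from the question, the function pointer type itself tells the compiler the convention: one int argument and an int result, which under the MIPS o32 ABI means $a0 in and $v0 out. A minimal sketch of wrapping that conversion in one place follows; `unary_int_fn`, `as_unary_int_fn` and `add_one_reference` are illustrative names, and the data-to-function pointer conversion is something POSIX systems support but ISO C does not define.

```c
#include <assert.h>
#include <string.h>

/* The pointer type carries the convention: int argument, int result. */
typedef int (*unary_int_fn)(int);

/* Hypothetical helper: reinterpret a data pointer as a function pointer.
 * Copying the bits with memcpy avoids a direct cast that strict
 * compilers warn about. */
unary_int_fn as_unary_int_fn(void *code)
{
    unary_int_fn fn;
    assert(code != NULL);
    assert(sizeof fn == sizeof code);
    memcpy(&fn, &code, sizeof fn);
    return fn;
}

/* Ordinary compiled stand-in for generated code, so the helper can be
 * exercised on any host without emitting machine instructions. */
int add_one_reference(int x)
{
    return x + 1;
}
```

Generated code called through `unary_int_fn` has to behave exactly like `add_one_reference` would at the machine level: read its argument where the ABI puts it, leave the result where the ABI expects it, and return through $ra.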

Licensed under: CC-BY-SA with attribution