Question

I am trying to create an assembler which is able to encode instructions at runtime (for a JIT compiler). Sorry for the long code snippet, but this is the shortest compilable example which shows my problem.

#include <stdint.h>
#include <iostream>

#include <windows.h>

typedef void (*function)();
uint8_t* instructionBuffer;
uint32_t pos;

/**
 * Creates the instruction buffer;
 */
void assembler_initialize() {
    instructionBuffer = (uint8_t*) VirtualAllocEx(GetCurrentProcess(), 0, 1024,
    MEM_COMMIT, PAGE_EXECUTE_READWRITE);
    pos = 0;
}

/**
 * Writes a call to the given address to the instruction buffer
 */
void assembler_emit_call(uint32_t value) {
    // CALL opcode
    instructionBuffer[pos++] = 0xFF;

    // opcode extension 2, read a 32bit address
    instructionBuffer[pos++] = 0x15;

    // Address as little endian
    instructionBuffer[pos++] = (value >> 0) & 0xFF;
    instructionBuffer[pos++] = (value >> 8) & 0xFF;
    instructionBuffer[pos++] = (value >> 16) & 0xFF;
    instructionBuffer[pos++] = (value >> 24) & 0xFF;
}

/**
 * Writes a RET to the instruction buffer
 */
void assembler_emit_ret() {
    instructionBuffer[pos++] = 0xC3;
}

/**
 * The function to call
 */
void __cdecl myFunction() {
    std::cout << "Hello world!" << std::endl;
}

/**
 *
 */
int main(int argc, char **argv) {
    assembler_initialize();
    assembler_emit_call((uint32_t) &myFunction);
    assembler_emit_ret();

    // Output the address
    std::cout << std::hex << (uint32_t) &myFunction << std::endl;

    // Output the opcodes
    for (uint32_t i = 0; i < 100; i++) {
        std::cout << std::hex << (uint32_t) instructionBuffer[i] << " ";
    }
    std::cout << std::endl;

    // Call the function
    function f = (function) instructionBuffer;
    f();

    return 0;
}

The output tells me that the address of myFunction is 0x4017c5 and that these opcodes were written:

CALL   ModRM   Addr (le)     RET    Zeros
ff     15      c5 17 40 0    c3     0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...

Still, my program crashes when trying to execute the code. Did I miss something when encoding the CALL instruction?


Solution

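It doesn't work because the CALL instruction is encoded incorrectly. x86 has no near CALL that takes an absolute address as an immediate operand; a near call is either relative (E8 rel32) or indirect through a register or a memory operand.

In your example you generate the following x86 code: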

FF 15 xx xx xx xx

which is an indirect call through the address xx xx xx xx. The CPU reads the 32-bit value stored at xx xx xx xx and calls the function at that value.

Example

FF 15 10 20 30 00

This will look at the address 0x00302010:

00302010: 11 22 33 00 xx xx xx xx

where it finds the value 0x00332211 and then calls the function at that address.
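To make the indirection concrete, here is a minimal sketch of an emitter for the FF 15 form used correctly. The helper name emit_call_indirect and its parameters are hypothetical and not part of the question's code. It assumes 32-bit mode, where the four bytes after FF 15 are an absolute address; in 64-bit mode the same encoding is RIP-relative.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: emits `call dword ptr [slot_address]` (FF 15 disp32).
// slot_address is the address of a memory cell that HOLDS the target address,
// not the address of the function itself.
std::size_t emit_call_indirect(std::uint8_t* buf, std::size_t pos,
                               std::uint32_t slot_address) {
    buf[pos++] = 0xFF;  // opcode group 5
    buf[pos++] = 0x15;  // ModRM: mod=00, reg=2 (CALL), rm=101 (disp32)
    // disp32 in little-endian byte order
    for (int shift = 0; shift < 32; shift += 8) {
        buf[pos++] = static_cast<std::uint8_t>((slot_address >> shift) & 0xFF);
    }
    return pos;  // position of the next free byte
}
```

In the question's program this would mean storing the address of myFunction in a variable that outlives the generated code and passing the address of that variable, rather than the address of myFunction, to the emitter.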

With the following modification to assembler_emit_call, the program works:

void assembler_emit_call(uint32_t value) {
    // mov eax, imm32: load the absolute target address into EAX
    instructionBuffer[pos++] = 0xB8;

    // Address as little endian
    instructionBuffer[pos++] = (value >> 0) & 0xFF;
    instructionBuffer[pos++] = (value >> 8) & 0xFF;
    instructionBuffer[pos++] = (value >> 16) & 0xFF;
    instructionBuffer[pos++] = (value >> 24) & 0xFF;

    // call eax (FF /2 with ModRM 0xD0)
    instructionBuffer[pos++] = 0xFF;
    instructionBuffer[pos++] = 0xD0;

    // No RET is emitted here: main() already appends one via assembler_emit_ret().
}
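An alternative that avoids clobbering EAX is the relative near call, E8 rel32, where the operand is the distance from the end of the CALL instruction to the target. The sketch below is an assumption-laden illustration, not the answerer's method: emit_call_rel32, code_base, and the other parameter names are hypothetical, and it assumes 32-bit addresses so the displacement can wrap modulo 2^32.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: emits `call rel32` (E8 followed by a 32-bit displacement).
// code_base is the runtime address of buf[0]; target is the callee's address.
std::size_t emit_call_rel32(std::uint8_t* buf, std::size_t pos,
                            std::uint32_t code_base, std::uint32_t target) {
    // The displacement is measured from the instruction FOLLOWING the call,
    // which starts 5 bytes (1 opcode + 4 operand bytes) after the call itself.
    const std::uint32_t next_instruction =
        code_base + static_cast<std::uint32_t>(pos) + 5u;
    const std::uint32_t rel = target - next_instruction;  // unsigned wrap handles backward calls

    buf[pos++] = 0xE8;
    for (int shift = 0; shift < 32; shift += 8) {
        buf[pos++] = static_cast<std::uint8_t>((rel >> shift) & 0xFF);
    }
    return pos;  // position of the next free byte
}
```

With the question's globals, code_base would be the value of instructionBuffer cast to a 32-bit integer, so the displacement has to be computed after the buffer has been allocated.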

By the way,

    instructionBuffer[pos++] = (value >> 0) & 0xFF;
    instructionBuffer[pos++] = (value >> 8) & 0xFF;
    instructionBuffer[pos++] = (value >> 16) & 0xFF;
    instructionBuffer[pos++] = (value >> 24) & 0xFF;

can be replaced by a single 32-bit store (x86 is little endian and tolerates the unaligned write):

    *(DWORD*)(instructionBuffer + pos) = value;
    pos += 4;
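The pointer cast relies on the compiler tolerating a write through a DWORD* into a byte buffer. A more conservative variant, sketched here with a hypothetical helper name emit_u32, copies the bytes with std::memcpy instead. Like the cast, it assumes a little-endian host, which holds on x86.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: appends a 32-bit value to the buffer in host byte order.
// On a little-endian host (such as x86) this matches the four shifted stores.
std::size_t emit_u32(std::uint8_t* buf, std::size_t pos, std::uint32_t value) {
    std::memcpy(buf + pos, &value, sizeof value);  // no alignment or aliasing concerns
    return pos + sizeof value;                     // position of the next free byte
}
```

Compilers typically turn a fixed-size memcpy like this into a single store, so it should cost no more than the cast.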
Licensed under: CC-BY-SA with attribution