Question

I'm trying to code an exe packer/protector as a way of learning more about assembler, C++, and how PE files work. I've currently got it working so the section containing the EP is XORed with a key and a new section is created that contains my decryption code. Everything works out great except when I try to JMP to the original EP after decryption.

Basically I do this:

DWORD originalEntryPoint = optionalHeader->AddressOfEntryPoint;
// -- snip -- //
crypted.put(0xE9);
crypted.write((char*)&originalEntryPoint, sizeof(DWORD));

But instead of jumping to the entry point, OllyDbg shows that this code disassembles to:

00404030   .-E9 00100000    JMP 00405035 ; should be 00401000 =[

and when I try to change it manually in olly the new opcode shows up as

00404030    -E9 CBCFFFFF    JMP crypted.00401000

Where did 0xCBCFFFFF come from? How would I generate that from the C++ side?


Solution

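E9 is the opcode for a relative jump (jmp rel32): its operand is a signed 32-bit displacement, counted from the start of the next instruction, not an absolute address. Your jmp sits at 00404030 and is 5 bytes long, so an operand of 00001000 lands on 00404035 + 1000 = 00405035. To reach 00401000 the operand must be 00401000 - 00404035 = FFFFCFCB, which is stored little-endian as CB CF FF FF.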

If you want the operand to specify an absolute address, you would need a different opcode.
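As a quick check of how the debugger arrives at its jump targets, here is a minimal sketch that decodes an E9 operand. The function name jmpRel32Target is made up for illustration, and the operand is passed as its raw 32-bit pattern so that unsigned wraparound handles negative displacements.

```cpp
#include <cstdint>

// Target of a 5-byte "E9 rel32" jump located at instrAddr.
// rel32 is the raw 32-bit operand as read from the instruction bytes
// (little-endian); it is added to the address of the NEXT instruction.
// Unsigned arithmetic wraps modulo 2^32, so a negative displacement
// such as 0xFFFFCFCB moves the target backwards.
constexpr std::uint32_t jmpRel32Target(std::uint32_t instrAddr, std::uint32_t rel32) {
    return instrAddr + 5u + rel32;
}
```

With the values from the question, jmpRel32Target(0x00404030, 0x00001000) gives 0x00405035, and jmpRel32Target(0x00404030, 0xFFFFCFCB) gives 0x00401000.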

OTHER TIPS

If you need to jump to an absolute address, you could use:

mov eax,DESTINATION_VA
jmp eax                ; pick any register the destination doesn't care about

or

push DESTINATION_VA
ret                    ; not recommended for performance

This ret will be mispredicted, and so will up to the next 16 ret instructions that return to callers further up the call tree, unless their entries were already pushed off the return-address predictor stack by a deeper call depth. (Current CPUs typically have a 16-entry predictor stack.)


The relative E9 jmp encoding is used like this:

CURRENT_RVA: jmp (DESTINATION_RVA - CURRENT_RVA - 5 [sizeof(E9 xx xx xx xx)])
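On the C++ side, the formula above might be sketched as follows. The function name encodeJmpRel32 and its parameter names are hypothetical; both addresses must be of the same kind (both RVAs or both VAs), since only their difference matters.

```cpp
#include <array>
#include <cstdint>

// Builds the 5 bytes of "jmp rel32" (E9 xx xx xx xx).
// jmpRva:    RVA at which the E9 byte will be placed (hypothetical name)
// targetRva: RVA to jump to, e.g. the original AddressOfEntryPoint
std::array<std::uint8_t, 5> encodeJmpRel32(std::uint32_t jmpRva, std::uint32_t targetRva) {
    // Displacement is measured from the end of the 5-byte instruction.
    // Unsigned subtraction wraps modulo 2^32, which yields the
    // two's-complement bit pattern for backward jumps.
    const std::uint32_t rel = targetRva - (jmpRva + 5u);
    return {{
        0xE9,
        static_cast<std::uint8_t>(rel & 0xFF),          // operand is little-endian
        static_cast<std::uint8_t>((rel >> 8) & 0xFF),
        static_cast<std::uint8_t>((rel >> 16) & 0xFF),
        static_cast<std::uint8_t>((rel >> 24) & 0xFF),
    }};
}
```

For the case in the question, encodeJmpRel32(0x4030, 0x1000) produces E9 CB CF FF FF. The bytes could then be written with something like crypted.write(reinterpret_cast<const char*>(bytes.data()), bytes.size()), assuming crypted is the output stream from the question.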

push + ret is the smallest solution if you have the destination's VA and the image is not relocated, but it's still 6 bytes, so it's larger than a direct jmp rel32.

Register-indirect is probably the most efficient if you can't use a normal direct jmp.
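For comparison, the two absolute-address sequences could be emitted from C++ roughly like this. The helper names are invented for the sketch, and destVa is assumed to be a full virtual address (ImageBase + AddressOfEntryPoint, since AddressOfEntryPoint is an RVA) in an image that is loaded at its preferred base.

```cpp
#include <cstdint>
#include <vector>

// Appends a 32-bit value in little-endian byte order.
void appendLe32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
    }
}

// mov eax, imm32 ; jmp eax  ->  B8 imm32  FF E0  (7 bytes)
std::vector<std::uint8_t> encodeMovJmpEax(std::uint32_t destVa) {
    std::vector<std::uint8_t> out{0xB8};
    appendLe32(out, destVa);
    out.push_back(0xFF);
    out.push_back(0xE0);
    return out;
}

// push imm32 ; ret  ->  68 imm32  C3  (6 bytes)
std::vector<std::uint8_t> encodePushRet(std::uint32_t destVa) {
    std::vector<std::uint8_t> out{0x68};
    appendLe32(out, destVa);
    out.push_back(0xC3);
    return out;
}
```

Both stubs embed an absolute address, so they only work unchanged if the image is not rebased; otherwise the immediate would need a relocation entry or a fix-up at run time.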

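The opcode for an absolute indirect jump through memory is FF 25 followed by a 4-byte address (jmp dword ptr [address]). Those 4 bytes are the address of a memory location that holds the destination, not the destination itself. This form is most often used for jump tables of addresses stored in data.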

Absolute addresses do require relocation when the image is not loaded at the expected address, so relative addresses are generally preferred. A relative jmp rel32 is also smaller: 5 bytes, versus 6 for push/ret and 7 for mov/jmp.

The Intel optimization manual states that the CPU expects call and ret to be used in pairs, so the ret without a call suggested in answer 2 would cause what they call a "performance penalty".

Also, if the code was not loaded to the same address that the compiler assumed, the ret would probably crash the program. It would be safer to calculate a relative address.

Licensed under: CC-BY-SA with attribution