Question

I am trying to finish this PE file made with assembly only, which is supposed to show a message in the console. I want to organize it in such a way that I can easily add more things later (knowing where to add the code, data and imported functions).
I created 4 sections for now, for code, data, uninitialized data and imported elements. My main problems at this stage are:

  1. Some values in the section header make the executable invalid (not a valid Win32 application)
  2. Pointers to elements from the data section are wrong
  3. Some of the calculations that involve preferred absolute address, section alignment and file alignment might be wrong

First, I will display all my code below. Some things that really don't matter are left out to save time and make it easier to read. This is NASM code.

; Constants (use '$' as prefix)
$SECTION_ALIGNMENT equ 4096     ; Each section is aligned to 4096 in memory
$FILE_ALIGNMENT    equ 512      ; Each section is aligned to 512 on disk
$PREFERRED_ADDRESS equ 4194304  ; Preferred address for EXE is 4 MB
$TOTAL_PE_SECTIONS equ 4        ; Code, Data, Bss and IData
; Image size = headers aligned to section alignment + sections size aligned 
; to next multiple of section alignment, everything aligned, too
$IMAGE_SIZE        equ $SECTION_ALIGNMENT + (HEADERS_SIZE/$SECTION_ALIGNMENT) + \
                       $TOTAL_PE_SECTIONS * $SECTION_ALIGNMENT


; Will help us align some of the values to the next specified multiple
%define Round(Number, Multiple)  Multiple+(Number/Multiple)

section .header progbits vstart=0

    ; This is the MZ header
    DOS_HEADER:
        db             "MZ"         ; MZ signature

        ; ...
        ; Here we have all the members of the DOS header, in 4 paragraphs
        ; ...

        db             PE_HEADER    ; The last one is pointing to the PE header


    DOS_STUB:
        ; ...
        ; A small DOS program to display a simple message in DOS (64 bytes)
        ; ...

    ; This is the PE header
    PE_HEADER:
        db             "PE", 0, 0   ; PE signature
        dw              0x014C      ; Platform Intel I386
        dw              $TOTAL_PE_SECTIONS
        dd              1371668450  ; Creation timestamp
        dd              0           ; No symbols table
        dd              0           ; No symbols
        dw              SECTIONS_TABLE - OPT_HEADER  ; Optional header size
        dw              0x0002|0x0004|0x0008|0x0100|0x0200 ; Characteristics

    ; Optional header
    OPT_HEADER:
        dw              0x010B      ; Signature
        db              0           ; Linker version
        db              0           ; Minor linker version
        dd              CODE_SIZE
        dd              DATA_SIZE   ; Initialized data size
        dd              BSS_SIZE    ; Uninitialized data size
        dd              CODE        ; Entry point
        dd              CODE        ; Code RVA
        dd              DATA        ; Data RVA
        dd              $PREFERRED_ADDRESS ; Preferred address in memory
        dd              $SECTION_ALIGNMENT
        dd              $FILE_ALIGNMENT
        dw              4           ; OS version
        dw              0           ; Minor OS version
        dw              0           ; Image version
        dw              0           ; Minor image version
        dw              3           ; Subsystem version
        dw              10          ; Minor subsystem version
        dd              0           ; WIN32 version
        dd              $IMAGE_SIZE ; Image size calculated above
        dd              Round(HEADERS_SIZE, $SECTION_ALIGNMENT) ; Headers size
        dd              0           ; Checksum
        dw              3           ; System interface CUI
        dw              0           ; DLL characteristics
        dd              4096        ; Reserved stack
        dd              4096        ; Still not        ??
        dd              65536       ; sure about       ??
        dd              0           ; these            ??
        dd              0 
        dd              2           ; Data directory entries
        dd              0           ; Export table pointer
        dd              0           ; Export table size
        dd              I_TABLE     ; Import table pointer
        dd              I_TABLE_S   ; Size of import table
        dq              0           ; Reserved

    SECTIONS_TABLE:
        CODE_SECTION_HEADER:
            db          ".code", 0, 0, 0
            dd          Round(CODE_SIZE, $SECTION_ALIGNMENT) ; Size in memory
            dd          CODE
            dd          Round(CODE_SIZE, $FILE_ALIGNMENT) ; Size on disk
            dd          Round(0, $FILE_ALIGNMENT) ; Real start address
            dd          0
            dd          0
            dw          0
            dw          0
            dd          0x00000020|0x20000000|0x40000000|0x80000000

       DATA_SECTION_HEADER:
            db          ".data", 0, 0, 0
            dd          Round(DATA_SIZE, $SECTION_ALIGNMENT) ; Size in memory
            dd          $SECTION_ALIGNMENT * 2
            dd          Round(DATA_SIZE, $FILE_ALIGNMENT) ; Size on disk
            dd          Round(0, $FILE_ALIGNMENT) ; Real start address
            dd          0
            dd          0
            dw          0
            dw          0
            dd          0x00000040|0x40000000|0x80000000

       BSS_SECTION_HEADER:
            db          ".bss", 0, 0, 0, 0
            dd          Round(BSS_SIZE, $SECTION_ALIGNMENT) ; Size in memory
            dd          $SECTION_ALIGNMENT * 3
            dd          0
            dd          0
            dd          0
            dd          0
            dw          0
            dw          0
            dd          0x00000080|0x40000000|0x80000000


       IDATA_SECTION_HEADER:
            db          ".idata", 0, 0
            dd          Round(IDATA_SIZE, $SECTION_ALIGNMENT) ; Size in memory
            dd          $SECTION_ALIGNMENT * 4
            dd          0
            dd          Round(0, $FILE_ALIGNMENT) ; Real start address
            dd          0
            dd          0
            dw          0
            dw          0
            dd          0x00000040|0x40000000|0x80000000

    HEADERS_SIZE equ $$ - DOS_HEADER

    align   512 ; Align to 512 bytes in memory

section .scode vstart=$SECTION_ALIGNMENT align=16

    use32

    CODE:
        push -11
        call dword [$PREFERRED_ADDRESS + F_GetStdHandle]
        push 0  
        push 0x402000
        push 6  
        push $PREFERRED_ADDRESS + hello
        push eax  
        call dword [$PREFERRED_ADDRESS + F_WriteConsole]  
        push -1  
        call dword [$PREFERRED_ADDRESS + F_Sleep]  
        ret

    CODE_SIZE equ $$ - CODE

section .sdata vstart=$SECTION_ALIGNMENT*2 progbits align=4
        DATA:
            hello: db 'Hello!'
        DATA_SIZE equ $$ - DATA

section .sbss vstart=$SECTION_ALIGNMENT*3  align=4
    BSS:
        dd 5
    BSS_SIZE equ $$ - BSS

section .sidata vstart=$SECTION_ALIGNMENT*4 align=4
    IDATA:
        F_Sleep:          dd I_Sleep
        F_WriteConsole:   dd I_WriteConsole
        F_GetStdHandle:   dd I_GetStdHandle
        dd                0

        I_TABLE:  
            .originalfthk     dd 0
            .timedate         dd 0  
            .forwarder        dd 0  
            .name             dd kernel32
            .firstthunk       dd IDATA

        I_TABLE_S equ $$ - I_TABLE

        times 20 db 0

        kernel32:             db 'kernel32.dll', 0
        I_Sleep:           
            dw                0  
            db                'Sleep', 0  
            align             2  
        I_WriteConsole:    
            dw                0  
            db                'WriteConsoleA', 0
            align             2  
        I_GetStdHandle:
            dw                0  
            db                'GetStdHandle', 0

    IDATA_SIZE equ $$ - IDATA

The main problem here is that the executable crashes because the pointers from the code section are wrong. I'm talking about the pointer to the hello message from .sdata and the pointers to the imported functions from the .sidata section. If I copy both the hello variable and the entire content of .sidata into .scode (below ret) it works, but as soon as I put each thing in what is supposed to be its proper section, the exe breaks.
So, it looks like the address is miscalculated. Starting from here, there could be wrong values in the sections header or somewhere else. What do you think?

UPDATE:
There's a problem I have now, after implementing the changes below. Everything works fine as long as the .data section is less than 512 bytes. Once it exceeds that I get a 'weird invalid win32 application' error.

So, here I have 2 HTML files exported by PEInfo. This first one contains the information of the working file (where the .data section is less than 512 bytes): Working EXE PEInfo
The second one contains the information of the corrupt EXE, when the .data section contains more than 512 bytes: Corrupt EXE PEInfo

Maybe someone can spot the difference and reason of the crash.


Solution

I've now had a chance to look at the code in detail and actually get it to run. So here are all the problems I've found.

First off, none of your size calculations seemed to work. You do something like this:

CODE_SIZE equ $$ - CODE

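But that just evaluates as zero. In NASM, $$ is the start of the current section rather than the current position (which is $), and CODE is the first thing in its section, so $$ - CODE is always 0. You also reference CODE_SIZE before the line where it is defined.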

My solution was to add end labels, e.g. CODE_END:, wherever you would usually have performed one of those calculations. Then at the very start of the code, before these values are used, calculate the size as the difference between the end label and the start label for each block.

HEADERS_SIZE  equ HEADERS_END - DOS_HEADER
CODE_SIZE     equ CODE_END - CODE
DATA_SIZE     equ DATA_END - DATA
IDATA_SIZE    equ IDATA_END - IDATA
I_TABLE_SIZE  equ I_TABLE_END - I_TABLE
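
As a minimal sketch of where those labels go (assuming a flat binary built with nasm -f bin and no section directives; the header fields and code bytes are placeholders, not the real PE layout), each block gets a start label and an end label, and the size constants sit at the top of the file:

```nasm
; Build (assumed command line): nasm -f bin sizes.asm -o sizes.bin
bits 32

; Sizes are plain label differences, written before the blocks they measure.
CODE_SIZE   equ CODE_END - CODE
DATA_SIZE   equ DATA_END - DATA

HEADER:
    dd CODE_SIZE            ; stand-in for a header field that needs the size
    dd DATA_SIZE
HEADER_END:

CODE:
    xor eax, eax            ; placeholder code
    ret
CODE_END:                   ; end label replaces "CODE_SIZE equ $$ - CODE"

DATA:
    hello: db 'Hello!'
DATA_END:                   ; end label replaces "DATA_SIZE equ $$ - DATA"
```

If a size is computed in place instead, $ (the current position) is the token to use, since $$ is the start of the current section.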

The next big problem was your Round macro, which looked like this:

%define Round(Number, Multiple)  Multiple+(Number/Multiple)

I'm not sure what you thought you were doing there, but this is more like what you need:

%define Round(Number, Multiple) (Number+Multiple-1)/Multiple*Multiple

You want to make sure the Number is a multiple of Multiple, hence the divide-multiply sequence. You also need to add Multiple-1 to the original Number to force it to round up.
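
A few worked values make the difference between the two macros concrete. This is only a sketch: OldRound is a hypothetical name for the original macro, and the extra parentheses in Round are an addition here that guards against operator-precedence surprises without changing the result.

```nasm
; Build (assumed command line): nasm -f bin round.asm -o round.bin
%define OldRound(Number, Multiple) Multiple+(Number/Multiple)
%define Round(Number, Multiple)    (((Number)+(Multiple)-1)/(Multiple)*(Multiple))

; OldRound(5000, 4096) = 4096 + 5000/4096       = 4096 + 1 = 4097 (not a multiple)
; Round(5000, 4096)    = (5000+4095)/4096*4096  = 2*4096   = 8192
; Round(4096, 4096)    = (4096+4095)/4096*4096  = 1*4096   = 4096 (already aligned)
; Round(6, 512)        = (6+511)/512*512        = 1*512    = 512

; Assemble-time checks: the build fails if the macro misbehaves.
%if Round(5000, 4096) != 8192
  %error "Round(5000, 4096) should be 8192"
%endif
%if Round(4096, 4096) != 4096
  %error "Round(4096, 4096) should be 4096"
%endif

dd OldRound(5000, 4096)     ; emits 4097
dd Round(5000, 4096)        ; emits 8192
dd Round(6, 512)            ; emits 512
```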

The next big problem is the RVA calculations, or lack thereof. There are lots of places in the file structure that need you to specify an offset as a relative virtual address (RVA), which is the offset from the image base once the file is loaded into memory. When you just take the value of a label as is, that's giving you the offset on disk.

For a section offset, you basically need to divide that offset by the file alignment and then multiply it by the section alignment. Also, the code block is going to load at one section alignment offset, so everything should be calculated relative to the code block and then add one section alignment to the result.

%define RVA(BaseAddress) (BaseAddress - CODE)/$FILE_ALIGNMENT*$SECTION_ALIGNMENT+$SECTION_ALIGNMENT

Now that works for addresses on section boundaries. For anything else, you need to calculate their internal offset relative to their section base address, and then add that to the RVA of that section.

%define RVA(Address,BaseAddress) RVA(BaseAddress)+(Address-BaseAddress)
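
To see what the two macros produce, here is a sketch with hypothetical file offsets standing in for the real labels. The constants are written without the $ prefix, and every section is assumed to occupy exactly one 512-byte block on disk.

```nasm
; Build (assumed command line): nasm -f bin rva.asm -o rva.bin
FILE_ALIGNMENT    equ 512
SECTION_ALIGNMENT equ 4096

; Hypothetical file offsets (the real file would use labels here).
CODE  equ 512
DATA  equ 1024
IDATA equ 1536
hello equ DATA + 16          ; a variable 16 bytes into the data section

%define RVA(BaseAddress) (BaseAddress - CODE)/FILE_ALIGNMENT*SECTION_ALIGNMENT+SECTION_ALIGNMENT
%define RVA(Address,BaseAddress) RVA(BaseAddress)+(Address-BaseAddress)

dd RVA(CODE)          ; (512-512)/512*4096  + 4096 = 4096  (0x1000)
dd RVA(DATA)          ; (1024-512)/512*4096 + 4096 = 8192  (0x2000)
dd RVA(IDATA)         ; (1536-512)/512*4096 + 4096 = 12288 (0x3000)
dd RVA(hello, DATA)   ; 8192 + (1040-1024)         = 8208  (0x2010)
```

Note that this mapping turns every 512-byte block on disk into one 4096-byte page in memory. If a section grows past one block, say the data section takes 1024 bytes on disk so that IDATA moves to file offset 2048, then RVA(IDATA) becomes (2048-512)/512*4096+4096 = 16384 (0x4000), while the data section still starts at 0x2000 and, with a memory size rounded to 4096, ends at 0x3000. That leaves a gap in memory, and the PE format requires section addresses to be ascending and adjacent. This may be what the UPDATE in the question runs into once .data exceeds 512 bytes.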

These calculations assume that the various sections are already aligned with the $FILE_ALIGNMENT value, but that's not actually the case. You had an align before the code section that looked like this:

align   512 ; Align to 512 bytes in memory

But you need to do that before every single section, as well as once at the end of the file. I would also recommend using the $FILE_ALIGNMENT constant, otherwise there's no point in having it.

align   $FILE_ALIGNMENT ; Align to the file alignment (512 bytes) on disk

In addition to that, you need to get rid of all your section declarations. For example, all of these lines need to be removed.

section .header progbits vstart=0
section .scode vstart=$SECTION_ALIGNMENT align=16
section .sdata vstart=$SECTION_ALIGNMENT*2 progbits align=4
section .sbss vstart=$SECTION_ALIGNMENT*3  align=4
section .sidata vstart=$SECTION_ALIGNMENT*4 align=4

Since you're building the whole file format manually, they serve no purpose, and they prevent you from doing offset calculations with labels that cross section boundaries (something we need pretty much everywhere).
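
Put together, the file layout without section directives might look like the skeleton below. It is a sketch only: the block contents are placeholders, and the explicit "db 0" fill is an assumption made so the padding is zero bytes rather than NOPs.

```nasm
; Build (assumed command line): nasm -f bin layout.asm -o layout.bin
bits 32
FILE_ALIGNMENT equ 512

HEADERS:
    db "MZ"                      ; placeholder for the real headers
HEADERS_END:

    align FILE_ALIGNMENT, db 0   ; pad to the next 512-byte file offset
CODE:                            ; file offset 512
    ret
CODE_END:

    align FILE_ALIGNMENT, db 0
DATA:                            ; file offset 1024
    hello: db 'Hello!'
DATA_END:

    align FILE_ALIGNMENT, db 0
IDATA:                           ; file offset 1536
    dd 0                         ; placeholder for the import data
IDATA_END:

    align FILE_ALIGNMENT, db 0   ; pad the last block on disk too
IMAGE_END:                       ; file offset 2048
```

With everything in one implicit section, any label can be subtracted from any other, which is what the RVA macros rely on.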

Now that we have everything aligned correctly and we have our two RVA macros, we can start fixing up the various parts of the code that need to use RVAs.

First in the optional header, we have the code RVA, the data RVA and the entry point. Also, while we are there, I believe the various size values should be specified as multiples of the section alignment.

dd  Round(CODE_SIZE, $SECTION_ALIGNMENT)
dd  Round(DATA_SIZE, $SECTION_ALIGNMENT) ; Initialized data size
dd  Round(BSS_SIZE, $SECTION_ALIGNMENT)  ; Uninitialized data size
dd  RVA(CODE)                            ; Entry point
dd  RVA(CODE)                            ; Code RVA
dd  RVA(DATA)                            ; Data RVA

Also in the optional header, you have the header size rounded to the section alignment when I believe it should be rounded to the file alignment.

dd  Round(HEADERS_SIZE, $FILE_ALIGNMENT) ; Headers size

This is one of those things that actually doesn't make any difference - the code will work either way - but I still think it's wrong and should be corrected.

Similarly, as I pointed out in my first answer, the data directory table size should always be set to 16 even if you don't use all 16 entries. It does seem to work if you don't do that, but again I would recommend you do it correctly.

dd  16                 ; Data directory entries
dd  0                  ; Export table pointer
dd  0                  ; Export table size
dd  RVA(I_TABLE,IDATA) ; Import table pointer
dd  I_TABLE_SIZE       ; Size of import table
times 14 dq 0          ; Space for the other 14 entries

Also, note the I_TABLE offset has been updated to use an RVA relative to the IDATA section.

Next, in your sections table, all of your offsets are wrong. For example, the start of the code section header should look like this:

db  ".code", 0, 0, 0
dd  Round(CODE_SIZE, $SECTION_ALIGNMENT) ; Size in memory
dd  RVA(CODE)                            ; Start address in memory
dd  Round(CODE_SIZE, $FILE_ALIGNMENT)    ; Size on disk
dd  CODE                                 ; Start address on disk

Similarly for the data section:

db  ".data", 0, 0, 0
dd  Round(DATA_SIZE, $SECTION_ALIGNMENT) ; Size in memory
dd  RVA(DATA)                            ; Start address in memory
dd  Round(DATA_SIZE, $FILE_ALIGNMENT)    ; Size on disk
dd  DATA                                 ; Start address on disk

And the idata section:

db  ".idata", 0, 0
dd  Round(IDATA_SIZE, $SECTION_ALIGNMENT) ; Size in memory
dd  RVA(IDATA)                            ; Start address in memory
dd  Round(IDATA_SIZE, $FILE_ALIGNMENT)    ; Size on disk
dd  IDATA                                 ; Start address on disk

The bss section is slightly different though. The whole point of the bss section is that it takes up no space on disk, but it does take up space in memory. This means that you can't actually include any data definitions for your bss data. So this code must go:

BSS:
    dd 5 

But this means the sections on disk won't match up with the sections in memory. To keep the RVA calculations simple, my suggested workaround is to have the bss section as the very last thing in the file. When its size expands from 0 on disk to whatever it is in memory, that won't affect any other offsets.

So I would add a label at the very end of the file called IMAGE_END: and then define the bss section like this:

db  ".bss", 0, 0, 0, 0
dd  Round(BSS_SIZE, $SECTION_ALIGNMENT) ; Size in memory
dd  RVA(IMAGE_END)                      ; Start address in memory
dd  0                                   ; Size on disk
dd  0                                   ; Start address on disk

Note that this section must come after the idata section in the sections table since the addresses need to be in ascending order.

You may be wondering where the BSS_SIZE value comes from if you don't have a bss section in the code anymore. I'm afraid you're going to have to define that value manually. You're also going to have to manually define constants for the offsets of any variables in that section. As I said before, you can't use data definitions because we don't want it taking up any space on disk.
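
A minimal sketch of what those manual definitions could look like. The variable names, their sizes and the BSS_RVA value are all hypothetical; in the real file BSS_RVA would be RVA(IMAGE_END).

```nasm
; Build (assumed command line): nasm -f bin bss.asm -o bss.bin
bits 32

IMAGE_BASE equ 0x400000     ; same value as $PREFERRED_ADDRESS in the question
BSS_RVA    equ 0x4000       ; hypothetical; stands in for RVA(IMAGE_END)

; Offsets of the variables inside .bss, maintained by hand.
chars_written equ 0         ; a dword at offset 0
buffer        equ 4         ; 64 bytes starting at offset 4
BSS_SIZE      equ 68        ; total bytes used: 4 + 64

; Nothing above emits any bytes. A variable's address once loaded is
; image base + section RVA + offset within the section:
    push IMAGE_BASE + BSS_RVA + chars_written
```

The cost of this approach is that BSS_SIZE and the offsets have to be kept in sync by hand whenever a variable is added.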

Next we get to the import table. The layout you're using for this is somewhat strange, but that doesn't seem to be a problem, so I'm going to leave that as is. You do need to update all the addresses to use RVAs though.

First the IAT:

F_Sleep:         dd RVA(I_Sleep,IDATA)
F_WriteConsole:  dd RVA(I_WriteConsole,IDATA)
F_GetStdHandle:  dd RVA(I_GetStdHandle,IDATA)

Then the import descriptor:

.originalfthk    dd 0
.timedate        dd 0  
.forwarder       dd 0  
.name            dd RVA(kernel32,IDATA)
.firstthunk      dd RVA(IDATA,IDATA)

I should also mention that you were setting the I_TABLE_S variable immediately after this descriptor, and if you recall, I said you should be replacing these size calculations with end labels. However, in this case the size of the descriptor table is supposed to include the final zero entry too. So the correct place to put that end label is not here, but after the times 20 db 0 padding.

times 20 db 0    
I_TABLE_END:

This is another one of those things which I don't think makes much difference but I'd still recommend fixing.

Also, this layout is fine when you're importing from one DLL, but when you need more than that, you're going to need more descriptors and more IAT sections. I would therefore recommend adding a label before each IAT, e.g. something like kernel32_iat in this case. Then you initialise your first thunk as:

.firstthunk      dd RVA(kernel32_iat,IDATA)
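
By the same logic, the absolute addresses used in the code section would presumably become the image base plus an RVA, since a bare label is only a file offset. The following is a sketch under that assumption, not something taken from the fixes above: the headers and IAT entries are placeholders, written is a hypothetical variable replacing the hard-coded 0x402000, and the constants are written without the $ prefix.

```nasm
; Build (assumed command line): nasm -f bin code.asm -o code.bin
bits 32
FILE_ALIGNMENT    equ 512
SECTION_ALIGNMENT equ 4096
IMAGE_BASE        equ 0x400000      ; $PREFERRED_ADDRESS in the question

%define RVA(BaseAddress) (BaseAddress - CODE)/FILE_ALIGNMENT*SECTION_ALIGNMENT+SECTION_ALIGNMENT
%define RVA(Address,BaseAddress) RVA(BaseAddress)+(Address-BaseAddress)

    times FILE_ALIGNMENT db 0       ; stand-in for the headers
CODE:
    push -11                        ; STD_OUTPUT_HANDLE
    call dword [IMAGE_BASE + RVA(F_GetStdHandle, IDATA)]
    push 0                          ; reserved argument
    push IMAGE_BASE + RVA(written, DATA)   ; receives the character count
    push 6                          ; length of the message
    push IMAGE_BASE + RVA(hello, DATA)     ; address of the message
    push eax                        ; console handle
    call dword [IMAGE_BASE + RVA(F_WriteConsole, IDATA)]
    ret

    align FILE_ALIGNMENT, db 0
DATA:
    hello:   db 'Hello!'
    written: dd 0

    align FILE_ALIGNMENT, db 0
IDATA:
    F_WriteConsole: dd 0            ; placeholder IAT slots; the real ones
    F_GetStdHandle: dd 0            ; hold RVAs of the import names

    align FILE_ALIGNMENT, db 0
```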

Finally, I want to deal with the $IMAGE_SIZE calculation. The calculation you're using assumes a fixed size for each section. But given we have an IMAGE_END label at the end of the file, and an RVA macro, we can easily calculate the exact image size as RVA(IMAGE_END).

However, that doesn't take into account the bss section which makes the image bigger once it is loaded into memory. So the correct definition for the image size should be:

$IMAGE_SIZE equ RVA(IMAGE_END) + Round(BSS_SIZE,$SECTION_ALIGNMENT)

Note that this should be defined near the beginning of the file - before it is used anywhere but after the RVA macro and the BSS_SIZE have been defined.
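
As a worked example of that formula, here is a sketch with hypothetical values: a file whose code starts at offset 512 and whose last block ends at offset 2048, plus 68 bytes of bss. The constants are written without the $ prefix.

```nasm
; Build (assumed command line): nasm -f bin imagesize.asm -o imagesize.bin
FILE_ALIGNMENT    equ 512
SECTION_ALIGNMENT equ 4096

; Hypothetical file offsets and bss size.
CODE      equ 512
IMAGE_END equ 2048
BSS_SIZE  equ 68

%define Round(Number, Multiple) (Number+Multiple-1)/Multiple*Multiple
%define RVA(BaseAddress) (BaseAddress - CODE)/FILE_ALIGNMENT*SECTION_ALIGNMENT+SECTION_ALIGNMENT

; RVA(IMAGE_END)        = (2048-512)/512*4096 + 4096 = 16384
; Round(BSS_SIZE, 4096) = (68+4095)/4096*4096        = 4096
IMAGE_SIZE equ RVA(IMAGE_END) + Round(BSS_SIZE, SECTION_ALIGNMENT)

dd IMAGE_SIZE         ; emits 20480 (0x5000)
```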

Licensed under: CC-BY-SA with attribution