Struct RUNTIME_FUNCTION

Question 1

You can find more information on RUNTIME_FUNCTION and related structures at Microsoft's MSDN.

These structures are generated by the compiler and used to implement structured exception handling. During the execution of your code an exception may occur, and the runtime system needs to be able to walk up the call stack to find a handler for that exception. To do so, the runtime system needs to know the layout of the function prologs, which registers they save, in order to correctly unwind the individual function stack frames. More details are here.

The RUNTIME_FUNCTION is the structure which describes a single function, and it contains the data required to unwind it.

If you generate code at runtime and need to make that code available to the runtime system (because your code calls out to already compiled code which may raise an exception) then you create RUNTIME_FUNCTION instances for each of your generated functions, fill in the UNWIND_INFO for each, and then tell the runtime system by calling RtlAddFunctionTable.

Question 2

Windows x64 SEH

The compiler puts the exception directory in the .pdata section of an .exe image, although it can be placed in any section, such as .rdata. Either way, it is located through the PE header field NtHeaders64->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXCEPTION].VirtualAddress. The compiler fills the exception directory with RUNTIME_FUNCTION entries.

typedef struct _RUNTIME_FUNCTION {
 ULONG BeginAddress;
 ULONG EndAddress;
 ULONG UnwindData;
} RUNTIME_FUNCTION, *PRUNTIME_FUNCTION;
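As a rough illustration of how that data directory entry can be read, here is a minimal sketch that parses a raw PE32+ header from a byte buffer without windows.h. The names DataDir, read_u16, read_u32 and find_exception_directory are hypothetical. The offsets used (e_lfanew at 0x3C, 24 bytes from the NT signature to the optional header, the data directory array at offset 112 of the 64-bit optional header, exception directory at index 3) follow the documented PE32+ layout. Only the header fields are checked, so treat it as a sketch rather than a robust parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper types and names; not part of any Windows header. */
typedef struct {
    uint32_t rva;    /* VirtualAddress of the exception directory */
    uint32_t count;  /* number of 12-byte RUNTIME_FUNCTION entries */
} DataDir;

static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills *out if image looks like a PE32+ header, else 0. */
static int find_exception_directory(const uint8_t *image, size_t len, DataDir *out)
{
    enum { NT_TO_OPT = 24, OPT_NUM_DIRS = 108, OPT_DIRS = 112,
           DIR_EXCEPTION = 3, ENTRY_SIZE = 12 };
    const size_t need = NT_TO_OPT + OPT_DIRS + 8 * (DIR_EXCEPTION + 1);
    uint32_t nt;
    const uint8_t *opt, *entry;

    assert(image != NULL && out != NULL);
    if (len < 0x40 || image[0] != 'M' || image[1] != 'Z') return 0;
    nt = read_u32(image + 0x3C);                      /* e_lfanew */
    if (len < need || nt > len - need) return 0;      /* header must fit in buffer */
    if (read_u32(image + nt) != 0x00004550u) return 0; /* "PE\0\0" */
    opt = image + nt + NT_TO_OPT;
    if (read_u16(opt) != 0x20B) return 0;             /* PE32+ magic */
    if (read_u32(opt + OPT_NUM_DIRS) <= DIR_EXCEPTION) return 0;
    entry = opt + OPT_DIRS + 8 * DIR_EXCEPTION;
    out->rva = read_u32(entry);
    out->count = read_u32(entry + 4) / ENTRY_SIZE;    /* Size field / sizeof entry */
    return 1;
}
```

The returned rva still has to be translated through the section table (or added to the image base of a loaded module) before the entries themselves can be read.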

Each RUNTIME_FUNCTION describes a function in the image. Every function in the program (apart from leaf functions) has one, regardless of whether it contains an SEH clause: an exception can occur in a callee, and the unwind codes are needed to get back to a caller that might have the SEH handler. Most functions therefore have unwind codes but no scope table. BeginAddress and EndAddress are image-relative offsets (RVAs) of the start and end of the function.
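To make the lookup concrete, here is a minimal sketch of finding the entry that covers a given RVA, roughly what RtlLookupFunctionEntry has to do once it knows which image to search. It relies on the entries being sorted by BeginAddress, as the PE format requires for .pdata. The RuntimeFunction type and find_runtime_function are hypothetical stand-ins, and treating EndAddress as exclusive is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for RUNTIME_FUNCTION; all fields are RVAs. */
typedef struct {
    uint32_t BeginAddress;
    uint32_t EndAddress;   /* assumed exclusive: first byte past the function */
    uint32_t UnwindData;
} RuntimeFunction;

/* Binary search for the entry whose [BeginAddress, EndAddress) covers rva.
   Returns NULL when nothing matches, as would happen for a leaf function. */
static const RuntimeFunction *find_runtime_function(const RuntimeFunction *table,
                                                    size_t count, uint32_t rva)
{
    size_t lo = 0, hi = count;
    assert(table != NULL || count == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (rva < table[mid].BeginAddress) {
            hi = mid;                 /* target lies in an earlier entry */
        } else if (rva >= table[mid].EndAddress) {
            lo = mid + 1;             /* target lies in a later entry */
        } else {
            return &table[mid];
        }
    }
    return NULL;
}
```

A NULL result is the case described for leaf functions: no entry exists, so the unwinder falls back to reading the return address at [RSP].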

UnwindData points to an _UNWIND_INFO table structure.

typedef struct _UNWIND_INFO {
    UBYTE Version         : 3;
    UBYTE Flags           : 5;
    UBYTE SizeOfProlog;
    UBYTE CountOfCodes;  //so the beginning of ExceptionData is known as they're both FAMs
    UBYTE FrameRegister  : 4;
    UBYTE FrameOffset    : 4;
    UNWIND_CODE UnwindCode[1];
    union {
        //
        // If (Flags & UNW_FLAG_EHANDLER)
        //
        OPTIONAL ULONG ExceptionHandler;
        //
        // Else if (Flags & UNW_FLAG_CHAININFO)
        //
        OPTIONAL ULONG FunctionEntry;
    };
    //
    // If (Flags & UNW_FLAG_EHANDLER)
    //
    OPTIONAL ULONG ExceptionData[]; 
} UNWIND_INFO, *PUNWIND_INFO;

Flags is a bit field built from the following values:

#define UNW_FLAG_NHANDLER 0
#define UNW_FLAG_EHANDLER 1
#define UNW_FLAG_UHANDLER 2
#define UNW_FLAG_FHANDLER 3
#define UNW_FLAG_CHAININFO 4

If UNW_FLAG_EHANDLER is set, ExceptionHandler points to a generic handler called __C_specific_handler (an import from libcmt.lib), whose purpose is to parse ExceptionData, a flexible array member of type SCOPE_TABLE. If UNW_FLAG_UHANDLER is set, __C_specific_handler is also used to call a finally block, i.e. there is a finally block within the function. If UNW_FLAG_CHAININFO is set, the unwind info structure is a secondary one, and the shared exception handler/chained info field holds an image-relative pointer to the RUNTIME_FUNCTION entry that points to the primary unwind info. This is used for noncontiguous functions. UNW_FLAG_FHANDLER is described as a 'frame handler' and I don't know what that is; its value, 3, is simply UNW_FLAG_EHANDLER | UNW_FLAG_UHANDLER.
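The bit-field layout and the role of CountOfCodes can be sketched by decoding the raw bytes of an UNWIND_INFO by hand. UnwindHeader, parse_unwind_header and has_tail_field are hypothetical names. The sketch assumes the documented layout: Version in the low 3 bits of the first byte, Flags in the high 5, and the UnwindCode array padded to an even number of 2-byte slots before the handler or chained-entry field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { UNW_EHANDLER = 1, UNW_UHANDLER = 2, UNW_CHAININFO = 4 };

/* Hypothetical decoded view of the fixed part of UNWIND_INFO. */
typedef struct {
    unsigned version, flags, size_of_prolog, count_of_codes;
    unsigned frame_register, frame_offset;
    size_t   tail_offset;  /* byte offset of the handler / chained-entry field */
} UnwindHeader;

static UnwindHeader parse_unwind_header(const uint8_t *p)
{
    UnwindHeader h;
    assert(p != NULL);
    h.version        = p[0] & 0x07u;   /* low 3 bits of byte 0 */
    h.flags          = p[0] >> 3;      /* high 5 bits of byte 0 */
    h.size_of_prolog = p[1];
    h.count_of_codes = p[2];
    h.frame_register = p[3] & 0x0Fu;   /* low nibble of byte 3 */
    h.frame_offset   = p[3] >> 4;      /* high nibble of byte 3 */
    /* 4 header bytes, then 2-byte codes rounded up to an even count */
    h.tail_offset    = 4 + 2 * (size_t)((h.count_of_codes + 1u) & ~1u);
    return h;
}

/* The field after the codes exists only if one of these flags is set. */
static int has_tail_field(unsigned flags)
{
    return (flags & (UNW_EHANDLER | UNW_UHANDLER | UNW_CHAININFO)) != 0;
}
```

For the bytes 19 0A 03 25, this gives version 1, flags 3, a 10-byte prologue, 3 unwind codes, and a tail field at byte offset 12 (three codes padded to four slots).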

typedef struct _SCOPE_TABLE {
 ULONG Count;
 struct
 {
     ULONG BeginAddress;
     ULONG EndAddress;
     ULONG HandlerAddress;
     ULONG JumpTarget;
 } ScopeRecord[1];
} SCOPE_TABLE, *PSCOPE_TABLE;

The SCOPE_TABLE structure is variable length, with one ScopeRecord for each try block in the function. Each record holds the start and end address (image-relative, i.e. RVAs) of the try block. HandlerAddress is an offset to code that evaluates the exception filter expression in the parentheses of __except (EXCEPTION_EXECUTE_HANDLER means always run the except block, so it's analogous to except Exception), and JumpTarget is the offset of the first instruction in the __except block associated with the __try block. CountOfCodes in UNWIND_INFO is needed because UnwindCode is also a flexible array member, and there is no other way of knowing where the data after it begins. For a try/finally block there is no filter, so HandlerAddress, rather than JumpTarget, points to a copy of the finally block that is embellished with a prologue and epilogue. The copy is needed for when the finally block is called in the context of an exception rather than by arriving normally at the end of the try block. An except block never needs such a copy, because it is never run after successful completion: it is always separate code, so there is no original to copy.

Once the exception is raised by the processor, the exception handler in the IDT will pass exception information to a main exception handling function in Windows, which will find the RUNTIME_FUNCTION for the offending instruction pointer and call the ExceptionHandler. If the exception falls within the function and not the epilogue or prologue then it will call the __C_specific_handler. __C_specific_handler will then begin walking all of the SCOPE_TABLE entries searching for a match on the faulting instruction, and will hopefully find an __except statement that covers the offending code. (Source)

To add to this, for nested exceptions I'd imagine the __C_specific_handler would always find the smallest range that covers the current faulting instruction and will unwind through the larger ranges if the exception is not handled. The implementation of the __C_specific_handler in the source above shows a simple iteration through the records, which would not happen in practice.
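The "smallest covering range" idea from the previous paragraph can be sketched as follows. This illustrates the guess only; it is not the real __C_specific_handler. ScopeRecord and find_innermost_scope are hypothetical names, and EndAddress is assumed to be exclusive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one SCOPE_TABLE record; all fields are RVAs. */
typedef struct {
    uint32_t BeginAddress, EndAddress, HandlerAddress, JumpTarget;
} ScopeRecord;

/* Returns the index of the narrowest record covering rva, or -1 if none does. */
static int find_innermost_scope(const ScopeRecord *rec, uint32_t count, uint32_t rva)
{
    int best = -1;
    uint32_t best_size = 0;
    uint32_t i;
    assert(rec != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (rva >= rec[i].BeginAddress && rva < rec[i].EndAddress) {
            uint32_t size = rec[i].EndAddress - rec[i].BeginAddress;
            if (best < 0 || size < best_size) {
                best = (int)i;        /* narrower than anything seen so far */
                best_size = size;
            }
        }
    }
    return best;
}
```

If the compiler already emits nested records innermost first, a plain first-match scan would pick the same record, which may be why the linked implementation gets away with simple iteration.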

It is also not made clear how the OS exception handler knows which DLL's exception directory to look in. I suppose it could use the RIP, consult the process VAD, get the first address of the particular allocation and call RtlLookupFunctionEntry on it. The RIP may also be a kernel address in a driver or ntoskrnl.exe, in which case the Windows exception handler will consult the exception directory of those images, but I'm not sure how it gets the image base from the RIP, as kernel allocations aren't tracked in a VAD.

Exception Filters

An example function that uses SEH:

BOOL SafeDiv(INT32 dividend, INT32 divisor, INT32 *pResult)
{
    __try 
    { 
        *pResult = dividend / divisor; 
    } 
    __except(GetExceptionCode() == EXCEPTION_INT_DIVIDE_BY_ZERO ? 
         EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH)
    { 
        return FALSE;
    }
    return TRUE;
}

Suppose catch (ArithmeticException a){//do something} in Java were C++ code. It would translate to the following C++ code and then compile (only theoretically, because in reality EXCEPTION_INT_DIVIDE_BY_ZERO doesn't seem to be produced by the compiler for any exception object):

__except(GetExceptionCode() == EXCEPTION_INT_DIVIDE_BY_ZERO ? 
         EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH) {//do something}

The filter expression in the parentheses is compiled to code that the HandlerAddress value in the scope record for the try block points to. The filter always evaluates to one of EXCEPTION_CONTINUE_SEARCH, EXCEPTION_EXECUTE_HANDLER or EXCEPTION_CONTINUE_EXECUTION. GetExceptionCode gets the ExceptionCode (a Windows-specific error constant) from the _EXCEPTION_RECORD, which was probably created by the specific exception handler in the IDT using the error code and exception number (the _EXCEPTION_RECORD is stored somewhere such that it is accessible through the call). The code is compared against the specific error, EXCEPTION_INT_DIVIDE_BY_ZERO being what ArithmeticException would use. If the filter expression evaluates to EXCEPTION_EXECUTE_HANDLER, control jumps to JumpTarget; if it evaluates to EXCEPTION_CONTINUE_SEARCH, I'd imagine the __C_specific_handler looks for a ScopeRecord with a wider scope. If it runs out of ScopeRecords that cover the RIP of the faulting instruction, the __C_specific_handler returns EXCEPTION_CONTINUE_SEARCH, and the Windows exception handler unwinds the function's prologue and continues with the new RIP in the context record, which it updates while unwinding, checking the _RUNTIME_FUNCTION structs.
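The filter-driven search within one function can be modelled with ordinary function pointers. This is a simplified sketch of the behaviour described above, not the real handler. Scope, Dispatch, dispatch_in_function and the two example filters are hypothetical, and the sketch assumes inner scopes appear before outer ones in the table. The disposition values 1, 0 and -1 and the code 0xC0000094 for EXCEPTION_INT_DIVIDE_BY_ZERO are the real Windows constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FILTER_EXECUTE_HANDLER      1   /* EXCEPTION_EXECUTE_HANDLER */
#define FILTER_CONTINUE_SEARCH      0   /* EXCEPTION_CONTINUE_SEARCH */
#define FILTER_CONTINUE_EXECUTION (-1)  /* EXCEPTION_CONTINUE_EXECUTION */

typedef int (*FilterFn)(uint32_t exception_code);

typedef struct {
    uint32_t begin, end;   /* covered range, end assumed exclusive */
    FilterFn filter;       /* stands in for the code at HandlerAddress */
    uint32_t jump_target;  /* stands in for JumpTarget */
} Scope;

typedef struct {
    int      disposition;
    uint32_t jump_target;  /* meaningful only for FILTER_EXECUTE_HANDLER */
} Dispatch;

/* Try each covering scope in table order until a filter stops the search. */
static Dispatch dispatch_in_function(const Scope *s, size_t n,
                                     uint32_t rip, uint32_t code)
{
    Dispatch d = { FILTER_CONTINUE_SEARCH, 0 };
    size_t i;
    assert(s != NULL || n == 0);
    for (i = 0; i < n; i++) {
        int r;
        if (rip < s[i].begin || rip >= s[i].end) continue; /* scope does not cover rip */
        r = s[i].filter(code);
        if (r == FILTER_CONTINUE_SEARCH) continue;          /* move to a wider scope */
        d.disposition = r;
        if (r == FILTER_EXECUTE_HANDLER) d.jump_target = s[i].jump_target;
        return d;
    }
    return d;  /* no scope accepted: the caller must unwind to the parent frame */
}

/* Example filters: the first mirrors the SafeDiv filter expression. */
static int only_divide_by_zero(uint32_t code)
{
    return code == 0xC0000094u ? FILTER_EXECUTE_HANDLER : FILTER_CONTINUE_SEARCH;
}

static int accept_all(uint32_t code)
{
    (void)code;
    return FILTER_EXECUTE_HANDLER;
}
```

With an inner scope guarded by only_divide_by_zero and an outer scope guarded by accept_all, a divide-by-zero inside the inner range lands on the inner jump target, while any other exception code falls through to the outer one.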

There is an SEH block in mainCRTStartup, but not in BaseThreadInitThunk. Eventually the base of the stack is reached: RtlUserThreadStart, whose filter expression contains a call to RtlpUnhandledExceptionFilter(GetExceptionInformation()). RtlpUnhandledExceptionFilter is initialised to UnhandledExceptionFilter in kernel32!_BaseDllInitialize, and the GetExceptionInformation value is passed in rcx to the filter code at HandlerAddress by the __C_specific_handler. UnhandledExceptionFilter calls the filter held in the variable BasepCurrentTopLevelFilter, which is what SetUnhandledExceptionFilter sets and which gets initialised on the dynamic linking of kernel32.dll. If the application is not currently being debugged, the user-specified unhandled filter is called and must return EXCEPTION_EXECUTE_HANDLER, which causes the except block to be called by __C_specific_handler, and the except block terminates the whole process using ZwTerminateProcess.

Prologue and Epilogue exceptions

Within a function described by a _RUNTIME_FUNCTION structure, an exception can occur in the prologue or the epilogue as well as in the body of the function, which may or may not be inside a try block. The prologue is the part of the function that saves registers and stores parameters on the stack (if -O0). The epilogue reverses this process and returns from the function. The compiler records each action that takes place in the prologue in the UnwindCode array; each action is represented by a 2-byte UNWIND_CODE structure, which contains the offset in the prologue (1 byte), the unwind operation code (4 bits) and the operation info (4 bits).

After finding a RUNTIME_FUNCTION for which the RIP is between the BeginAddress and EndAddress, and before invoking __C_specific_handler, the OS exception handling code checks whether the RIP lies between BeginAddress and BeginAddress + SizeOfProlog (fields of the RUNTIME_FUNCTION and _UNWIND_INFO structures respectively). If it does, the RIP is in the prologue, and the code looks in the UnwindCode array for the first entry with an offset less than or equal to the offset of the RIP from the function start. It then undoes, in order, all of the actions described from that entry onwards. One of these actions might be UWOP_PUSH_MACHFRAME, which signifies that a trap frame has been pushed, as might be the case in kernel code. The ultimate result is that the RIP is restored to what it was before the call instruction was executed, by eventually undoing the call instruction, and the other registers are restored to the values they had before the call. While doing so, it updates the CONTEXT_RECORD. Once the actions have been undone, the process is restarted using the RIP from before the function call; the OS exception handling code uses this RIP to find the RUNTIME_FUNCTION, which will be that of the calling function. The RIP will now be in the body of the calling function, so the __C_specific_handler of the parent _UNWIND_INFO can be invoked to scan the ScopeRecords, i.e. the try blocks in the function.
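The prologue unwind described above can be simulated on a toy stack. This is a minimal sketch that models only UWOP_PUSH_NONVOL and UWOP_ALLOC_SMALL. Code, Ctx and unwind_frame are hypothetical names, and the stack is a plain array of 8-byte slots indexed by a byte offset. Entries are expected in descending offset order, matching how the array is stored, so skipping entries whose offset is greater than the RIP offset leaves exactly the actions that had already executed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_PUSH_NONVOL = 0, OP_ALLOC_SMALL = 2 };

/* Hypothetical pre-decoded unwind code. */
typedef struct { uint8_t code_offset, op, info; } Code;

/* Hypothetical cut-down context record. */
typedef struct {
    uint64_t rip;
    uint64_t rsp;       /* byte offset into the simulated stack */
    uint64_t regs[16];  /* integer registers, indexed by register number */
} Ctx;

/* Undo the prologue actions already executed at rip_offset, then pop the
   return address so the context describes the caller. */
static void unwind_frame(Ctx *ctx, const uint64_t *stack,
                         const Code *codes, size_t n, uint8_t rip_offset)
{
    size_t i;
    assert(ctx != NULL && stack != NULL);
    for (i = 0; i < n; i++) {
        if (codes[i].code_offset > rip_offset) continue;  /* not executed yet */
        switch (codes[i].op) {
        case OP_PUSH_NONVOL:   /* undo a push: reload the register, pop 8 bytes */
            ctx->regs[codes[i].info & 0x0F] = stack[ctx->rsp / 8];
            ctx->rsp += 8;
            break;
        case OP_ALLOC_SMALL:   /* undo sub rsp, N where N = info * 8 + 8 */
            ctx->rsp += (uint64_t)codes[i].info * 8 + 8;
            break;
        default:
            assert(!"opcode not modelled in this sketch");
        }
    }
    ctx->rip = stack[ctx->rsp / 8];  /* undo the call: pop the return address */
    ctx->rsp += 8;
}
```

Calling it with no applicable codes reduces to the leaf-function case: only the return address at [RSP] is popped.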

If the RIP is not in the range BeginAddress to BeginAddress + SizeOfProlog, the OS code examines the code stream after the RIP. If it matches the trailing portion of a legitimate epilogue, then the RIP is in an epilogue: the remaining portion of the epilogue is simulated, and the CONTEXT_RECORD is updated as each instruction is processed. The RIP will now be the address after the call instruction in the calling function, so the search for a RUNTIME_FUNCTION covering this RIP finds the parent's RUNTIME_FUNCTION, and the scope records in that are then used to handle the exception.

If it is in neither a prologue nor an epilogue, it invokes the __C_specific_handler named in the unwind info structure to examine the try block scope records. If there are no try blocks in the function, there will be no handler (the ExceptionHandler field of the UNWIND_INFO structure is assumed to be valid only when the UNW_FLAG_EHANDLER bit is set, and in this case the flags will be UNW_FLAG_NHANDLER instead). If there are try blocks but the RIP is not within the range of any of them, the whole prologue is unwound. If the RIP is within a try block, the handler runs the filter-evaluating code pointed to by HandlerAddress and acts on the value it returns. On EXCEPTION_CONTINUE_SEARCH, the __C_specific_handler looks for a parent scope record and, if there isn't one, unwinds the prologue and looks for a parent RUNTIME_FUNCTION (by parent I mean an enclosing try scope and a caller function, respectively). On EXCEPTION_EXECUTE_HANDLER, it jumps to JumpTarget. If this is a try/finally block, it just jumps to HandlerAddress (instead of evaluating a filter expression), which is the finally code, and then it is done.

Another scenario worth mentioning is the leaf function, which has no RUNTIME_FUNCTION record because it does not call any other functions or allocate any local variables on the stack. Hence, RSP directly addresses the return pointer. The return pointer at [RSP] is stored in the updated context, the simulated RSP is incremented by 8, and the search for another RUNTIME_FUNCTION continues.

Unwinding

When the __C_specific_handler returns EXCEPTION_CONTINUE_SEARCH rather than EXCEPTION_EXECUTE_HANDLER, the search has to move on to the caller, which is called unwinding: the prologue of the function must be undone. The counterpart of unwinding is 'simulating', which is what is done to the epilogue. To unwind, the OS exception handling code goes through the UnwindCode array as stated earlier and undoes all of the actions, restoring the state of the CPU to what it was before the function call. It doesn't have to worry about locals, because they are lost when it moves down a stack frame. The unwind code array is used to unwind (modify) the context record, which was initially snapshotted by the Windows exception handler. It then looks at the new RIP in the context record, which falls in the range of the parent function's RUNTIME_FUNCTION, and calls that function's __C_specific_handler. If the exception gets handled, control passes to the except block at JumpTarget and execution continues as normal. If it is not handled (i.e. the filter expression does not evaluate to EXCEPTION_EXECUTE_HANDLER), unwinding continues up the stack until it reaches RtlUserThreadStart and the RIP is within the bounds of that function, which means the exception is unhandled.

There is a very good diagrammatic example of this on this page.

IDA Pro seems to show an __unwind{} clause when either an exception handler or a termination handler is present and the function has unwind codes.

Windows x86 SEH

x86 uses stack-based exception handling rather than the table-based scheme that x64 uses. This made it vulnerable to buffer overflow attacks. (I'll continue later.)