Thursday, April 08, 2010

Frame-based structured exception handling on Windows x86-64

Factor used to use vectored exception handlers, registered with AddVectoredExceptionHandler; however, vectored handlers are somewhat problematic. A vectored handler is always called before any frame-based handlers, so Factor could end up reporting bogus exceptions if the FFI is used to call a library that uses SEH internally. This prompted me to switch to frame-based exception handlers. Unfortunately, these are considerably more complex to use, and the implementation differs between 32-bit and 64-bit Windows.

I briefly discussed frame-based SEH on 32-bit Windows in my previous post. During the switch to 64 bits, Microsoft got rid of the old frame-based SEH implementation and introduced a new, lower-overhead approach. Instead of pushing and popping exception handlers onto a linked list at runtime, the system maintains a set of function tables, where each function table stores exception handling and stack frame unwinding information.

Normally, the 64-bit Windows function tables are written into the executable by the native compiler. However, language implementations which generate code at runtime need to be able to define new function tables dynamically. This is done with the RtlAddFunctionTable() function.

It took me a while to figure out the correct way to call this function. I found the os_windows_x86.cpp source file from Sun's HotSpot Java implementation was very helpful, and I based my code on the register_code_area() function from this file.

Factor and HotSpot only use function tables in a very simple manner, to set up an exception handler. Function tables can also be used to define stack unwinding behavior; this allows debuggers to generate backtraces, and so on. Doing that is more complicated and I don't understand how it works, so I won't attempt to discuss it here.

The RtlAddFunctionTable() function takes an array of RUNTIME_FUNCTION structures and a base address. For some unknown reason, all pointers in the structures passed to this function are 32-bit integers relative to the base address.

For a runtime compiler that does not need to perform unwinding, it suffices to map the entire code heap to one RUNTIME_FUNCTION. A RUNTIME_FUNCTION has three fields:

  • BeginAddress - the start of the function
  • EndAddress - the end of the function
  • UnwindData - a pointer to unwind data
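As a rough illustration of the base-relative addressing, here is a minimal sketch that describes a whole code heap with one entry. The runtime_function type is a portable stand-in that mirrors the three fields listed above, and the names rva and describe_code_heap are hypothetical helpers, not part of the Windows API or of Factor's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the three RUNTIME_FUNCTION fields described above:
   each one is a 32-bit offset from the base address. */
typedef struct {
    uint32_t BeginAddress;
    uint32_t EndAddress;
    uint32_t UnwindData;
} runtime_function;

/* Hypothetical helper: turn an absolute address into a 32-bit offset
   from base, checking that it actually fits. */
static uint32_t rva(uintptr_t base, uintptr_t addr)
{
    assert(addr >= base && addr - base <= UINT32_MAX);
    return (uint32_t)(addr - base);
}

/* Hypothetical helper: describe the whole code heap
   [heap_start, heap_end) as a single "function". */
static runtime_function describe_code_heap(uintptr_t base,
                                           uintptr_t heap_start,
                                           uintptr_t heap_end,
                                           uintptr_t unwind_info)
{
    runtime_function f;
    f.BeginAddress = rva(base, heap_start);
    f.EndAddress = rva(base, heap_end);
    f.UnwindData = rva(base, unwind_info);
    return f;
}
```

On Windows, the real RUNTIME_FUNCTION filled in this way would then be registered with a call along the lines of RtlAddFunctionTable(&entry, 1, base), where the second argument is the number of entries in the array.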

All pointers are relative to the base address passed into RtlAddFunctionTable(). The unwind data can take various forms. For the simple case of no unwind codes and an exception handler, the following structure is used:

struct UNWIND_INFO {
    UBYTE Version:3;
    UBYTE Flags:5;
    UBYTE SizeOfProlog;
    UBYTE CountOfCodes;
    UBYTE FrameRegister:4;
    UBYTE FrameOffset:4;
    ULONG ExceptionHandler;
    ULONG ExceptionData[1];
};

The Version and Flags fields should both be set to 1 (a Flags value of 1 is UNW_FLAG_EHANDLER, meaning the function has an exception handler), the ExceptionHandler field set to a function pointer, and the rest of the fields set to 0. The exception handler pointer must be relative to the base address, and it must also lie within the memory range specified by the BeginAddress and EndAddress fields of the RUNTIME_FUNCTION structure. The exception handler has the same function signature as in the 32-bit SEH case:

LONG exception_handler(PEXCEPTION_RECORD e, void *frame, PCONTEXT c, void *dispatch)
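The field settings described above can be sketched as follows. This is a portable stand-in rather than the real Windows type: the two pairs of bitfields are spelled out as whole bytes, on the assumption that the compiler allocates bitfields starting from the low bit (as MSVC does on x86-64), and the names unwind_info and init_unwind_info are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for an UNWIND_INFO with no unwind codes. */
typedef struct {
    uint8_t VersionAndFlags;        /* Version: low 3 bits, Flags: high 5 bits */
    uint8_t SizeOfProlog;
    uint8_t CountOfCodes;
    uint8_t FrameRegisterAndOffset; /* FrameRegister: low 4, FrameOffset: high 4 */
    uint32_t ExceptionHandler;      /* offset from the base address */
    uint32_t ExceptionData[1];
} unwind_info;

enum { UNWIND_VERSION = 1, FLAG_EHANDLER = 1 };

/* Hypothetical helper: zero everything, then set Version = 1, Flags = 1
   and the base-relative handler offset. */
static void init_unwind_info(unwind_info *u, uint32_t handler_offset)
{
    assert(u != NULL);
    memset(u, 0, sizeof *u);
    u->VersionAndFlags = (uint8_t)(UNWIND_VERSION | (FLAG_EHANDLER << 3));
    u->ExceptionHandler = handler_offset;
}
```

With Version and Flags both equal to 1, the first byte of the structure comes out as 0x09.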

In both Factor and HotSpot, the exception handler is a C function; however, the RtlAddFunctionTable() API requires that it be within the bounds of the runtime code heap. To get around this restriction, both VMs allocate a small trampoline in the code heap which simply jumps to the exception handler, and use a pointer to the trampoline instead. Similarly, because the "pointers" in these structures are actually 32-bit integers, it helps to allocate the RUNTIME_FUNCTION and UNWIND_INFO in the code heap as well, to ensure that everything is within the same 4 GB memory range.
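One way such a trampoline could be generated is sketched below. The instruction sequence (mov rax, imm64 followed by jmp rax) is an assumption chosen for illustration and is not necessarily what Factor or HotSpot emit; emit_trampoline and TRAMPOLINE_SIZE are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TRAMPOLINE_SIZE = 12 };

/* Write "mov rax, imm64; jmp rax" into buf and return the number of
   bytes written. The target address is stored in little-endian order. */
static size_t emit_trampoline(uint8_t *buf, size_t capacity, uint64_t target)
{
    assert(buf != NULL && capacity >= TRAMPOLINE_SIZE);
    size_t n = 0;
    buf[n++] = 0x48;                 /* REX.W prefix */
    buf[n++] = 0xB8;                 /* mov rax, imm64 */
    for (int i = 0; i < 8; i++)
        buf[n++] = (uint8_t)(target >> (8 * i));
    buf[n++] = 0xFF;                 /* jmp rax */
    buf[n++] = 0xE0;
    return n;
}
```

Clobbering rax should be harmless here, since the 64-bit Windows calling convention treats it as a volatile register that carries no arguments. In a real VM the buffer would be executable memory inside the code heap, and the trampoline's offset from the base address would go into the ExceptionHandler field.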

The above explanation probably didn't make much sense, so I invite you to check out the source code instead: os-windows-nt-x86.64.cpp.


Yuhong Bao said...

Well, there is a global flag to create stack traces for debugging purposes. There was a blog entry about how enabling this crashed HotSpot when it caused ntdll functions to call RtlCaptureStackBackTrace which relies on the data (that should tell you something about this code right there):

Yuhong Bao said...

BTW, the reason the pointers are relative to the base address is that these tables have to be able to be stored on disk in the PE file itself, I think.