Thursday, March 02, 2006

Compiler reworking in progress

My most recent patches in the darcs repository form the beginning of a major compiler reworking. Right now, most optimizations are disabled, so performance is not very good. As I update the optimizations for the new design and implement new ones, performance will improve beyond where it was before I started this refactoring.

Originally I wanted to finish the Cocoa interface and implement the multi-window, native-backend UI in 0.81. However, after hitting some bugs, I decided to work on the compiler for a while instead.

I have several goals:
  • First, I want to fix a tricky bug in the branch folding optimization. Doing this properly seems to require changing the IR in some fashion.
  • Second, I'm getting rid of unconditional branch splitting. Branch splitting is the process of converting
    A [ B ] [ C ] if D

    to
    A [ B D ] [ C D ] if

    In Factor 0.80 the compiler splits all branches, which simplifies the IR and code generation but in some cases leads to an exponential increase in code size and compile time. Now branches will no longer be split, which requires changing the code that traverses the IR.
  • Consecutive math primitives should store intermediate values in registers where possible. Currently this is not done well. I want to improve this capability using a real register allocator, and also implement floating point intrinsics.
  • I'm not sure I'll get this done for 0.81, but eventually I want to add a new 'complex float' type to the runtime. Instances of this type will be created automatically when a complex number is constructed from a pair of floats, and they will be represented efficiently; the two components will not be boxed. I plan on writing intrinsics that operate on complex float values for the SSE2 and AltiVec units on x86 and PowerPC, respectively. This will be transparently integrated with the number tower and provide a performance boost for several benchmarks including Mandelbrot, and of course, any real code using complex floats.
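
To make the code-size cost of unconditional branch splitting (the second goal above) concrete, here is a small Python model. It is only a sketch and not the Factor compiler's actual IR: the `If` class and the `split`, `size`, and `show` helpers are hypothetical names invented for this illustration. A quotation is modelled as a tuple of word names and conditionals, and `split` pushes everything that follows a conditional into both of its branches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class If:
    """Hypothetical model of `[ true ] [ false ] if`; not real compiler IR."""
    true: tuple
    false: tuple


def split(quot):
    """Move the code after the first conditional into both branches, recursively."""
    for i, item in enumerate(quot):
        if isinstance(item, If):
            rest = quot[i + 1:]
            # Everything after the conditional is duplicated into each branch.
            merged = If(split(item.true + rest), split(item.false + rest))
            return quot[:i] + (merged,)
    return quot  # no conditional: nothing to split


def size(quot):
    """Count words; a conditional counts as one word plus both of its branches."""
    total = 0
    for item in quot:
        if isinstance(item, If):
            total += 1 + size(item.true) + size(item.false)
        else:
            total += 1
    return total


def show(quot):
    """Render the model back into stack-code notation."""
    parts = []
    for item in quot:
        if isinstance(item, If):
            parts.append(f"[ {show(item.true)} ] [ {show(item.false)} ] if")
        else:
            parts.append(item)
    return " ".join(parts)


# The example from the post: A [ B ] [ C ] if D
example = ("A", If(("B",), ("C",)), "D")
print(show(split(example)))  # A [ B D ] [ C D ] if

# n conditionals in a row: unsplit size grows as 3n, split size as 3(2^n - 1).
for n in (1, 2, 4, 8):
    chain = (If(("B",), ("C",)),) * n
    print(n, size(chain), size(split(chain)))
```

In this model a word containing n consecutive conditionals has 3n nodes before splitting and 3(2^n - 1) after it, because each split copies the entire remainder of the word into two places. That doubling at every conditional is the exponential growth described above.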

I will probably release Factor 0.81 as soon as the new compiler implementation is working and offers comparable performance to 0.80. Factor 0.81 already offers enough in the form of new features; callbacks are a major step forward in terms of functionality. In 0.82, I will continue tweaking the optimizer to realize the full benefit of this refactoring, and then return to evolving the UI.
