The first optimization involves removing such redundant code sequences as
0 bitor. The old optimizer performed this unconditionally, which was incorrect since most of these identities fail to hold for floats. For example,
-0.0 0 +is
0.0, so in this case
0 +is not a no-op, and
1.0 0.0 / 0.0 *is
0.0 *does not always yield zero. However, they are valid for integers, so the new optimizer only applies these optimizations to integer-valued expressions.
The second optimization is described in the blog post I linked to.
I also found and fixed a code generator regression which has been there for ages, and a type inference regression which appeared with the new code generator. Fixing these two regressions and implementing the two missing optimizations above improved performance of
benchmark.raytracerby almost 2x; the run time went from 12 seconds to 6.5 seconds on my 2.4 GHz Core 2 Duo MacBook Pro.
The Java version runs in 1.1 seconds with the server VM (I modified the code to run the benchmark 10 times in a loop to give the JIT a chance to do some runtime profiling, and I measured the run time internally so as to avoid JVM startup overhead). The Common Lisp version runs in 3.5 seconds under SBCL. Note that I'm running the benchmark with
Factor is slowly inching towards the performance offered by other compilers. Hopefully one day I'll be able to close the gap.