Tuesday, May 09, 2006

Floating point intrinsics working on PowerPC

The compiler now inlines machine code for floating point operations instead of calling the runtime. This results in a performance improvement on my 2.5 GHz PowerPC G5:
Benchmark    Before   After
Mandelbrot   10.5     5.4
Raytracer    61.1     41

Here are the x86 results (Pentium 4 1.8 GHz):
Benchmark    Before   After
Mandelbrot   8.7      4.3
Raytracer    44.3     33.2

The Before column is version 0.81, the After column is version 0.82, and all times are in seconds.
Yes, a five-year-old box beats my brand-new G5, but I suspect this is because my compiler doesn't do instruction scheduling.
Bigger gains will come as the higher levels of the compiler improve, resulting in more inlining.
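
To put the timings above in relative terms, here is a rough Python sketch that turns each Before/After pair into a speedup factor. It is only arithmetic on the published numbers, not part of the compiler; the names `timings`, `speedup` and `report` are made up for illustration, and the figures are read from the tables as 10.5 and 5.4, 61.1 and 41, 8.7 and 4.3, 44.3 and 33.2.

```python
# Timings in seconds as (before, after), read from the benchmark tables.
timings = {
    "PowerPC G5 2.5 GHz": {
        "Mandelbrot": (10.5, 5.4),
        "Raytracer": (61.1, 41.0),
    },
    "Pentium 4 1.8 GHz": {
        "Mandelbrot": (8.7, 4.3),
        "Raytracer": (44.3, 33.2),
    },
}


def speedup(before, after):
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before / after


def report(data):
    """Build one formatted line per machine and benchmark."""
    lines = []
    for machine, benchmarks in data.items():
        for name, (before, after) in benchmarks.items():
            factor = speedup(before, after)
            lines.append(f"{machine:<20} {name:<11} {factor:.2f}x")
    return lines


for line in report(timings):
    print(line)
```

On these numbers Mandelbrot runs roughly twice as fast on both machines, while Raytracer gains about 1.5x on the G5 and about 1.3x on the Pentium 4.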
