Wednesday, May 09, 2007

Mac OS X data corruption bug

Another day, another bug in the infrastructure underlying Factor.

I submitted this one to Apple. This test case demonstrates the problem: if a memcpy() is in progress when an interval timer set by setitimer() fires, then you're screwed, because the copied data can come out corrupted. The problem is reproducible on Power Mac G5s, but not on Power Mac G4s or Intel Macs. So it might be that the signal handling code is not saving and restoring the G5's AltiVec registers correctly (memcpy() uses AltiVec).
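For readers who want to see the shape of such a reproduction, here is a minimal sketch. It is not the test case submitted to Apple. The names count_bad_copies, set_timer_usec, on_alarm and BUF_SIZE, the 1 ms timer interval, and the buffer size are all assumptions made up for illustration. The idea is just to keep copying a patterned buffer while SIGALRM fires frequently, and to count copies that do not match the source.

```c
/* Request the XSI interfaces (sigaction, setitimer) on strict compilers. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

enum { BUF_SIZE = 1 << 16 };   /* hypothetical buffer size */

static volatile sig_atomic_t ticks = 0;

/* The handler does almost nothing; all that matters is that a signal
   gets delivered while memcpy() may be running. */
static void on_alarm(int sig)
{
    (void)sig;
    ticks++;
}

/* Arm a repeating ITIMER_REAL timer (usec > 0) or disarm it (usec == 0). */
static int set_timer_usec(long usec)
{
    struct itimerval tv;
    tv.it_interval.tv_sec = 0;
    tv.it_interval.tv_usec = usec;
    tv.it_value = tv.it_interval;
    return setitimer(ITIMER_REAL, &tv, NULL);
}

/* Copy a patterned buffer `rounds` times while the timer fires.
   Returns the number of copies that differ from the source,
   or -1 if the handler or timer could not be installed. */
static long count_bad_copies(long rounds)
{
    static unsigned char src[BUF_SIZE], dst[BUF_SIZE];
    struct sigaction sa;
    long bad = 0;
    long r;
    size_t i;

    assert(rounds > 0);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;
    if (set_timer_usec(1000) != 0)   /* assumed interval: 1 ms */
        return -1;

    for (i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 31 + 7);

    for (r = 0; r < rounds; r++) {
        memset(dst, 0, sizeof dst);
        memcpy(dst, src, sizeof src);
        if (memcmp(dst, src, sizeof src) != 0)
            bad++;
    }

    set_timer_usec(0);   /* disarm the timer before returning */
    return bad;
}
```

On a system that saves and restores register state correctly across signal delivery, count_bad_copies() should return 0 no matter how many rounds are run. A nonzero result would point at the kind of corruption described above, though this sketch has not been run on a G5 and may need more rounds or a shorter interval to trigger anything.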

How did I come across this bug? I'm working on a (statistical sampling) profiler. It can profile both interpreted and compiled code. Except on Mac OS X/PPC, where it kills Factor with a GC assertion failure...

I love Mac OS X, I really do. But I wish the basic stuff like signal handlers, threads, and so on, were implemented properly.

2 comments:

Anonymous said...

Would it help to change the underlying programming language from C to D? Or is it only an OS thing here?
I noticed you were facing problems with gcc.
It seems D is cleaner in design, even though it is still changing sometimes.

Slava Pestov said...

This is an OS bug. Changing the language would not help here. In fact, rewriting the runtime from C to D would be a huge undertaking; it would involve not only porting the code, but also implementing a D FFI for the compiler.