Friday, August 11, 2006

Automatically recompiling words

In Factor, new word definitions run in the interpreter until they are explicitly compiled. Furthermore, if a compiled word is redefined, then all words which call it must be recompiled to pick up the new definition.

Previously, when a word was redefined, all words that call it would immediately revert to their interpreted definitions. This caused two problems. The first is only relevant to me: if I redefined certain words that alien-invoke depends on, then all words making C library calls would be decompiled, and since C library calls cannot be made from the interpreter, this would usually crash the I/O system. I hacked around this by simply not decompiling words that call alien-invoke, but the same problem exists for other compiler-only words, such as alien-callback. The second issue with the decompiling behavior is more annoying: if you load a module that defines methods on core generic words such as nth, which are called pretty much everywhere (contrib/math/ does this, for instance), then you're in for a lengthy compile after the module is loaded, since most words in the system would revert to their interpreted definitions and the compiler would run slowly.

Factor 0.84 works differently in this regard. When a word is redefined, the word and all words that call it are added to a "changed words" set, but no further action is taken. At a later point, the user can call recompile, which recompiles the changed words. At no point does any word other than the one being redefined revert to its interpreted definition, so this is faster than the old way. There is also no problem with alien-invoke and similar compiler-only words; no special handling is needed.
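To make the bookkeeping concrete, here is a toy model in Python rather than Factor. It is only a sketch of the idea described above, not the actual implementation: the class and method names are made up for illustration, and it assumes that only direct callers are added to the changed set (the real system may well follow callers further).

```python
from collections import defaultdict


class WordTable:
    """Toy model of deferred recompilation (hypothetical names throughout)."""

    def __init__(self):
        self.callees = {}                 # word -> set of words it calls
        self.callers = defaultdict(set)   # word -> set of words that call it
        self.compiled = set()             # words that currently have compiled code
        self.changed = set()              # the "changed words" set

    def define(self, word, calls=()):
        # Remove caller edges left over from a previous definition.
        for callee in self.callees.get(word, ()):
            self.callers[callee].discard(word)
        self.callees[word] = set(calls)
        for callee in calls:
            self.callers[callee].add(word)
        # Only the word being (re)defined falls back to the interpreter.
        self.compiled.discard(word)
        # The word and its direct callers are recorded; nothing else happens yet.
        self.changed.add(word)
        self.changed |= self.callers[word]

    def recompile(self):
        # Compile everything that was marked, then empty the set.
        done = sorted(self.changed)
        self.compiled |= self.changed
        self.changed.clear()
        return done
```

With this model, redefining a word that has one compiled caller leaves the caller compiled (running against the old definition) until recompile is called, at which point both words are compiled again.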

Words defined in the listener are not automatically compiled. I figured this might be too annoying, given that the compiler is somewhat verbose and all. If people want this feature, I will implement it though.

I made run-file call recompile, so now loading a source file will automatically compile any words defined in that file, together with words which call those words. When loading modules, only one call to recompile is made at the end.

Still, not all words in the library compile. Compiled continuations will increase the number that do, and at some point I might implement a non-optimizing compiler for words without a static stack effect. The performance gain there might not be worth the added complexity, but it would make Factor appear cleaner to beginners -- I regularly have to explain that the inference errors shown during bootstrap are benign.

Of course the extra compiler activity will mean that Factor will eventually run out of code heap space. This puts pressure on me to implement code GC, which will be coming in 0.85.

I haven't pushed the automatic recompile patches to the main repository yet because I have to work out a few bugs. Nothing serious.


Anonymous said...

How and when do you keep track of which words are called by a newly defined (or redefined) word?

Also, didn't Self have some fancy dependency recording going on to enable "de-optimization" (redefining inlined methods, etc.)?

Slava Pestov said...

Factor maintains a cross-referencing table -- basically a hashtable of hashtables, tracking which words call a given word. This is how the "usage" word works. I imagine Self does something similar but more sophisticated, though I'm not sure their cross-referencing info serves a dual purpose as a developer tool.
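The shape of such a table can be shown with Python dicts standing in for Factor hashtables. This is an illustration only: add_usage and the layout of the inner table are assumptions, and usage here merely mimics what the Factor word of that name reports.

```python
crossref = {}  # callee -> {caller: True}; the inner dict is used as a set


def add_usage(caller, callee):
    """Record that `caller` calls `callee`."""
    crossref.setdefault(callee, {})[caller] = True


def remove_usages(caller):
    """Forget every edge from `caller`, e.g. just before it is redefined."""
    for callers in crossref.values():
        callers.pop(caller, None)


def usage(word):
    """Return the words known to call `word`, in sorted order."""
    return sorted(crossref.get(word, {}))
```

The same lookup that answers "who calls this word?" for a developer at the listener is the one that tells the system which words to mark as changed after a redefinition.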