Tuesday, August 14, 2007

Re-definitions, tuples, and constructors

Suppose you have a source file A.factor:
IN: A

GENERIC: some-generic ( obj -- )

M: some-class some-generic ... ;

And another source file B.factor:
IN: B

M: some-other-class some-generic ... ;

If we load A.factor, then B.factor, then load A.factor again, we would like the method defined in B.factor to remain defined. So indeed, if the parser sees a re-definition of a generic word, it does not blow away the existing methods. This is an important feature, and it means I can change kernel.factor, for example, and reload it, without blowing away tons of methods on common generic words such as equal?. One expects re-definitions to be idempotent, in the sense that re-entering an existing definition does not affect other definitions in the image.

However, tuples violated this principle. When a tuple class is defined with TUPLE:, the object system would give you a default constructor. You could then override this constructor with C:. (For details, see the Factor 0.89 tuples documentation.) Suppose source file A.factor defines a tuple class, and source file B.factor defines the constructor. If we load A.factor followed by B.factor, then make some changes to A.factor and reload it without reloading B.factor, the custom constructor defined in B.factor is blown away and replaced by the default one.

This can even be a problem when the tuple and the constructor are in the same source file; for example, in Factor 0.89, reloading sequences.factor in a running image would crash Factor, because in the middle of loading it, the slice class would be re-defined. While a custom constructor follows the definition immediately, the parser relies on slices internally, and before advancing to the next line and parsing the custom constructor definition, it would invoke the default constructor, which would result in incorrect behavior.

So, I've come to the conclusion that the traditional style of tuple constructor definitions is wrong. Now, TUPLE: doesn't give you a default constructor at all. If you want a default constructor which just fills in slot values from the stack, use the new form of C: which just takes a constructor word name and a tuple class name:
TUPLE: color red green blue ;

C: <color> color

Unlike before, the constructor word can have any name:
C: <rgba> color

Custom constructors are now defined as ordinary words. Indeed, the following two definitions are equivalent:
C: <rgba> color

: <rgba> color construct-boa ;

Here construct-boa is a low-level word taking a class; BOA denotes "by order of arguments", and "BOA constructor" is a pun on "Boa Constrictor", one of the more amusing pieces of terminology in use by the Lisp community.

The construct-boa word is more flexible than the old hard-coded constructor. For example, suppose you have a tuple:
TUPLE: action name counter ;

We wish for the default constructor to take a name from the stack, and store zero in the other slot. Previously, you'd write something like:
C: action ( name -- action )
[ set-action-name ] keep
0 over set-action-counter ;

While this isn't so bad, other non-default constructors would get complicated quickly, even if all they did was fill in slots from the stack and with default values. Now, you can just write:
: <action> ( name -- action )
0 action construct-boa ;
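
As a quick check of the new definition, the following listener sketch builds an action and prints both slots. It is written in the Factor syntax of this post, and it assumes that TUPLE: generates the readers action-name and action-counter alongside the setters used above; the name "reload" is only an illustrative value.

```factor
! Sketch in the Factor syntax of this post (mid-2007).
TUPLE: action name counter ;

! construct-boa fills slots in declaration order: name, then counter.
: <action> ( name -- action )
    0 action construct-boa ;

! Assumption: action-name and action-counter are the generated slot readers.
"reload" <action>
dup action-name .      ! the name taken from the stack
action-counter .       ! the default value stored by <action>
```

The caller only supplies the name; the zero never appears outside the constructor.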

In addition to construct-boa, we have construct-empty:
TUPLE: person name address employer ;

: <empty-person> person construct-empty ;

<empty-person> .
T{ person f f f f }

Finally, there is construct, which takes a sequence of slot setter words and fills these in from the stack:
: <homeless-person> ( name -- person )
{ set-person-name } person construct ;
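
A short listener sketch of what this constructor produces, again in the syntax of this post. It assumes the generated readers person-name and person-address, and that slots not named in the setter sequence are left as f, as with construct-empty; "Alice" is an illustrative value.

```factor
TUPLE: person name address employer ;

! Only the name is taken from the stack; no other setter is listed.
: <homeless-person> ( name -- person )
    { set-person-name } person construct ;

! Assumption: person-name and person-address are the generated slot readers.
"Alice" <homeless-person>
dup person-name .      ! the slot filled from the stack
person-address .       ! a slot that was never set
```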

Previously, the only way to have multiple constructors for a tuple class was to define a general constructor with C:, which possibly took many inputs from the stack and required complex stack shuffling. Now, you can define multiple constructors, with arbitrary names, which do not necessarily share code.

Even more interestingly, you can write constructors parametrized on the class name; it doesn't have to be a literal value! For example, suppose you have a class of errors:
TUPLE: reactor-error reason ;

TUPLE: nuclear-meltdown severity ;

C: <nuclear-meltdown> nuclear-meltdown

We can write a word which throws a reactor error, given a reason:
: throw-reactor-error ( reason -- )
reactor-error construct-boa throw ;

However, we might find that this is poorly factored; if we have a bunch of reasons, such as terrorist-attack, nuclear-meltdown, software-error, etc., we'd have to define a constructor for each (even if just using C:), then construct an instance before passing it to throw-reactor-error. Instead, we can settle on a standard construction strategy for reasons, for example construct-boa, and perform construction in the throw-reactor-error word:
: throw-reactor-error ( ... reason-class -- )
construct-boa reactor-error construct-boa throw ;
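
To see what this word builds before it throws, the same two construction steps can be run on their own in the listener. This is only a sketch in the syntax of this post: make-reactor-error is a hypothetical helper introduced here for illustration, and the readers reactor-error-reason and nuclear-meltdown-severity are assumed to be generated by TUPLE:.

```factor
TUPLE: reactor-error reason ;
TUPLE: nuclear-meltdown severity ;

! Hypothetical helper: the body of throw-reactor-error, minus the throw.
! The first construct-boa consumes the class on top of the stack and builds
! the reason; the second wraps that reason in a reactor-error.
: make-reactor-error ( ... reason-class -- error )
    construct-boa reactor-error construct-boa ;

"very severe" nuclear-meltdown make-reactor-error
reactor-error-reason nuclear-meltdown-severity .
```

The class is an ordinary value on the stack here, so any reason class with a matching slot layout can be passed the same way.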

Now, we can write:
"very severe" nuclear-meltdown throw-reactor-error

In this case, we don't reduce complexity by much; all we eliminated is the C: <nuclear-meltdown> nuclear-meltdown. However, in more complex cases involving delegation, being able to factor out construction logic in this way has been a real boon; I've been able to clean up and simplify a lot of constructors in the UI this way. Finally, there is one less concept to learn; the special C: syntax has been replaced by three ordinary words.
