Note: the code described here is not in darcs yet, but will make it there within the new few days after I fix a few residual bugs.
Ever since quotations became a first-class data type, Factor has had a
curry
word:
3 [ + ] curry .
[ 3 + ]
This is essentially a partial function application. (I'm aware that currying and function application is not the same, but this word was named before I was aware of that fact, and anyway,
curry
reads better than
papply
or
partial
.)
Previously, this was an expensive word to use. Not only did it allocate a new quotation and copy the existing quotation's elements, but words which called quotations produced by
curry
did not compile, and had to run in the interpreter; this is due to the compiler's design, which "lifts" all quotations up to their call site.
Now, this has all changed. For interpreted code,
curry
now runs in constant time, not linear time proportional to the quotation's length, because it allocates a small object holding the object together with the quotation. These objects are instances of a
curry
class, but they print like quotations and are equal to quotations having the same elements, so they're essentially identical:
3 [ + ] curry [ 3 + ] = .
t
When a curried quotation is called by
call
in the interpreter, the VM simply pushes the curried object first, then calls the quotation.
In compiled code, we can do even better now. The compiler knows about
curry
and applies a rewrite rule to the dataflow graph. Essentially, it replaces all occurrences of
curry
's output value with a pair of values, being the object and quotation; it also "widens" all stack shuffling from the point where the curry is created to where it is called. If the curry is passed to an ordinary word, the compiler inserts a real call to
curry
which allocates memory; however, the key idea behind this rewrite rule is that curry instantiation is "lazy", and is not done at all if the curry is simply passed downward to a combinator.
For example, consider this example:
: foo ( x y -- x' y' ) 3 [ + ] curry 2apply ;
This word adds 3 to the top two stack elements. The compiler first inlines the
2apply
word:
: foo ( x y -- x' y' ) 3 [ + ] curry tuck >r >r call r> r> call ;
Now it applies the rewrite rule:
: foo ( x y -- x' y' ) 3 [ + ] abc--bcabc >r >r >r call r> r> r> call ;
The compiler represents stack shuffles in symbolic form, internally; I'm writing
abc--bcabc
to mean a shuffle with that effect, for which no word exists in the library.
Now, it lifts the quotation up to the call site:
: foo ( x y -- x' y' ) 3 [ + ] abc--bcabc >r >r >r drop + r> r> r> drop + ;
Finally, it removes the literal value
[ + ]
, since it is now dead; it travels around the stack without ever being used for anything, and is dropped:
: foo ( x y -- x' y' ) 3 ab--bab >r >r + r> r> + ;
So in this case, the compiler will produce the exact same code for the following two snippets:
3 [ + ] curry 2apply
3 tuck >r >r + r> r> +
While
[ + ] curry 2apply
is not a useful thing to write, since the direct expansion is simpler, it does demonstrate how the compiler is able to remove the object allocation from this code.
So,
curry
is efficient now, both in the interpreter and compiler. But what does this mean?
The role played by
curry
here is somewhat like a Lisp lambda which closes over one free variable. We can use
curry
to package up a bunch of values, pass them to the combinator, and fiddle with them from inside a quotation. Until now, you either had to pay a performance penalty for using
curry
, or you would use stack juggling tricks. For example, consider the
each
and
2each
combinators. Here is the current implementation:
: (each) ( seq quot i -- seq quot i )
[ rot nth-unsafe swap call ] 3keep ; inline
: each ( seq quot -- )
over length [ (each) ] repeat 2drop ; inline
: (2each) ( quot seq seq i -- quot seq seq i )
[ 2nth-unsafe rot dup slip ] 3keep ; inline
: 2each ( seq1 seq2 quot -- )
-rot 2dup min-length [ (2each) ] repeat 3drop ; inline
Here is an implementation using
curry
:
: (each) ( seq quot -- n quot' )
>r [ length ] keep r>
[ >r nth-unsafe r> call ] 2curry ; inline
: each ( seq quot -- )
(each) each-integer ; inline
: (2each) ( seq1 seq2 quot -- n quot' )
>r [ min-length ] 2keep r>
[ >r 2nth-unsafe r> call ] 3curry ; inline
: 2each ( seq1 seq2 quot -- )
(2each) each-integer ; inline
Note that
each-integer
is the new version of
repeat
, with a slightly altered stack effect.
In the old code, the quotation passed to
repeat
has to manually save and restore various values that it needs; now that
curry
is efficient, we just package those values up with the quotation before passing it along to
each-integer
.
It is somewhat ironic that while Factor is as "concatenative" language, until now concatenating quotations together was discouraged, at least where performance was important! But now, we can easily write an efficient
compose
word which takes a pair of quotations:
: compose [ >r call r> call ] 2curry ; inline
For example,
[ 2 2 ] [ + ] compose .
[ [ 2 2 ] [ + ] >r call r> call ]
It is pretty clear that this quotation has the same effect when called as
[ 2 2 + ]
. The difference between
compose
, which only works on quotations and produces a funny result, and
append
, which works on all sequences, is that since
compose
is built from
curry
, it runs in O(1) time, and the compiler is able to eliminate its runtime cost altogether.
For example, consider the
assoc-each
combinator. It is implemented in terms of the
assoc-find
combinator, since
assoc-find
is the only primitive combinator each assoc instance must implement. On each iteration, we want to call the quotation, but always output
f
, forcing
assoc-find
to continue the iteration until the end. Formerly,
assoc-each
was implemented as follows. First, we had a utility word,
assoc-find-with
, which called
assoc-find
while retaining a parameter on the stack. It was in implemented using a
assoc-with
word which was also used for
assoc-each-with
, and so on:
: assoc-with 2swap [ >r -rot r> call ] 2keep ; inline
: assoc-find-with ( obj assoc quot -- key value ? )
swap [ assoc-with rot ] assoc-find
>r >r 2nip r> r> ; inline
: assoc-each ( assoc quot -- )
swap [ rot call f ] assoc-find-with 3drop ; inline
This is ugly as hell, and essentially boilerplate. Here is the new way:
: assoc-each ( assoc quot -- )
[ f ] compose assoc-find 3drop ; inline
That's it; no need to define
-with
words, and no unnecessary stack shuffling. We exploit the "concatenative" nature of Factor here -- composition of sequences is the same as composition of functions!
Which brings me to the next topic; all the "manually curried" combinators you know and love, such as
each-with
,
map-with
, and so forth, have been moved to a
deprecated
vocabulary. They're going away soon. Updating code is easy.
swap [ swap XYZ ] each-with === [ XYZ ] curry each
[ XYZ ] each-with === [ XYZ ] curry* each
Same for
map-with
,
subset-with
, and so on. The
curry*
word is just partial application on the second stack element; it is defined in terms of
curry
.
All the sequence and assoc combinators have been greatly simplified. In fact, I was able to move the
2swap
word out of the core and into
extra/shuffle
; stack shuffling has been simplified by
curry
to such an extent that in particular,
2swap
is simply not needed anymore. Other more complex shuffle words such as
roll
and
pick
are used less frequently now, as well.
By lowering the cost of an abstraction, I was able to simplify code by a non-trivial amount. There's a general theme here: the goal of a compiler is to make idiomatic code run fast. It is not good enough to only accelerate hand-unrolled, tuned code chock-full of irrelevant detail and declarations. Nobody wants to write unnecessarily complex code just to please the compiler.