Thursday, March 20, 2008

Some improvements to locals: let* and methods with named parameters

The other day, Eduardo Cavazos remarked that working on Factor's locals library must be bittersweet. On one hand, we have this awesome demonstration of Factor's unmatched meta-programming capability: efficient, lexically scoped closures, integrated with Factor's syntax, implemented as a library of only 406 lines of code. On the other hand, very few libraries use it: I developed locals to provide an "escape hatch" from the world of stack code, but it turns out the only cases where it is necessary is for complex math formulas and certain hairy algorithms dealing with two-dimensional arrays and such. Maybe one in a hundred words stand to benefit from locals, and everything else can be expressed in a clean way using stack combinators. If locals were used more frequently, it would be great on one hand because this awesome library would be seeing some real usage. But on the other hand, if you had to resort to locals all the time, it would point to real weaknesses in the stack model.

Having said that, I still enjoy improving the locals library. I added two features recently that Doug Coleman requested.

The first feature is a [let* word. It is exactly like let* in Lisp and Scheme: you define a set of bindings and unlike [let, each binding can see the previous bindings.

Here is an example from the calendar library:
:: julian-day-number>date ( n -- year month day )
#! Inverse of julian-day-number
[let* | a [ n 32044 + ]
b [ 4 a * 3 + 146097 /i ]
c [ a 146097 b * 4 /i - ]
d [ 4 c * 3 + 1461 /i ]
e [ c 1461 d * 4 /i - ]
m [ 5 e * 2 + 153 /i ] |
100 b * d + 4800 -
m 10 /i + m 3 +
12 m 10 /i * -
e 153 m * 2 + 5 /i - 1+
] ;

This is a hairy math formula for computing the year/month/day from the number of days since an epoch. Trying to do this with nested [lets would quickly get out of hand because each binding refers to the previous one. The version using dynamic variables doesn't suffer from deep nesting, but it is inefficient and harder to read because everything is bunched up with no clear demarkation of computations:
: julian-day-number>date ( n -- year month day )
#! Inverse of julian-day-number
32044 + a set
4 a get * 3 + 146097 /i b set
a get 146097 b get * 4 /i - c set
4 c get * 3 + 1461 /i d set
c get 1461 d get * 4 /i - e set
5 e get * 2 + 153 /i m set
100 b get * d get + 4800 -
m get 10 /i + m get 3 +
12 m get 10 /i * -
e get 153 m get * 2 + 5 /i - 1+
] with-scope ;

The nice thing about [let* is that after some refactoring, I was able to implement it with very little effort. Basically the only difference between it and [let is the parsing; the closure conversion stage is identical for the two, both expand into nested lambdas.

The next feature is method definitions with locals. You used to be able to define new words with named parameters, as in the above example, but there was no corresponding way to define a method with named parameters. I added a M:: word which does this. So now locals have named-parameter analogues of :, MACRO: and M:. This represents a combinatorial explosion of sorts: for every defining word you want to have named parameters for, you need to define a new version of that word. To alleviate this somewhat, I've redesigned some code in the parser and macro library so that everything is factored in a nice, logical way:
: (:) define ; parsing
: :: (::) define ; parsing

: MACRO: (:) define-macro ; parsing
: MACRO:: (::) define-macro ; parsing

: M: (M:) define ; parsing
: M:: (M::) define ; parsing

The parenthetical helper words are defined like so:
: (:) CREATE-WORD parse-definition ;
: (::) CREATE-WORD parse-locals-definition;

: (M:) CREATE-METHOD parse-definition ;
: (M::) CREATE-METHOD parse-locals-definition ;

This is what I like to call "perfect factoring".

Locals are meta-programmable and so they would be nice if someone decided to implement an efficient Scheme to Factor compiler; it seems like most of the hard work is already done. But who the hell wants to code Scheme anyway.


Sam said...

I guess it would be more work for the parser, but wouldn't it be possible to detect when "[let" (or "[let*") is used, check that it is the first word in the definition and retrieve the stored stack effect and handle the current ":" as "::"?

The user wouldn't need to use a different defining word depending on whether they want to use locals or not.

Slava Pestov said...

Sam: the semantics of : and :: are different.

:: defines a word with named parameters; the input values do not remain on the stack.

you can use [let with : and you can also use :: without ever having a [let inside it.

Locals are documented so I invite you to get the latest Factor from git and read the documentation about their usage.