Tuple reshaping and inheritance
The best way to demonstrate the feature is with live code examples. Note that they work with Factor from git; in fact, inheritance is mostly in place, the only thing that needs to be updated is method dispatch -- of course inheritance is useless without methods, but with all the other changes in place this remaining task is comparatively easy.
First a datatype:
TUPLE: bytes n ;
: bytes \ bytes construct-boa ;
: megabytes 20 shift bytes ;
: gigabytes 30 shift bytes ;
Let's define a class to represent computers; persumably, this is a high-end, million-dollar enterprise systems management app:
( scratchpad ) TUPLE: computer ram cpu ;
Now a subclass for servers:
( scratchpad ) TUPLE: server < computer ;
A constructor:
( scratchpad ) C: <server> server
This is the 21st century and we have some pretty high-end gear:
64 megabytes "Pentium II" <server> server set
Let's check out our server in the inspector:
( scratchpad ) server get describe
server instance
"delegate" f
"ram" T{ bytes f 67108864 }
"cpu" "Pentium II"
Is our server a computer?
( scratchpad ) server get computer?
t
Oh crap, we forgot to define a slot for the disk. No problem, let's redefine the server type:
TUPLE: server < computer disk ;
What happened to our server?
( scratchpad ) server get describe
sserver instance
"delegate" f
"ram" T{ bytes f 67108864 }
"cpu" "Pentium II"
"disk" f
Notice the new slot that appeared. Let's give it a value:
( scratchpad ) server get 2 gigabytes >>disk drop
But hey, the disk slot should really go in the computer type. All computers have disks, not just servers. That's not a problem:
( scratchpad ) TUPLE: computer disk cpu ram ; TUPLE: server < computer ;
What happened to our server?
( scratchpad ) server get describe
server instance
"delegate" f
"disk" T{ bytes f 2147483648 }
"ram" T{ bytes f 67108864 }
"cpu" "Pentium II"
Now we want to add a new type for electronic equipment.
( scratchpad ) TUPLE: electronic-device voltage ;
Is our server an electronic device?
( scratchpad ) server get electronic-device?
f
Well, no it isn't, there's not type relation here. But we can redefine our computer type to inherit from the electronic device:
TUPLE: computer < electronic-device disk cpu ram ;
Tada! The object is now an instance of a type it wasn't an instance of before:
( scratchpad ) server get electronic-device?
t
Let's take one last look at that server:
( scratchpad ) server get describe
server instance
"delegate" f
"voltage" f
"disk" T{ bytes f 2147483648 }
"ram" T{ bytes f 67108864 }
"cpu" "Pentium II"
Now we could go on and assign it a voltage, then redefine a class again permuting the slots, etc. Of course people who've been following Factor for a while take this type of thing for granted, but let's take a step back and think about what happened here: we created an instance of a class, then proceeded to redefine the class, change its position in the inheritance hierarchy, and move slots up and down the hierarchy. The instance remained consistent with the current definition of the class at each step and Factor tried its best to preserve slot values when things were being moved around.
This is of course trivial in languages which use hashtables to represent objects but Factor uses a more efficient representation: objects really are just tuples, with slot values stored contiguously in memory. You get very fast slot access because of this, but the garbage collector and object system work together to give the illusion that these objects are a lot more malleable than they really are.
Cleave and spread combinators
The next thing I want to touch on is the fact that the tuple reshaping implementation (and indeed, the entire tuple class implementation in general) is a lot cleaner than the old code, primarily because I've been using the new "cleave" and "spread" family of combinators designed by Eduardo Cavazos. These combinators are in the kernel vocabulary now, and they're documented; enter
"kernel" about
in the UI listener to learn more. I won't repeat the documentation here, just give an overview.These combinators help structure your data flow by grouping parts of computations into one-to-many, many-to-many and many-to-one designs. I've never seen a programming language make this explicit before and it seems to be a genuinely new idea now present in either applicative or stack-based languages. (More academically-minded folk might wish to ponder the implications of this approach when combined with linear type systems.)
The basic picture is that we have three groups of combinators:
- The "cleave" family take m values and n quotations. They apply each quotation to all m values in turn.
- The "spread" family take m*n values and n quotations. They apply each quotation to the corresponding group of m values in turn.
- The "apply" family take m*n values and 1 quotation. They apply the quotation to each m-element group of values in turn.
The naming convention is simple. The suffix on the combinator name determines which family it belongs to. No suffix denotes a "cleave", a
*
suffix denotes a "spread", and a @
suffix denotes an "apply". If the combinator name does not start with a digit, m is 1. Otherwise, the first digit is the value of m. Finally, the rest of the name, except for the prefix and suffix, determines n
. This name is either bi
, for n
=2 or tri
, for n
=3.Here is the full list:
bi
: 1 value, 2 quotations, quotation has arity 1tri
: 1 value, 3 quotations, quotation has arity 12bi
: 2 values, 2 quotations, quotation has arity 22tri
: 2 values, 3 quotations, quotation has arity 23bi
: 3 values, 2 quotations, quotation has arity 33tri
: 3 values, 3 quotations, quotation has arity 3bi*
- 2 values, 2 quotations, quotation has arity 1tri*
- 3 values, 3 quotations, quotation has arity 12bi*
- 4 values, 2 quotations, quotation has arity 2bi@
- 2 values, 1 quotation, quotation has arity 1tri@
- 3 values, 1 quotation, quotation has arity 12bi@
- 4 values, 1 quotation, quotation has arity 2
Note that
2tri*
, 3bi*
and 3tri*
do not exist: if they did, they would take 6, 6 and 9 input values, respectively, and if your code needs that many values on the stack, then you should either use sequences, assocs, tuples, or variables.Finally, there is a generalized versions of
bi
and bi*
: cleave
and spread
take a sequence of quotations.Because the naming convention is simple and consistent, the names of these combinators are easier to remember than the names of the stack shuffles. More interestingly, I've found that after refactoring code to use the cleave combinators, the only shuffle words I really need are
dup
, drop
, and very rarely, a swap
.For example, I recently converted our serialization library to use these combinators. The result is extremely straightforward: there is almost no stack manipulation, except for the occasional dup and drop, but it doesn't use named locals either. Thanks to the cleave combinators, all values just happen to be in the right place at the right time, and everything composes together very nicely, like lego bricks.
Soapbox
Factor has changed greatly since its inception. Recent language changes such as new slots and the addition of the cleave combinators, as well changes I'm working on right now, such as inheritance and multimethods, should be the last major changes before Factor 1.0.
Last year, Daniel Ehrenberg published a roadmap for Factor 1.0. It seems that we're on track at this stage. After the language features are done, I'm going to review the entire core library and incorporate new idioms and features where possible. To put this in context, I don't think the Java implementation is far from exemplary Java code, the SBCL source code far from exemplary Common Lisp code, and Squeak's core is far from exemplary Smalltalk code. I don't want Factor to follow this trend and release 1.0 with a huge, bloated, incomprehensible implementation: I want the code to be simple, clean, clear and consistent. With the new language features and the new idioms I've been learning along the way, this goal should be achievable; in fact the core is for the most part pretty well-designed, except for some dark corners here and there.
Because Factor is a new language, implementing Factor is more work than implementing, say, Scheme or Common Lisp: when you're working against a fixed spec, for a language that has had a wealth of programs written in it, representing decades of experience, there isn't so much need for experimentation, trying different approaches. Factor is different. Factor is a new language, and the job of implementing Factor isn't a race to a certain finish-line that we can see from the start, but rather a mind-expanding journey, full of learning opportunities, joy, frustrations, obstacles and wrong turns.
On the flip-side, I can change the language to suit the implementation; if a feature is ugly to implement and makes the system more fragile, then it is better not to include it, and I have that freedom.
Once inheritance is done, I'm going to return to working on the web framework, and hopefully you will see more practical blog posts from me again, instead of philosophical junk!
4 comments:
New combinators is simply "Wow!"
I'm currently developing DNS library, using latest stable Factor. Serialisation of complex tuples into message was pain in the ass because of hevvy stack shuffling. So I used someting like "[ call ] 2each" on arrays of quotations. With new combinators it should be much simplier.
Max.
PS Excuse me for bad english. It is not my primary language, and I had very little practice during last years.
Max.
Hi Max,
Your English is great. I'm very interested in your DNS library; you should go to the IRC channel or post to the mailing list.
Nice article Slava. Always good to hear where we are with Factor and what features are being worked on.
Post a Comment