Monday, May 29, 2006

Redesigned Cocoa binding reduces image size by 2Mb

The old Cocoa binding generated one Factor word for each Objective C method. With only a dozen or so classes imported, this added 2 megabytes to the image size. The new binding instead stores a global mapping of selector names to return/argument types, and only compiles distinct words for each possible return/arg type combination. This reduces the number of stub words from several thousand to 22!

The new syntax is a bit more verbose, however it no longer requires importing a vocabulary for each class you intend to use. Instead, you use this syntax:
NSObject -> alloc -> init

This is in fact equivalent to the following, since -> is a parsing word:
NSOject "alloc" send "init" send

Calling methods in a superclass is done in a similar way:
SUPER-> dealloc
"dealloc" send-super

The compiler transforms the calls to send; in fact, the lookup of the selector is done at compile time, and the compiler turns the above snippet into something like this:
T{ selector f "alloc" f } selector G:78604
T{ selector f "init" f } selector G:78604

The selector object caches the Objective C selector (a sort of internalized string, analogous to a Lisp symbol). The gensym is the cached message sender which only depends on the arglist. Here is a typical definition:
"id" f "objc_msgSend" { "id" "SEL" } alien-invoke

So the new binding style uses less space in the image, and is no slower since in the end the same code is generated, except with less redundancy and duplication. There is one disadvantage, though: if two imported methods have the same selector but different argument or return types, they will clash and you will not be able to send one of the two methods using send or send-super. There are various ways around this; either make method dispatch slower and look up arguments at runtime (this also makes it harder, but not impossible, to make it compile) or provide an alternative form of send, perhaps send*, which takes as class name and can be used in the case of clashes.

SqueakNOS project

It looks like the effort to make Squeak into an operating system has started again. Check out SqueakNOS; they have a bootable Live CD for x86 with VESA graphics, and mouse and keyboard input.

Saturday, May 27, 2006

Window configuration (and content) now saved in the image

I implemented a useful new feature. When you start the UI it no longer resets all state, instead it tries to pick up where it left off. This was easy to implement. First, I made it so that window positions are stored in the 'world-loc' slot of the world tuple associated to the window. Next, I made the UI startup routine test if the windows hash is non-empty; and if so, it would iterate over the hashtable's values, and open windows with the gadgets stored therein.

This doesn't work perfectly yet; for example, the stacking order of the windows is not saved. However this is easy to fix.

Adding a feature like this would be difficult to a language which was not image-based. The key thing is that everything is saved. You can open a browser window, resize it, scroll the panes, open a listener, push some values on the stack, save the image, close Factor, re-open it, and your values are still on the stack and everything is as you left it. You can even cause a parse error in the listener, save the image, re-open the image, and invoke a restart as if nothing had happened!

Of course, I/O streams, aliens and other objects will expire, so its not quite as flawless as I described above; but in most cases, it Just Works.

Wednesday, May 24, 2006

Linspire Linux distribution standardizes on Haskell

From Chris Double's web log: the Linspire Linux distro is standardizing on Haskell. It would be interesting to see a well-designed language used in place of the mess of Perl/Python/bash scripts that infest a Linux distro.

Cocoa services support

I've implemented Cocoa services support for Mac OS X. If you drop your into an Applications directory, then next time you log in, there will be a Factor submenu in the Services menu in each application.

Screenshots demonstrate this feature best.

So we launch TextEdit, and type a short Factor expression...

Then we invoke the "Evaluate Selection" service...

And the result is pasted right back into the TextEdit window:

One problem is that if Factor is not running, invoking the service does not launch it. I'm not sure how to fix this; I guess LaunchServices cannot locate the for some reason.

I also implemented support for open events, so dropping a file from the Finder onto the Factor dock icon will open a listener window and run the file.

Better handling of undefined words; restarts

Everybody who tries Factor has run into this situation:
  \ layout see
An unhandled error was caught:

Parsing <interactive>:1
\ layout see
"Not a number"

:s :r :c show stacks at time of error
:get ( var -- value ) accesses variables at time of error
:error starts the inspector with the error
:cc starts the inspector with the error continuation

Oops! You forgot to use the right vocabulary -- if you know which one, otherwise its another call to apropos:
  USE: gadgets-layouts
\ layout see
IN: gadgets-layouts : layout
dup gadget-relayout? [
f over set-gadget-relayout? dup layout* dup
] when drop ;

This is somewhat irritating, and the error message ("Not a number") is not helpful at all. It arises because of how the Factor parser works; first it looks up each token in the vocabulary search path, and if it does not find a word named by this token, it attempts to parse it as a number.

Now after my latest changes, this situation is handled more gracefully:
  \ layout see
An unhandled error was caught:

Parsing :1
\ layout see
"No word named layout found in current vocabulary search path"

The following restarts are available:
0 :res Use the word IN: gadgets-layouts : layout
:s :r :c show stacks at time of error
:get ( var -- value ) accesses variables at time of error
:error starts the inspector with the error
:cc starts the inspector with the error continuation
0 :res
IN: gadgets-layouts : layout
dup gadget-relayout? [
f over set-gadget-relayout? dup layout* dup
] when drop ;

After the restart is invoked, the relevant vocabulary is automatically added to the search path, and parsing continues. If more than one word with the given name is defined in the dictionary, a number of restarts are offered, one for each possible vocabulary.

Here is how it works behind the scenes:

The condition ( error restarts -- restart ) word signals a recoverable error. The error parameter is the actual error message; it is wrapped inside a special condition object, together with the restarts parameter (more on it in a second) and the current continuation. The restarts parameter is an association list, associating a string description of a restart to an object. If the user chooses to restart from this error, the object associated with the chosen restart will be pushed on the stack, and execution will resume after the point where condition was called.

Here is an example:
: restart-test
"This is broken" {
{ "Indeed" t }
{ "I disagree" f }
} condition
"You choose " write . ;

Running this example will yield the following error:

An unhandled error was caught:

"This is broken"

The following restarts are available:
0 :res Indeed
1 :res I disagree
:s :r :c show stacks at time of error
:get ( var -- value ) accesses variables at time of error
:error starts the inspector with the error
:cc starts the inspector with the error continuation

Invoking either restart using 0 :res or 1 :res will rewind execution and push a boolean true or false on the stack. The portion of the word definition after the call to condition now executes; it prints the received boolean on the stack.
  0 :res
You choose t

The problem with this approach is that any exception handlers further up the chain might have already closed streams and other external resources before the restart is invoked. In this case the restart will not function properly. What CL does is not execute any clean up handlers before the user picks a restart; this involves two trips up the catch stack, one to collect restarts and possibly present a debugger, and another one to call the relevant cleanup hooks, if any, depending on what restart is chosen. I'll investigate this further and pick a good solution. In any case I don't intend to use this feature as extensively as CL uses restarts, so I hope to avoid hairy situations.

Saturday, May 20, 2006

Some fun with the UI

I implemented two new useful tools in the UI. First up is an apropos tool:

Next up, a multi-pane browser, inspired by the Whisker browser for Squeak; the idea here is that in each column, you can click on several items which are all shown at once in the next column:

Both tools are in the DARCS repository and will be part of Factor 0.83.

If you are interested, here is the source code:

Thursday, May 18, 2006

Quotations now a first-class type; conses removed

I finished pushing a large set of patches to the main repository which implement the changeover I've been talking about for a while now.

What follows is a rundown of what has changed.

First of all, literals like [ 1 2 3 ] are now instances of a quotation type. They implement the sequence protocol and behave much like arrays. Any sequence can be converted to a quotation via >quotation. The difference between a quotation and an array is that like the list-based "quotations" of old, these objects can be passed to the call and if primitives which are the basic building blocks of all higher order functions in Factor.

It used to be that f and [ ] both parsed to the same object; this is now no longer the case. The object [ ] is a length-zero quotation.

Parsing words will need to be changed. To begin a new nested parse level, push f not [ ]. To add to the parse tree, use parsed instead of swons. The end result is a vector, not a list in reverse order. Replace reverse calls with >quotation or similar. If this terse description made no sense, it will be more detailed in the documentation.

A lot of code did not use the list-specific words, but it used list literals to hold data. Using [ ] literals for non-code data is still okay. I'd prefer if people used arrays { } for that purpose, it just makes the code a tiny bit more clear.

Tuesday, May 16, 2006

Factor 0.82 now available

Factor 0.82 is now available. This release brings an overhauled compiler and various other improvements.

Thursday, May 11, 2006

Factor 0.82 almost ready

I'm tying up some loose ends in the register allocator, and updating the AMD64 compiler backend. Factor 0.82 should be out in a few days. Apart from the compiler overhaul, this release only brings a few minor improvements and bug fixes.

My plans for 0.83 include:
  • Major improvements to the UI
  • Compiler internals documentation
  • Possibly the new array quotation interpreter, and phasing out cons cells
  • Getting Factor running on Mac Intel

Going back and forth between UI and compiler work for each release seems to be the pattern for the last few releases. In 0.84, I will probably once again work on the compiler, and implement complex float intrinsics using SSE2 and AltiVec, as well as some new dataflow and template optimizations I've been planning out.

Tuesday, May 09, 2006

Cool Common Lisp project

It appears as if more and more people are building really neat applications using McCLIM, such as cl-wav-synth. Not bad for an interpreted AI language from the 50s only used by academics whose sole datatype is the list, as some seem to think anyway :-)

Floating point intrinsics working on PowerPC

The compiler now inlines machine code for floating point operations instead of calling the runtime. This results in a performance improvement on my 2.5 Ghz PowerPC G5:

Here are the x86 results (Pentium 4 1.8 GHz):

The Before column is 0.81, the After column is 0.82, and times are in seconds.
Yes, a five year old box beats my brand new G5, but I suspect this is because my compiler doesn't do instruction scheduling.
Bigger gains will come as the higher levels of the compiler improve, resulting in more inlining.

Thursday, May 04, 2006

Java 6 last release to support PowerPC macs?

So while I've been considering dropping support for x86 CPUs older than 5 years (and I decided I won't do it, and just offer two boot images for x86 instead), Apple released a Java 6 beta for Intel macs a few days ago, indicating it will follow suit with a PowerPC version soon. However the word is that this will be the last release to support PowerPC macs. Just goes to show that "write once run anywhere(tm)(R)(C)" is a marketing myth, and indeed Java is one of the least portable languages out there.

Wednesday, May 03, 2006

Is SSE2 mainstream?

The SSE2 extensions to the x86 architecture debuted when the Pentium 4 was released. SSE is a vector processing extension which only dealt with single-precision floats. SSE2 supports double-precision floats. What this means is that as of the Pentium 4, the x86 finally has a sane FPU architecture (the x87 is horrible to program for; it is stack-based, but only has a depth of 7 cells so you get the disadvantages of registers together with the disadvantages of a stack).

What I'm wondering is how widespread Pentium 4 (and above) chips are, and if the x86 backend of the Factor compiler should require SSE2 for floating point operations.

If you have thoughts, share them in the comments. Most likely, nobody will respond, and I will go ahead with my evil plan to drop support for Pentium III chips and older :-)

Monday, May 01, 2006

Assembler templates

I've finished updating the PowerPC and x86 backends for the new compiler design, and new boot images are up. AMD64 will have to wait a few days.

If you study the compiler source, you will find many calls to the words with-template and define-intrinsic; the former is called during code generation time to load inputs from the stack to registers, and store registers back to the stack. The latter associates a word with a compiler intrinsic quotation which passes the specified arguments to with-template.

Here is a typical definition:
: generate-fixnum-mod
#! PowerPC doesn't have a MOD instruction; so we compute
#! x-(x/y)*y. Puts the result in "s" operand.
"s" operand "r" operand "y" operand MULLW
"s" operand "s" operand "x" operand SUBF ;

\ fixnum-mod [
! divide x by y, store result in x
"r" operand "x" operand "y" operand DIVW
] H{
{ +input { { f "x" } { f "y" } } }
{ +scratch { { f "r" } { f "s" } } }
{ +output { "s" } }
} define-intrinsic

This looks like a macro assembler; the operands "x", "y", "r" and "s" are assigned at compile time, and used to generate an assembly segment involving various instructions, such as MULLW and SUBF.

The complete list of specifiers which can be passed as keys in the hashtable there is as follows:
  • +input - an array of pairs, where each pair is of the form { vreg string }; the string is an operand name, and the vreg is either an integer or f (meaning any free register can be assigned). The with-template form compiles code to load objects from the stack into the input registers
  • +scratch - this specifier is of the same format as +input, except here the registers are untouched. The body of the template can use them for any purpose, including outputs.
  • +output - a list of previously-assigned operand names holding values which should be moved to the stack. Registers allocated by +input and +scratch can be listed here.
  • +scratch - a list of previously-assigned operand names whose contents should not be assumed to have remained constant after the template finishes executing. Usually a template does not modify its inputs; however if it does, they must be listed here to avoid miscompiling code.

I'm going to implement a few assorted cleanups and optimizations now, and then move on to floating point intrinsics, which should provide a nice performance boost.