Wednesday, December 27, 2006

Announcing apps/help-lint

Lately I've been building up a small library of routines for automated checking of Factor's documentation. My most recent effort was the doc test clone for testing code examples. Well, today I cleaned things up, added some new inspections, and put everything together into a new apps/help-lint module. It now performs the following checks against word documentation:
  • ensures examples run and produce stated output
  • ensures $see-also elements don't contain duplicate entries (I always make this mistake!)
  • ensures $module-link elements point to modules which actually exist
  • ensures that $values match the stack effect declaration
  • ensures that word help articles render (this catches broken links, improper nesting, etc)

Some of these checks will make their way into the core of the help system over time, others will remain as stand-alone tools. now running latest darcs Factor; UI documentation

The Factor web site is now running the latest code. In particular, this means you can browse the latest Factor documentation online. This includes documentation for the UI, which is something I've been working on for the last couple of weeks. The documentation is always a work in progress, but I think the fundamentals of the UI are now well-documented and people should have a much easier time writing their own gadgets.

The previous HTTPD instance running on Factor 0.85 never crashed, and ran continuously for a month and a half. I hope 0.87 keeps it up, but who knows, the web site might be down by the time you read this!

The parse time -vs- run time duality

In Factor, anything which can be done via syntax in a source file can also be done at runtime. This lets you construct code on the fly, and define new words with the new code, just as if they were written in a source file:
TaskParsing wordOrdinary word
Defining a word : define-compound
Defining a symbol SYMBOL: define-symbol
Defining a generic wordGENERIC:define-generic
Defining a generic wordG: define-generic*
Defining a method M: define-method
Defining a tuple class TUPLE: define-tuple

Monday, December 25, 2006

Another web service example in Factor

Daniel Ehrenberg has an example of using libs/xml and libs/http-client to search the web using the Yahoo! REST API. The code is very short, I like it. Factor makes XML usable.

Saturday, December 23, 2006

Scheme 48 1.4 released

Scheme 48 is a notable Scheme implementation. The VM is implemented in a low-level Scheme-like language PreScheme which is translated to C. This is the same implementation technique as used by Squeak.

Friday, December 22, 2006

Funny quote

Seen on JavaLobby. This is simply the best thing I have ever read in my entire life! It made my day, no, it made my week! I love it. Java programmers are awesome.

"Closures are the OO euphemism for function pointers which in OO terms is a dirty word because pointer is a dirty word."

Tuesday, December 19, 2006

Google removes SOAP search API

This is sad. I understand why they do this (there's no ad revenue from web services) but it comes just as Chris Double added a Google search client to apps/. It would have been nice to use this client to implement searching for web sites written in Factor, such as Matthew's upcoming weblog app, parsing the results of the SOAP call and displaying them as HTML. But now we'll have to find another way to do that.

JazzScheme: Scheme environment and IDE for Windows

One of the members of MSLUG has open-sourced his JazzScheme project, which he has been developing for 10 years now.

Weblog software in Factor

Matthew Willis has been working on a weblog implemented in Factor. He's using the following libraries:
  • /libs/httpd/ to serve pages
  • /libs/furnace/ for action handling and templating
  • /libs/tupledb/ for O/R mapping
  • /libs/sqlite/ for persistence

Monday, December 18, 2006

Solved a conjecture

I proved Conjecture 5.9 on page 14 of Barry Jessup's paper in the case of two-step nilpotent Lie algebras. I had a proof for the case where the dimension of the center was equal to 2 a few weeks ago, and discovered a nice generalization this morning after my supervisor suggested some simplified notation to clarify the argument. The proof is nothing special; there are some tricky bookkeeping details and it runs to about 5 pages since it is a double induction, however it does not use any advanced tools from algebraic topology. I'm still proud I managed to prove it, because I've been working on this since April. Generalizing it to the case of an arbitrary nilpotent Lie algebra (or finding a counter-example) should be fun, and might require the use of spectral sequences. The current proof relies on what is essentially a spectral sequence collapsing at the first term. This is the second original result of my thesis. My previous result shows that the central representation is faithful for a certain class of two-step algebras. This latest result shows that it is non-trivial for all two-step algebras (however we know of two-steps where the central representation is not faithful).

No math

I find it interesting that relatively few Factor words in the standard library directly call arithmetic words.
  { + - * / /f /i < <= > >= shift bitand bitor bitxor bitnot neg recip }
[ usage ] map concat prune length .
all-words length .
6561 405 /f .

That is, less than 1 out of 16 words directly calls an arithmetic word.

What about APL-style vector math?
  { v+ v- v* n*v v*n v/ v. norm vneg vmax vmin }
[ usage ] map concat prune length .

Right now vector math is mostly only used in the UI, however it is used to great effect. Layout calculations become much simpler.

Many (most?) source files don't even USE: the math vocabulary.

Kind of makes the arguments about postfix arithmetic not being intuitive a moot point.

UI documentation

Documentation for the UI library is something which many people have asked for and it is long overdue. I have been working on it for the past week; so far I've managed to document most of the relevant words, and I'm now writing help articles which tie everything together. This is the last major piece of work I plan to finish before 0.87.

I implemented something else today which I was previously planning on deferring until 0.88; stack traces for errors thrown from compiled code. I had some half-working code for this some weeks ago; I dusted it off and made it more robust. It still needs testing on Windows and Linux/PPC (the set of people who use Linux/PPC and follow Factor seems to have cardinality one, right now, though).

Interestingly, the old Java implementation of Factor could produce stack traces for exceptions thrown by compiled words. This was done by wrapping every compiled word in a try/catch which would cons the word onto a list stored in a global variable in the catch handler. Then when the stack was unwound to the top-level interpreter loop, this variable would be passed on to Factor code which handles the error. CFactor does this in a much more hardcore way, by walking the native stack after the error is thrown.

Cleaned up embedded.factor

Matthew Willis reported a bug against the embedded.factor templating library in the Factor HTTP server, which I now fixed.

Recall that embedded.factor evaluates Factor code between <% and %>, and outputs the rest as text.

I wanted to post the code because I also cleaned it up and now it looks really nice. Factor is all about writing simple code to get useful things done, and I don't think many languages manage to implement a web templating solution with less complexity than Factor:
! Copyright (C) 2005 Alex Chapman
! Copyright (C) 2006 Slava Pestov
! See for BSD license.
IN: embedded
USING: sequences kernel parser namespaces io html errors ;

: process-html ( parse-tree string -- parse-tree )
dup empty? [ drop ] [ parsed \ write-html parsed ] if ;

: process-embedded ( parse-tree string -- string parse-tree )
"<%" split1 >r process-html r> "%>" split1 >r (parse) r> ;

: (parse-embedded) ( parse-tree string -- parse-tree )
dup empty?
[ drop ] [ process-embedded (parse-embedded) ] if ;

: parse-embedded ( string -- quot )
[ f swap (parse-embedded) >quotation ] with-parser ;

: eval-embedded ( string -- ) parse-embedded call ;

: run-embedded-file ( filename -- )
dup file set ! so that reload works properly
dup <file-reader> contents eval-embedded
] with-scope
] assert-depth drop ;

: embedded-convert ( infile outfile -- )
<file-writer> [ run-embedded-file ] with-stream ;

Bear in mind that this implementation properly supports mixing combinators with templating, so for example you can write the following in the middle of your template:
<% 10 [ %> <p>Hello</p> <% ] times %>

And as a final note, I'm aware that mixing markup and logic is a bad thing. However Factor encourages factoring problems into many small words, so the template should not be doing any heavy processing, and instead only calling a handful of words here and there, to achieve the same effect (but less complexity and more elegance) than JSP taglibs and such.

Wednesday, December 13, 2006

'doctest' in Factor

The doctest module for Python scans your source file for code examples in docstrings, and evaluates them to check against the correct result. The documentation for this module seems to suggest it is a rather non-trivial affair (and looking at the code confirms this). In comparison, implementing this feature in Factor takes a few minutes:

: all-word-help ( -- seq )
all-words [ word-help ] map [ ] subset ;

: all-examples ( words -- elements )
[ \ $example swap elements ] map [ empty? not ] subset ;

: example-valid? ( elements -- ? )
1 tail
[ 1 head* "\n" join eval>string "\n" ?tail drop ] keep
peek = ;

: incorrect-examples ( words -- elements )
[ [ example-valid? ] all? not ] subset ;

all-word-help incorrect-examples [ . ] each

Factor's help is written in a structured markup language where examples are explicitly denoted as such (and in the UI, you can click them to evaluate them). So all you have to do is iterate over all words in the dictionary, extract their documentation, collect $example markup elements, and ensure they do the right thing. Note that $example elements look like so:
{ $example "2 2 + ." "4" }

All but the last element of the array are strings which are joined together with newlines in between to form the expression to be tested, and the last string is the expected output.

Using this tool I actually found some typos in the Factor documentation. I think the idea is great, and while Python makes it easier than something like Java (where you'd have to write a Doclet and use BeanShell, or something), Factor makes it even easier because it has an help system with a markup language, not just docstrings.

I have some other bits of code sitting around for checking consistency of parameter names in documentation, and checking for broken links and markup. One day I'm going to put them together into a single 'lint'-style tool for documentation.

Factor module system considered harmful?

Eduardo Cavazos doesn't like the current Factor module system, and has a proposal for a replacement. In a nutshell, he wants to unify the concept of a module with a vocabulary, and remove the conceptual overhead that goes with the distinction. I agree with most of his ideas. I wrote a lengthy post to the Factor-talk mailing list in response to an e-mail from Eduardo which discusses Eduardo's module system idea, and suggests some additions/improvements which would fix even more shortcomings of the current module system. I've reproduced the post below:
Hi Eduardo,

The more I think about it the more I like your idea. It would solve some
problems I perceive with the current setup. The first one, I think, is
the one you're focusing on, but the others strike me as easy
consequences of the design:

1) Too many concepts. Vocabularies, modules, source files... USING:,
REQUIRES:, IN:, PROVIDE:. The module system feels hacked on to an
existing core which has a notion of vocabularies, and this is exactly
what happened. I implemented the module system initially as a half-hour
hack to make it easier to load libraries from contrib/ without manually
loading dependencies. It did this job well, but now we both want to use
it to /structure code/, and later on to edit code in the UI, which is
something else entirely.

2) Suppose you have a source file where you write,

IN: foo
: a ;
: b a ;

You load the source file into Factor. Then you add a new word definition
: c b ; but you add it _before_ the definition of b. The file still
loads, however, because the definition of b is present in the vocabulary
when c is parsed.

But if you start Factor with a clean image, the source file won't load
(and bottom-up ordering of definitions a good thing; in CL,
automatically interning symbols in the wrong package can be a source of


The new module system can fix the problem (which is that the modified
source file could load without an error in the first place) as follows.
You start by marking words in a module with a special flag before
reloading the module; then as each definition is read, the word's flag
is reset. If a definition refers to a word with the flag still set, a
parse-time error is raised. When the module is finished loading, the
module system can also check if any words which have the flag set are
still referenced by words in other modules; if so, a message can be
printing instructing the programmer to either reintroduce the word, or
refactor those modules, etc.

3) Suppose you have a source file X.factor where you write,

IN: x
: a ;
: b a ;

And another file Y.factor,

USING: x ;
IN: y
: c b ;

You load X.factor and Y.factor, then you decide to move the b word from
the x vocabulary to the y vocabulary. So you move the word and reload
the two files, but the definition of b in x is still in the dictionary
and this can cause confusion when testing the code, or if y has a USING:
x ; after the IN: y.


The new module system can fix this, again, by using 'obsolete' flags on
words. After loading a module, any words which are marked as being old
have not been redefined by the updated source file, and thus they can be

4) Source files are allowed to define words in arbitrary vocabularies.
By unifying the notion of a source file and a vocabulary, you completely
do away with this 'feature'.


Redefining words in other vocabularies is not a problem in itself, but
it is something I want to move away from. Right now for example you
cannot load the Cocoa and X11 UI backends at the same time, and use the
Cocoa instance while running Factory, because the two UI backends will
redefine the same DEFER:'d words.

Running the Cocoa UI and Factory at the same time is contrived use case,
but with a structure editor you want to load code before editing it, and
not being able to load certain modules is totally unacceptable. So
instead, I want to restructure the UI code, as well as I/O and compiler
backends, so instead of redefining words it uses generic words,
dispatching in a singleton tuple representing the "backend" in question.

If the module system enforced such a design, it would motivate me to do
this refactoring faster :)

Now I'm completely against a language which enforces *arbitrary* design
decisions, because it requires bending over backwards to implement
certain things; for example, something that only supports
message-passing OOP is too restrictive in my opinion. But enforcing
design decisions which demonstrably lead to code which is cleaner and
easier to write is a good thing, for example memory safety and garbage
collection. (I'm not going to get into whenever I think static typing
enforces arbitrary design decisions or improves code quality, and I
don't want any replies to this thread to turn into a dynamic -vs- static
discussion, please, since this has nothing to do with the module

5) A continuation of the theme in 1/2/3/4. I've run into this problem
from time to time. Suppose you have a file a.factor,

IN: jelly

: toast ... ;

And another file, b.factor,

IN: jelly

: toast ... ;

You forgot that a 'toast' word already exists, and redefined it with a
different meaning! Then you test some caller of 'toast' in a.factor only
to discover it doesn't work. Of course unless you decide to put the
words in a.factor and b.factor in different vocabularies, the new module
system as Eduardo describes it won't solve this issue outright, since
the naive programmer could simply concatenate a.factor and b.factor into

IN: jelly

: toast ... ;

... words using toast #1

! Oops, didn't notice this is a redefinition... : toast ... ;

... words using toast #2

However, just like referring to a word with the 'obsolete' bit still set
could be a parse time error, /redefining/ a word with the 'obsolete' bit
/not/ set could be an error too, unless the old definition was a DEFER:.

6) Documentation will be simplified. Right now we have several parallel
conceptual 'trees' in Factor;

- vocabularies contain words - modules contain source files which
contain definitions (where definitions includes words, methods, etc) -
the documentation for the core library is rooted at a 'handbook' page
which links to multiple articles, each one documenting a major subsystem
of Factor, but many such top-level articles don't really correspond to
modules and vocabularies

If modules and vocabularies were unified, I could redo the core
documentation in a more structured way, with each vocabulary/module
having a main documentation page, containing as subsections more
articles, word documentation, and so on. The handbook could still start
with an 'entry points' page with links to the most commonly needed
topics, much like the 'cookbook' now.

Note that all this could be done without drawing a distinction between
'loading' and 'reloading' from the viewpoint of the user, and even more
importantly it would not require extra bookkeeping beyond tracking
loaded modules and words (for instance, implementing the first 3 items
can be done with the current setup, by recording a list of definitions
loaded from every source file in a new data structure, but maintaining
this would be extra burden).

7) What should be done with unsafe words in -internals vocabularies is
an open question. I'm against having private words and such, because I
believe any word should be interactively testable. One way to get around
this is to have private words, but also have another parsing word
FRIEND: which is like USING: but makes private words available. But this
would complicate the implementation (and description!) of the vocabulary
search path. Instead, it might be best to simply split off -internals
vocabs into separate files and treat them like normal

Let me know what you guys think,


Monday, December 11, 2006

Compiling Factor to JavaScript

Chris Double writes about compiling a subset of Factor to JavaScript. I'd like this to evolve into something which lets me write web apps which use client side JavaScript to manipulate the DOM and communicate with the server ("AJAX"), without having to write JavaScript by hand, or write code which constructs JavaScript code by concatenating strings. Instead, I want to write the whole web app in Factor, with some code running client side and some server side. Perhaps Chris Double's concurrency library can be used to communicate between Factor on the server and JavaScript Factor too.

Blogger sucks

I know many people idolize Google (and this seems to give their employees unlimited license to blog with complete ignorance), but other than the search engine and Google maps, they don't have any good products. Everything else is typical Java web trash. Blogger seems to be suffering from more reliability problems lately, and the "publishing" process is annoying. Also I don't like the way newlines in the blog entry editor are converted to <br>'s; this makes it hard to paste HTML from another source, you have to merge it all into one line first (but there might be a way to disable this line break behavior). All in all, blogger might be great if you just want to post rants about how dynamically typed languages don't support refactoring (ahem), but it is a poor product overall. Still beats JRoller, though.

Sunday, December 10, 2006

Live music performance with Squeak

Craig Latta, lead developer of Spoon (a Squeak fork with an emphasis on running developer tools in a different image from the application being developed) uses Squeak for live music performance. Cool stuff.

Thursday, December 07, 2006

Generating a minimal image

I played around with bootstrap and the module system today, and got Factor to generate a 600kb factor.image. The image does not include the compiler and UI, however it does display a listener, and has all the math and collections words in it. If there is any interest in this, I will clean this facility up and make it easier to use. Personally I don't have much interest in making images small. I think the current size of around 6mb is fine, considering it includes full documentation and developer tools.

Division by zero hoax

James Anderson is perpetrating an elaborate hoax, claiming to have finally "solved" the issue of division by zero in the real numbers. The BBC is apparently unaware that this is a joke.

As so many others who have tried to "solve" division by zero have found out in the past, issue here, of course, is not that one can't divide by zero; you can modify the axioms of the real line and define division by zero in any way you want. The issue is that changing these axioms makes the real line no longer satisfy the axioms of a ring, and so much of algebra and topology can no longer apply. Of course one is free to invent their own ring theory and topology, but why bother? A century of research into abstract algebra will no longer hold, and it is not clear what good such a number system is.

All in all, James does a great job of parodying how out of touch with reality computer science professors really are. The paper is written in Microsoft Word, not LaTeX -- check out the unreadable notation and poor formatting, its hilarious.

Wednesday, December 06, 2006

Tuesday, December 05, 2006

Recent UI improvements

Since the dawn of time, Factor's prettyprinter would print a "#" in place of a sequence when nesting was too deep, and "&" in place of a circular reference. I have enhanced this so that these symbols are printed as presentations of the object in question. So if you print an object while nesting depth is limited, you can still descend deeper into the object's structure simply by clicking. Of course the inspector has always offered this and it is still around, but it is useful to be able to do this with prettyprinted output as well.

Another feature I implemented is proper multi-line input handling. In the UI listener, if you type the start of an incomplete expression, for example : foo, pressing Return automatically inserts a newline. What the UI listener does is attempt to parse the input, and if it does not parse completely, it simply retains the contents of the input area, instead of passing it to the evaluator. This is very nice for entering multi-line expressions.

Factor 0.86 and previous versions already supported multi-line input in the UI listener but in a more awkward way; if you pressed Return with a partial expression entered, it would go to the evaluator and disappear from the input area, at which point you could still continue entering the remainder of the expression, but you could not edit what had already been parsed. You could also explicitly insert a newline with Shift+Return. I think the new approach is far more usable, especially together with the improved parse error handling as described below.

If you get a parse error, instead of simply dumping the parse error to the output area and forcing you to pick the previous input from the history, fix it, and try again, the listener now positions the caret at the location of the parse error and displays the parse error in a popup (which idiotically obscures the input field right now, but that's going to get fixed).

Finally, if there are any restarts, for instance offering to add a vocabulary to the vocabulary search path so that a word can be found, they are presented in a keyboard-navigatable list. No more trying to type 0 :res, mistyping it as 0 ;res, and losing your restarts, forcing a trip to the history, and another attempt at entering the input! Instead you use the Up/Down arrow keys and the Return key to pick the relevant restart.

I'd post some screenshots, but the appearance is not finalized and its hard to get a feel for the interactive behavior of the new listener in screenshots. I think when 0.87 is out, I'll try my hand at making a screencast instead.

I think the UI listener is now rather sophisticated, offering better usability than most other interactive evaluators I've seen. The big hole in the functionality right now is that you cannot select and copy text from the output area. This is not trivial to implement, since the output area actually consists of gadgets, one for every piece of styled text or presentation which was printed. So to be able to select and copy I would need to implement some kind of cross-gadget text selection protocol. However McCLIM already does something like this, its not rocket science, and it shouldn't be too hard to implement.

Factor 0.87 approaching

Factor 0.85 introduced code GC, and 0.86 introduced the local roots refactoring. Both of these changes had their fair share of bugs. I think 0.87 shakes most of them out; I used a simple looping regression test:
: foo \ = reload foo ; foo

What this does is reload the source file defining the = word, which in turn causes most of the library to be recompiled. The word then calls itself, causing an infinite loop. Eventually the code heap fills up and code GC is performed.

Factor 0.86 would crash pretty quickly with this test, because of problems with local root recording. I would modify the runtime to trigger a GC after every heap allocation, which always moves objects, and thus always breaks code which does not record local roots properly. At first this would crash Factor after a few seconds, but after working these issues out, I hit a new one; Doug Coleman reported that this tests crashed Factor on Windows after a few hours. I managed to reproduce it and the cause was surprising: if the last block in the code heap was allocated and not free, the free list would be corrupted and all blocks from the last free block onwards would be marked free. Oops!

I don't claim Factor is bug-free, but I think 0.87 is a lot more stable than 0.86. Every time I make a major change to memory management, it takes a few point releases to shake the bugs out; the last time this happened was with generational GC some time in the 0.7x series. Overall I've had virtually no "Heisenbugs" which were impossible to reproduce, unlike my experience with jEdit. In fact most Factor bugs I've had fell into one of several categories:
  • Insufficient coverage of corner-cases; this describes the code GC bug above.
  • Portability breakage: stuff that works on one platform but breaks on another because some API call doesn't behave as one would expect
  • Insufficient validation: some runtime primitives and low-level words would not validate their inputs enough, and create inconsistent objects and/or cause some form of corruption when handled bad values.

Doug Coleman has continued enhancing his random compiler tester in order to catch the last type of bug. When he first started his project I was somewhat skeptical, however as the tester has grown in sophistication it has uncovered some real bugs.

One thing I have noticed is that I rarely run into runtime type errors, or stack balancing errors. Logic errors certainly account for the bulk of my bugs. However I have found that compiler-checked stack comments are nice to have. I am still considering adding mandatory static stack effects, and optional static typing, at some point in the distant future, however I don't have any concrete plans yet.

Thursday, November 30, 2006

Simple XML-RPC example - Factor talking to a Common Lisp web service

This example makes a paste to the LispPaste service; it requires the latest Factor from darcs:
IN: lisppaste
REQUIRES: libs/xml-rpc ;
USING: arrays kernel xml-rpc ;

: url "" ;

: lisppaste ( channel user title contents -- response )
4array "newpaste" swap <rpc-method> url post-rpc ;

You can use it like so:
"#concatenative" "jane_doe" "Test" "Testing testing" lisppaste

But please don't flood our channel with useless pastes. As soon as Dan fixes some bugs in the XML-RPC code, I will extend this example to have a GUI which offers a channel list.

rot13 in Factor

Daniel Ehrenberg implemented the rot13 algorithm in Factor. I added it to the demos/ directory; it is very simple and elegant:
IN: rot13
USING: kernel math sequences strings ;

: rotate ( ch base -- ch ) tuck - 13 + 26 mod + ;

: rot-letter ( ch -- ch )
{ [ dup letter? ] [ CHAR: a rotate ] }
{ [ dup LETTER? ] [ CHAR: A rotate ] }
{ [ t ] [ ] }
} cond ;

: rot13 ( string -- string ) [ rot-letter ] map ;

SBCL 1.0 released

Version 1.0 of Steel Bank Common Lisp has been finalized. Bill Clementson writes,
Well, the SBCL web site hasn't been updated yet and there's been no official announcement; however, SBCL has been tagged 1.0 in CVS, and the 1.0 tarball has been put up on SourceForge, so I guess this is it.

Congratulations to the SBCL developer team and all contributors. This is milestone in the history of Common Lisp.

Sunday, November 26, 2006

Using Factor for shell scripting

For a while now you could pass a source file to the Factor runtime to have it be run on startup. Now you can also pass a code snippet to evaluate:
./f -e='2 2 + .'
If you want Factor to exit immediately after,
./f -e='2 2 + .' -shell=none

Color picker example

I put together a short example showing how the "models" feature in the Factor UI can be used to wire together interfaces with very little code. The example displays a window with three sliders for changing the R/G/B values of the color shown in a preview pane. Also, a label displays numeric R/G/B values.

The code is in darcs, examples/color-picker.factor. Here is a walk-through:

We start with the usual vocabulary boilerplate:
IN: color-picker
USING: gadgets-sliders gadgets-labels gadgets models arrays
namespaces kernel math prettyprint sequences ;

We have a word which creates a slider with a maximum value of 255, and a scrolling increment of 1 when clicking the up/down arrows:
: <color-slider> ( -- gadget )
1 over set-slider-line
255 over set-slider-max ;

Now, a word which creates a gadget whose fill color comes from a model:
: <color-preview> ( model -- gadget )
<gadget> { 100 100 } over set-rect-dim
[ set-gadget-interior ] <control> ;

A word which creates a model that wraps an underlying model, converting R/G/B triplets from 0 to 255 to an instance of the <solid> class, which can then be passed to set-gadget-interior:
: <color-model> ( model -- model )
[ [ 256 /f ] map 1 add <solid> ] <filter> ;

A word to create three slider gadgets, and a model holding the chosen R/G/B triple:
: <color-sliders> ( -- model gadget )
<color-slider> dup , control-model
<color-slider> dup , control-model
<color-slider> dup , control-model
3array <compose>
] { } make make-pile 1 over set-pack-fill ;

Finally, a word to put everything together; the color sliders, the color-changing preview gadget, and finally a label control which displays numeric R/G/B values:
: <color-picker> ( -- gadget )
{ [ <color-sliders> ] f f @top }
{ [ dup <color-model> <color-preview> ] f f @center }
{ [ [ unparse ] <filter> <label-control> ] f f @bottom }
} make-frame ;

And here is the module boilerplate:
PROVIDE: examples/color-picker ;

MAIN: examples/color-picker
<color-picker> "Color Picker" open-titled-window ;

A screenshot:

Factor 2 Enterprise Edition

... now with support for SOA, B2B, and outsourcing.

No, seriously, Daniel Ehrenberg has implemented an XML-RPC client and server in Factor; the code is in darcs, contrib/xml-rpc. Good stuff! After I release Factor 0.87 I'm going to get back to work on the web framework, and cook something up so that web apps automatically have an XML-RPC interface. I'd also like to put together a demonstration app that uses the XML-RPC client to make pastes at Lisppaste.

Thursday, November 23, 2006

OpenFirmware is now free software

OpenFirmware, the Forth-based monitor and bootloader used on Sun workstations and (all but the earliest) PowerPC Macs, is now free software. You can browse the source code.

Back from Montréal

I had a lot of fun presenting at the MSLUG, there was a great turnout. Everybody asked lots of questions and I think they all had a great time. It was well worth the two and a half hour bus ride there and back! Sébastien Pierre has a nice writeup, if your understand French. I don't but I was able to make sense of most of what he wrote :-)

The slides I used are available in the darcs repository. Make sure you have running the latest image, and then from the UI, do:
"examples/mslug-talk" run-module
Or alternatively, press C+m, and type mslug (or any other sequence until the first completion in the popup is the correct module) then press ENTER. Use the arrow keys to navigate the slides. However I don't think they will make much sense, unless you were there, since I said a lot more than what was written in the bullet points.

Friday, November 17, 2006

Genuine enterprise thought leadership

You just can't make this stuff up. Seen here:

"JavaScript growth is flat while pardoxically AJAX is taking off. In other words, we will take the AJAX and rich applications thank you very much, but please keep your tinker-toy language."

In other news, Web 2.0 is taking off, but HTML is so 90's!

Thursday, November 16, 2006

Most people who criticize Lisp don't understand it

Disclaimer: I don't program in Common Lisp these days and I don't claim to speak for the Lisp community at all, however I think I do know the Lisp language reasonably well.

Just looking the last few weeks of postings on reddit related to Lisp, people raise the following objections:
  • Lisp does not interface with the C-based world easily, unlike scripting languages -- this is clearly false, since most Lisp implementations have easy to use FFIs. To bind a C library to Ruby, you have to write C code which glues the library to the Ruby runtime system. With CFFI (which is portable across CL implementations), you just list the function prototypes in your Lisp file and you're ready to go. You can also pass structs, etc. Factor's C library interface works the same way.

  • Common Lisp was designed to perform well on Lisp machines only, and makes no concessions to ordinary computers. - this is partially true, but the CL standard did omit some features found on Lisp machines which would be difficult to implement efficiently on conventional hardware. However none of this can change the fact that Common Lisp implementations are an order of magnitude faster than scripting languages.

  • Look at all these parentheses! - of course if you don't need metaprogramming, macros, and code-as-data, may as well not use Lisp.

There are many other common misconceptions. The "Lisp only supports linked lists" and "Lisp is unsuitable for writing real-world programs" memes seem to have died down in the last few years, but more myths have taken their place. This leads me to believe that the main problem the Lisp community faces to ensure wider adoption (assuming wider adoption is what they want, which is not necessarily the case) is not a technical one, or an internal social problem as some suggest, but rather an education problem.

Most people simply don't know anything about Lisp, but feel qualified to discuss it and criticize it. Like I said, I'm not a member of the Lisp community (whatever that may be), but when advocating Lisp, Lispers should remember that a lot of things that they take for granted (sane language semantics, C library interface that works, metaprogramming, good performance from a high-level language) are completely foreign to most programmers! There's a communication gap.

And to all the programmers learning new languages, not just specifically Lisp, please hold off criticism until you know the language and have completed a significant project in it. If you did not get that far, don't criticize the language, and put it down to lack of time, lack of motivation or personal failing.

I can rag on Java all I want because I know it better than most Java advocates. But if you were first exposed to Lisp 2 weeks ago, then don't make yet another blog posting about parentheses, even if you work for Google. You're just wasting bandwidth.

Wednesday, November 15, 2006

Archaic Factor screenshots

Factor's UI has come a long way since the early days:

Some features were abandoned:

Doug Coleman's weblog

Doug Coleman now has a weblog. His first entry talks about compiling Factor on Windows from darcs. This involves setting up ssh, darcs, and msys.

Reminder: Factor talk in Montreal

I will be presenting Factor at the Montreal Scheme/Lisp User's Group next Wednesday, November 22nd.

I'm slowly putting together a presentation. Slides are defined using Factor help markup using a custom stylesheet, and everything is displayed by the Factor UI. So far I have 248 lines of code, most of which is the markup for slides. I'll put it in examples/mslug-talk when it is done.

Tuesday, November 14, 2006

Factor 0.86 released

Version 0.86 of the Factor programming language is now out.

Windows and Mac Intel packages were provided courtesy of Doug Coleman (thanks Doug!)


+ Core
- Improved memory management code leads to reduced memory consumption, less frequent garbage collections and fixes a few corner cases where Factor could run out of heap even if a GC would have freed enough memory to proceed
- Improved prettyprinter low lays out code in a more pleasing manner
- Windows native I/O has been sped up (Doug Coleman)

+ UI
- Double and triple clicks are now recognized, and can be used to select text in the editor gadget
- Windows now update while being resized on Windows

+ Stack effect inference
- The new infer. word should be called instead of infer, since it presents information in a more pleasing way
- Stack effect inference now infers variables read and written by a quotation to facilitate code comprehension and debugging

+ Module system
- The syntax for PROVIDE: has changed, consult the documentation
- Modules can now provide a main entry point, see run-module and MAIN:
- Modules can now provide documentation, see Documenting modules

+ Contributed libraries
- New contrib/cpuinfo (Doug Coleman)
- Updated contrib/match (Chris Double)
- Updated contrib/parser-combinators (Chris Double)
- Updated contrib/postgresql (Doug Coleman)
- Updated contrib/process (Doug Coleman)
- Updated contrib/sqlite (Doug Coleman)
- Updated contrib/textmate (Benjamin Pollack)
- Updated contrib/tuple-db (Doug Coleman)
- Updated contrib/vim (Doug Coleman)
- Updated contrib/xml (Daniel Ehrenberg)

Haar wavelet transform in Factor

Here is the Haar wavelet transform implemented in Factor. This is only the 1-dimensional version:
: averages ( seq -- seq )
[ first2 + 2 / ] map ;

: differences ( seq averages -- differences )
>r 0 <column> r> [ - ] 2map ;

: haar-step ( seq -- differences averages )
2 group dup averages [ differences ] keep ;

: haar ( seq -- seq )
dup length 1 <= [ haar-step haar swap append ] unless ;

Sunday, November 12, 2006

Some developer tools in the incubator

I have a small file of "work in progress" developer tools which I'm not releasing yet, which so far includes:
  • Calls in/out at the vocabulary level, rather than individual words
  • Finding words which read or write a given variable
  • Shortest path between two words
  • Searching for unloaded modules on disk

Some of these will eventually make it into the core. I also want to develop a graphical cross-referencing tool in the UI which combines various words one can invoke in the listener into one coherent interface.

Taming variables

I developed some new tools which perform static analysis of variable usage.

Factor's variables are dynamically scoped and very simple, there is a global namestack holding hashtables. Nesting a scope pushes a hashtable on the stack, variable lookups search up the namestack, and variable assignments store the new value in the topmost hashtable on the namestack.

The infer word has been replaced with an infer. word which gives variable usage information obtained by statically analyzing the code. Those in the know will recognize that this is a ripoff of InterLisp's MASTERSCOPE package. Here is an example:
[ 3 * x set ] infer.

* Stack effect:
( object -- )
* Writes free variables:

But if you wrap this in a with-scope, the assignment is lost as soon as the scope ends and there is no observable side effect:
  [ [ 3 * x set ] with-scope ] infer.
* Stack effect:
( object -- )

On the other hand, reading a variable from a nested scope might go up a scope if the variable is not defined, and so it can be observed from the outside:
  [ [ x get ] with-scope ] infer.
* Stack effect:
( -- object )
* Reads free variables:

But if you assign the variable in scope first, then we cannot observe the read:
  [ [ 5 x set [ x get ] with-scope ] with-scope ] infer.
* Stack effect:
( -- object )

Now, of course because this is built on stack effect inference, everything works if you have a word which indirectly calls get or set.

It even understands combinators, for example this one that applies 2 * to the value of x:
  [ x [ 2 * ] change ] infer.
* Stack effect:
( -- )
* Reads free variables:
* Writes free variables:

No special support for inferring change is defined, it just looks at the definition of change;
: change ( variable quot -- ) 
>r dup get r> rot slip set ; inline

Finally, it has special support for recording words which directly read and write globals:
  [ [ 5 x set-global ] with-scope ] infer.
* Stack effect:
( -- )
* Writes global variables:

Saturday, November 11, 2006

strongForth 1.1 released

strongForth is a statically-typed low-level Forth-like language. This release adds a floating point wordset. Right now it only runs under DOS (its actually a very old project), hopefully it will be ported to 32-bit Unix platforms at some point in the future.

Insane Cocoa bug

So it seems the frame method of NSWindow returns some client-side value, which is updated when the window receives events notifying it that it has been moved or resized. Which is all very well, but it seems that if you move the window around while it is not responding, then the frame method returns the old value! This is absurd.

It means that for example if you reload some files in the Factor UI listener, which then causes a 10-second recompile, during which you move the window, then mouse handling is completely screwed up after Factor starts responding again.

Does anybody know if there is a way to query the window system for the window's actual location?

Update: This only happens under certain circumstances. Very weird. I'll try to narrow it down.

Update 2: Actually it is pretty easy to reproduce. Start Factor and type 30000000 [ ] times (or anything else which takes several seconds to run). Then move the listener window around for a few seconds, then switch to another application. When Factor is done, switch back, and it will not respond to clicks because it will think the mouse location is different than what it actually is.

Friday, November 10, 2006

Factor 0.86 is ready

Factor 0.86 is now ready for release. As a few people on the mailing list have requested, I'm making this announcement before actually releasing Factor so that the code in darcs can get some extra testing. I'll be testing the X11 UI backend and AMD64 compiler backend tomorrow.

Thursday, November 09, 2006

You know you're too lazy to learn better tools when...

You try to justify the flaws in what you currently use on the basis that these flaws, while forcing you to do a lot more work than necessary and increasing turn-around time, encourage you to "design better".

Sad people

Some people are so blinded by Java zealotry that they choose to make a weblog dedicated to personal attacks on comp.lang.lisp posters.

Why Java zealotry you ask? I think a trait unique to Java programmers is to go antagonize other language communities, pretending to be interested in learning that language, when in fact they're just looking for excuses not to learn, and to criticize.

XML parser now supports namespaces

Daniel Ehrenberg has updated the contrib/xml module to support XML namespaces, with XML-RPC support coming up next. Dan used to be active in the Factor community six months or so ago, and now he's back, which is great. Unlike some open source projects, Factor contributors seem to stick around for a long time. When I was working on jEdit, a big problem was people throwing a few patches and plugins over the fence, then disappearing a week later...

Inline allocation refactoring

On PowerPC and x86 processors with SSE2, Factor supports floating point intrinsics, and the final boxing of a floating point value is entirely open-coded, so no GC check was done. Fixing this was the final stage of the Allocation refactoring.

At the end of a basic block, the compiler calls end-basic-block, which boxes any floating point values stored in registers, and moves them to the data stack. It also moves all other integer values and pointers to the stack too. Within a basic block, no stack writes are performed, and stack shuffling takes place at compile time in the form of register renaming. The fix for the inline allocation GC issue was to simply leave 1kb free at the top of the heap at all times, and if any float values were boxed, compile a call to a C function which performs a GC check in end-basic-block. This smells like the old issue that the entire GC refactoring was supposed to avoid (the GC on 75% heap consumption being inadequate) however very little memory is allocated by boxing floats, much less than 1Kb.

However there was one sticking point; in a handful of locations, end-basic-block would be called to generate machine code in a context where certain register values are not to be clobbered. For example, conditionals worked this way: first, the value at the top of the stack would be popped (which might not generate any code at all, if it is in a register), then the basic block would end so registers would sync with the in-memory stack, and then the conditional jump was performed based on the value of this register. This would no longer work, since end-basic-block could now generate a call into C, and thus the register holding the condition would be clobbered. The fix is not to end the basic block at a conditional jump, but instead "split" the virtual register representation of the stack at both branches. So for example if r3 held some known value before a conditional, the compiler could reuse this value in either branch without reading it back in.

Another place where the compiler called end-basic-block was when generating fixnum intrinsics which could overflow. These open-coded intrinsics are more efficient than calls into C, since in the common case of there not being an overflow, the code path re-uses registers rather than reading from the stack, and so on. For example, fixnum+ adds two integers, and if the result overflows, it allocates a bignum. Again, the problem was that it would read two values from the stack into registers (which might not have generated any machine code at all, if the value were already in registers), then it would end the basic block, then it would add them, and if an overflow occurred, it would call into C to allocate a bignum. This worked in the past, but now that the end of a basic block can call GC, the fix here is to open-code the bignum allocation code, and therefore avoid the need to end the basic block before fixnum+ is assembled in.

The other cases where end-basic-block were called remain untouched; namely, before jumping to another word definition, before returning from a word definition, and before an explicit alien call into C. In these cases, no values were held on in registers so a GC is safe. The removal of calls to end-basic-block from around conditionals and overflowing fixnum primitives means the compiler can keep more values in registers, generate less code to shuffle values between registers and memory, and box less floats. And of course, it also means that certain operations won't run out of memory even if plenty is available, simply because no GC check gets done.

I still have to fix the fixnum* intrinsic on both platforms; this results in an inline allocation of a 2-cell bignum, which is trickier than 1-cell bignum like fixnum+. I'm also hunting down a compiler bug which results in some words from contrib/crypto being miscompiled. Once these two items are done, I will release 0.86.

jEdit 4.3pre8 released

The jEdit developers have released version 4.3pre8.

Saturday, November 04, 2006

Article about the Cat programming language

Christopher Diggins has written an article about the Cat programming language which describes the implementation in some detail. Cat is a statically-typed stack language.

Memory allocation redesign pushed; 0.86 almost ready

The patches implementing the redesign I talked about a few days ago are now in the repository. There is one remaining instability, and that is inline float allocators do not check if a GC is necessary yet. So it is possible to run out of memory that way. I will fix this shortly. Next I will be updating documentation and fixing a few minor bugs, and then releasing 0.86!

Annotations instead of C code for JNI

A Java programmer has realized that JNI is a steaming pile of shit. He writes:

The more I think about it, the more I think something like the following would make sense:
public void bar();

Damn right it makes sense. There is no good reason one should have to program in C, and learn the internals of their language runtime, to call C libraries, even those involving structures, callbacks, and so on.
The Java OpenGL binding is 40,000 lines of Java code, including some hairy Java and C source generation, and a complicated build process (when I was doing my Java game, I never managed to get JoGL to compile, even after downloading half a dozen dependencies and playing with the build.xml. I always used the binary packages).
The Factor OpenGL binding is less than 2,000 lines of code consisting of a handful of source files. We also have bindings to FreeType, Cocoa, WinAPI, a USB library, and various other things, all implemented in Factor, without huge amounts of boilerplate or C code.

Friday, November 03, 2006

Programming languages

If you want a programming language implementation, you have to pick two out of three features:
  • Good performance
  • Expressive language
  • Popular

I know which two I care about. Assembly only satisfies the first one, Common Lisp satisfies the first two, Java satisfies the first and last, and of course Ruby has the last two but its performance sucks. No language I'm aware of has all three at this stage of human history.

Thursday, November 02, 2006

Memory allocation refactoring

Factor's garbage collector moves objects around in memory. This means that if there is not enough memory to satify an allocation request, the garbage collector has to be able to find all pointers to live objects and update them if necessary.

Until now, memory allocation in the Factor runtime did not check to see if the heap was full:
INLINE void allot_barrier(CELL address)
F_CARD *ptr = ADDR_TO_CARD(address);
F_CARD c = *ptr;
CELL b = card_base(c);
CELL a = (address & ADDR_CARD_MASK);
*ptr = (card_marked(c) | ((b < a) ? b : a));

INLINE void *allot_zone(F_ZONE *z, CELL a)
CELL h = z->here;
z->here = h + align8(a);

return (void*)h;

INLINE void *allot(CELL a)
return allot_zone(&nursery,a);

Instead every primitive which could at some stage allocate memory would simply first check if the heap was 75% full, and if so, call the GC. So the idea was that we call the GC early enough, and hope that the heap does not completely fill up until the next GC check. If the heap did fill up, a guard page would be hit, but all that we could do was kill Factor, since calling the GC from the signal handler is out of the question -- suspended stack frames could contain C local variables pointing to Factor objects.

Indeed, the GC check was always performed at points where no Factor object could be stored in a C local variable, so this made things very simple. The disadvantage is that you cannot use more than 75% of the heap, every primitive has to remember to call the GC check, and if some primitive enters with 74% of the heap full and suddenly allocates a large amount of memory, Factor would die with an 'out of memory' error even if after a GC there would be enough. Also this approach made it hard to implement a growable data heap -- at what point do you grow? If the GC check did not free enough memory, you could grow the heap there, but if the GC check did free enough memory but the primitive subsequently allocates a huge array, you'd run out of memory instead of growing the heap. So the manual GC check, and restriction on performing a GC when C variables contain roots, had to go.

So I decided to redo things, and instead potentially perform GC from the lowest-level allocation functions. The allot() function now looks like this:
INLINE void maybe_gc(CELL a)
if( + a > nursery.limit)

INLINE void *allot(CELL a)
return allot_zone(&nursery,a);

This means that before calling a runtime function which can potentially allocate memory, the C code has to register any local variables which can contain objects as GC roots. For example, here is the low-level function implementing the <array> primitive:
F_ARRAY *allot_array(CELL type, F_FIXNUM capacity, CELL fill)
int i;
F_ARRAY* array = allot_array_internal(type, capacity);
for(i = 0; i > capacity; i++)
return array;

The REGISTER_ROOT macro adds the object to a stack, internal to the runtime, and the UNREGISTER_ROOT macro removes the topmost object and stores it back in the given variable. The GC scans this stack for roots just like it does with the data stack, etc. This stack is not accessible from the Factor side, and if an error is thrown between calls to REGISTER_ROOT and UNREGISTER_ROOT, its contents are reset.

There are only a few cases where GC roots must be registered this way, and most of them are in the bignum code, especially after I simplified a lot of runtime C code and moved a number of primitives from C code into the Factor library. However forgetting to register a root can have disastrous consequences, namely random, hard to reproduce crashes (after all, 99% of the time a given allocation call does not GC). I developed two testing tool to help catch this:
  • After a GC, overwriting the former semispace with junk data so that reading from it immediately causes bogus behavior, rather than subtle corruption
  • Modifying the allocation function in the runtime to call the GC every time, or every 100 allocations

These two debugging features slow things down considerably, but have helped me find a number of problems and I'm confident I'll get the root registration 100% correct this way. The first item will be available to users as the -S command line switch; it was actually first suggested by Doug Coleman in the context of cryptography algorithms, where you don't want your private key to hang around in memory longer than needed in the event of your swap file being stolen (although this seems way too paranoid for me). Obviously the second feature will be removed before release and I don't intend to add a switch to GC on every allocation after this refactoring is done.

I've almost finished getting everything working again. Compiled alien calls and callbacks need to be updated now, since the segment of code generated by the compiler to convert Factor objects to C values (and vice versa) now needs to accomodate the fact that objects may be moved around in the middle of this process. Also in one place, the runtime's allot() function is bypassed, and allocation of a boxed float is done manually in a compiled intrinsic; this ensures that this can be done in a "basic block" without clobbering register contents, improving the efficiency of compiled floating point code.

Interestingly enough, this refactoring has shaved several hundreds of lines of C code from the runtime. It gave me a chance to review the C code and clean it up and remove some redundancies.

Once everything is stable and I fix some minor bugs on my to do list, I will release 0.86. I will not implement a growable data heap until 0.87, however with this refactoring it has become much easier to do.

Tuesday, October 31, 2006

Ruby falls short

An intesting post which mirrors my feelings about Ruby exactly: the syntax is too complex, and a lot of stuff that should be library is hard-wired into the language. And Ruby blocks are inconsistent and limited. I still think Ruby is decent, but if it had better performance by an order of magnitude, and cleaner syntax and semantics, it would be killer.


If you want to try the latest Factor development code, but are using a platform supported by Factor but without a ported Haskell environment (for example, Solaris/x86 -- this may be the only such platform) then you can try cl-darcs, implemented in Commmon Lisp.

Saturday, October 28, 2006

Friday, October 27, 2006

Cat programming language 0.9

Christopher Diggins has released Cat 0.9. This version adds a new unit test suite intended to shake out bugs from the static type inferencer, and also now works on Mono.

Monday, October 23, 2006

Factor talk at MSLUG, November 22nd

I will be presenting about Factor at MSLUG, the Montreal Scheme/Lisp Users Group, on Wednesday November 22nd.

Saturday, October 21, 2006

Module documentation

I have changed the syntax for module definitions; here is the one for concurrency to demonstrate the new syntax. Note that you can now specify a main help article for your module, and I encourage everybody to document their modules now:
! Copyright (C) 2005 Chris Double. All Rights Reserved.
! See for BSD license.
REQUIRES: contrib/dlists contrib/serialize contrib/match ;

PROVIDE: contrib/concurrency
{ +files+ {
} }
{ +tests+ {
} }
{ +help+ { "concurrency" "concurrency" } } ;

What Factor is missing now is a nice graphical tool to browse modules, load them, run their main entry points, jump to module documentation, and jump to source files making up the module. This will come in 0.86.

Double-click support in UI

Me and Doug Coleman implemented double click recognition in the UI. While Windows and Cocoa have native APIs to get multiple click counts from events, Windows only reports double clicks, not triple clicks, and X11 doesn't have this at all. So this is done in a purely cross-platform manner, except the system default double click timeout is used on Windows.

Basically before a button-down gesture is sent, the global hand-click# variable is incremented if necessary. Here is an example from the editor gadget:
: editor-mouse-down ( editor -- )
hand-click# get {
[ ]
[ dup position-caret ]
[ dup T{ word-elt } select-elt ]
[ dup T{ one-line-elt } select-elt ]
} ?nth call drop ;

The click count is used to index an array of quotations. Nice and neat. Then we add this command to the "caret" command table:
editor "caret" {
{ "Position caret" T{ button-down } [ editor-mouse-down ] }
{ "Previous character" T{ key-down f f "LEFT" } [ T{ char-elt } editor-prev ] }
{ "Next character" T{ key-down f f "RIGHT" } [ T{ char-elt } editor-next ] }
... more commands ...
} define-commands

Thursday, October 19, 2006

Module main entry points

Modules can now define an entry point. Here is the updated contrib/tetris/load.factor:
! Copyright (C) 2006 Alex Chapman
! See for BSD license.

REQUIRES: contrib/lazy-lists ;

PROVIDE: contrib/tetris {
"tetris-colours.factor" "tetromino.factor" "tetris-piece.factor"
"tetris-board.factor" "tetris.factor" "tetris-gl.factor"
} {
"test/tetris-piece.factor" "test/tetris-board.factor" "test/tetris.factor"
} ;

USE: tetris-gadget

MAIN: contrib/tetris tetris-window ;

This means with the latest darcs code (not 0.85) you can now run Tetris simply by starting Factor and entering the following:
"contrib/tetris" run-module
The module is loaded first (unless it has already been loaded).

Factor 0.85 released

Factor 0.85 is now available. Check out the change log. Thanks to Doug Coleman for the Windows package and Alex Chapman for the Mac OS X Intel package.

Chris Double found a silly regression: the contrib/furnace module does not load. Now this module is not very useful yet, but nevertheless, I did ensure that all modules load before releasing 0.85, then went ahead and made some changes to Furnace without testing if they load in a fresh image. How foolish. Anyway if you feel like taking a look at the new incomplete web framework, you should use the latest code from the Darcs repository instead.

Embedding a stack language in Haskell with static type checking

Kevin Reid has been playing around with this.

Factor tetris

Alex Chapman (author of the OpenGL binding and gap buffer data structure) has submitted a tetris game written in Factor to contrib. To run it, you probably need Factor from darcs, and then:
"contrib/tetris" require
USE: tetris

It looks like this:

The implementation uses an infinite lazy list to draw the next piece from. That is just so cool.

Tuesday, October 17, 2006

You know you're a Forth (or Factor) geek when... type .s in bash to get a file listing. I do this all the time, and a few other people on IRC mentioned the same thing. It's funny.

Parser combinators example

Chris Double has put together a lovely example of parsing s-expressions using his parser combinators library.

Parser combinators are one of the Haskell community's nicest contributions to computer science at large.

Monday, October 16, 2006

Working on a Factor web framework

Factor has some components of a web framework already:
  • contrib/httpd - basic request processing
  • contrib/embedded - reads a file, evaluates Factor code between <% and %>, and dumps the result
  • contrib/xml

I'm working on a web framework named Furnace which builds on these components and adds features to help construct typical CRUD web applications. It is located in contrib/furnace and is almost non-existent at this point, but my goals are to have it do the following:
  • Action processing, forwarding
  • Form validation
  • HTML component management
  • Templating

I still haven't decided how templating will work in the end:
  • Right now I'm using contrib/embedded because it is simple and the code is there
  • Pure-HTML templates marked up with 'id' attributes (like Wicket)
  • XML templates, mixing HTML with custom Factor tags but no Factor code

The last two approaches have potential to be cleaner, because right now the contrib/embedded templates feel boilerplate-ish:
<tr><th>Paste by:</th><td><% "author" get write %></td></tr>
<tr><th>Channel:</th><td><% "channel" get write %></td></tr>
<tr><th>Created:</th><td><% "date" get write %></td></tr>

I think some custom tag, or "id" attributes on the <td> tags would be prettier.

So far the example web app I've been building in parallel with the framework is a pastebin. Once the framework has enough features to make the pastebin support all the necessary features (navigation, validation, etc) with pleasing and simple code, I want to build a wiki and blog, and then eat my own dog food, using Factor to host this web log. I expect these CRUD-style web apps to be trivial to build once the framework is done.

Note that I haven't said anything about persistence yet. For a pastebin, a global variable holding a list of tuples is enough, but obviously for a wiki or weblog this is horribly insufficient. We have code in contrib for interfacing with PostgreSQL and SQLite, Chris Double has a simple O/R mapper that maps Factor tuples to SQLite table rows, and Doug Coleman has expressed interest in extending this to a more general O/R mapper. Pending some kind of miracle whereby an object database written in Factor suddenly appears, the preferred method of persistence in Furnace will be an SQL database together with an O/R mapper.

I plan on integrating Chris Double's work on continuation-based control flow, however the standard "REST" style will be preferred in most cases. Chris Double has also been working on a Factor library for writing web applications using the Comet style. I know very little about JavaScript, but integrating this with Furnace, along with other client-side features such as validation and completion would be nice.

Crash while running unit tests caused by unit tests

I spent an hour last night on a weird crashing bug that manifested itself on Windows: if you ran test-modules from the UI a few times, it would eventually crash Factor in unpredictable ways. I narrowed this down to the test for I/O buffers, which are low-level character queues, allocated in malloc() space, for when the Factor implementation needs to call native I/O functions with a pointer which will not be moved by the GC.

Now one of the unit tests created two buffers from a pair of strings, appended one to the other, then converted the result to a string, and compared the result against the expected string. Unfortunately, the word to append them was written rather carelessly. It used memcpy to copy the contents of one buffer to another, without checking bounds and growing the buffer first. The result? Random crashes, yet amazingly they never appeared on Linux or Mac OS X.

Lessons learned? Well, I knew all of these already, but this incident underscores them:
  • Pointer arithmetic is dangerous. Fortunately I don't believe in "hair shirt" programming, so Factor makes minimal use of unsafe constructs, only resorting to direct memory manipulation when calling C code which wants us to deal with C data. User programs never have to step outside the memory safe sandbox.
  • When unit tests fail, it doesn't necessarily imply the code being tested is buggy. The unit tests could be broken too! The reason it took me a long time to track down the bug is simply that I did not think to check the unit tests themselves, since after all, the buffers test passed most of the time.
  • Testing on multiple platforms helps weed out bugs.

Saturday, October 14, 2006

Common Lisp IDE built on Eclipse

I know some people don't like the dated user interface of Emacs, so Cusp might be interesting. This is a plugin for Eclipse which uses the same backend (Swank) as SLIME for Emacs and provides similar features, however at this point it looks a lot less mature than SLIME.

Using the X11 UI on Mac OS X

One of the mailing list participants noted that the Cocoa UI does not run on Mac OS X 10.3. Getting this working is not a high priority, since after all 10.4 has been out for quite some time and I'd rather focus on new features and bug fixes than backwards compatibility.

However I did make it easy to build Factor so that the X11 UI may be used. First you compile with an extra flag passed to make:
make macosx-ppc X11=1
Then you bootstrap with two special switches:
./f boot.image.ppc -no-cocoa -x11
That is all. You do not need to build a as with the Cocoa UI, you can just set $DISPLAY and run ./f.

Friday, October 13, 2006

Interface builder, Cocoa, and interactive development

I went ahead and scrapped the Factor code which defines Factor's menu bar on Mac OS X in favor of using Interface Builder to set the menu bar up. I discovered that Cocoa is fully capable of reloading nib files at runtime.

Factor can subclass Objective C classes. Until now however, the definition would only be entered if the class did not already exist, so if you changed some method definitions and reloaded the source file which contains them, the methods would not be visible to the Objective C runtime until you restarted Factor. I fixed this so that method definitions are replaced on the fly when a new Objective C class is defined in Factor, even if the class already exists in the runtime.

Together with the nib reloading, I can add a menu item together with the corresponding action, and observe the result instantly.

Factor is now a better Objective C (and XCode) than Objective C (and XCode).

Monday, October 09, 2006

Three cheers for C syntax

I found this lovely snippet in a C application which shall remain nameless:
void interpreter( void ) { for(;;) (*(prim *)(intptr_t)*(cell *)(intptr_t)(*++lc))(); }

So, which pairs of brackets correspond to casts, grouped expressions, and function invocations? What is the operator precedence between casts, dereferencing, and pre-increment? I really don't know.

Here is the PowerPC disassembly:
0x00002488 :     mflr    r0
0x0000248c : bcl- 20,4*cr7+so,0x2490 interpreter+8
0x00002490 : stmw r30,-8(r1)
0x00002494 : mflr r31
0x00002498 : stw r0,8(r1)
0x0000249c : stwu r1,-80(r1)
0x000024a0 : addis r2,r31,0
0x000024a4 : lwz r30,23748(r2)
0x000024a8 : lwz r2,0(r30)
0x000024ac : addi r2,r2,8
0x000024b0 : stw r2,0(r30)
0x000024b4 : lwz r9,4(r2)
0x000024b8 : lwz r0,4(r9)
0x000024bc : mr r12,r0
0x000024c0 : mtctr r0
0x000024c4 : bctrl
0x000024c8 : b 0x24a8 interpreter+32

Now if I wanted to figure things out, I could spend 5 minutes going through every instruction and keep track of registers with a paper and pen, but at least its possible. Not so with the C code.

Like watching a puss-filled boil explode in slow motion

Reading this (which is only the latest in a long saga of closure debates on Java sites) makes me glad I don't program in Java anymore. I personally think it is too late to add closures to Java, simply because the existing libraries are not designed for it, but the responses posted in discussions such as these are so comical that I want to see closures added just so that these guys get pissed off.

Come on now enterprise monkeys, make up your mind: are closures simply for newbie "script kiddies" who are not man enough to use a 400$ keyboard macro IDE? Or are they so complex that newbies cannot learn them (nevermind the extremely convoluted anon inner class syntax). Or are they easy to learn but simply unsuitable for serious Enterprise(tm) team development?

This is sort of like watching assembly freaks debate if parameters should be passed in registers or on the stack... or if subroutine calls are any good at all. Those of us in the know have moved on long ago.

Enchilada programming language

The Enchilada language now has a home page.

Saturday, October 07, 2006

Swapping two variable values

I thought this was kind of nifty when I first thought of it:
: swap-vars [ [ get ] 2apply ] 2keep [ set ] 2apply ;
For example:
5 "x" set
7 "y" set
"x" "y" swap-vars
"x" get "y" get 2array .
==> { 7 5 }

Completion and right-click popups in the UI

I've been pretty busy with my mathematics research lately as I've managed to prove some pretty nice results. However things have not all been quiet on the Factor front.

Documentation for the below features is forthcoming, but for now the below is as good as it gets. You can press TAB to get a keyboard-navigatable, automatically updating, buzzword compliant completion popup:

If you press Up/Down to select a word followed by Enter, the word name is inserted in the input area. Instead you can right-click on the word to see a menu of other commands.
Now if you press C+e, you get the same popup, but instead it completes loaded source file names; again pressing Enter opens the file in your editor (if you've previously loaded a module such as contrib/jedit or contrib/emacs) and a right-click offers another option, running the file -- note that this works for loaded contrib modules too, not just core library files, and in the future there will be a more extensive integrated module browsing tool than this:

Finally, you can press C+u to get a vocabulary completer, where the default operation is to USE: the vocabulary and from the popup menu you can also browse it or IN: it -- this screenshot shows the popup menu (yes its ugly):

Some of you will remember that the Factor UI used to have popup menus, then they disappeared, then in the recent darcs release I started using the three mouse buttons to invoke different object operations, and now menus are back again. I have a hard time making my mind up on certain issues, but my code does tend to slowly converge on a preferred approach. In either case, I think its better than picking an approach and religiously sticking with it, despite observations that it may be better to do things differently.

Tuesday, October 03, 2006


I didn't realise bigForth had a good compiler until I saw this. It does a lot better in this benchmark that gForth did.

Sunday, October 01, 2006

Single stepping

With a stack language, single stepping has a certain intuitive appeal, and there is no ambiguity when it comes to what will happen next. Single stepping through C, on the other hand, is an exercise in frustration. Suppose you have a piece of code like:
If I want to step over the calls to bar() and baz(), but step into foo(), it appears that I'm out of luck, at least in gdb. You can set a breakpoint on foo(), or just step into the nested function calls and then step out, but it wastes time and breaks concentration.
I like having a programmer-visible stack. It seems to shorten the 'round trip time' between my brain and the computer.

Saturday, September 30, 2006

Cat programming language 0.7.1

Christopher Diggins has released Cat 0.7.1. Cat is a statically-typed stack language implemented in C#. This release adds working type inference.

Annoying CS PhD students

Every few weeks, I receive an e-mail much like the following:
I came across the jEdit project at

I am a doctoral student at the Department of Computer Science, Florida
State University. As a part of my dissertation, I have developed a set
of metrics and a mechanism to track the effects of changing
requirements on software design - we already have several
publications on these ideas. We are looking to apply the mechanism as
a case study on your project. In addition to the material available at
SourceForge could you provide the requirement and design documentation
to help us understand the system better ?

If this was a one-time thing, it wouldn't bother me at all. But it seems that large numbers of PhD students simply spam the entire SourceForge web site with queries much like the following:
  • Can I have design documentation?
  • Does your team use XP, Agile, SOA methodology?
  • Can you please fill out this survey?
  • etc

I will not write your thesis for you. Tell your supervisor that spamming open source projects is not the right way to do research. If you want design documentation, read the fucking source code, ask questions about how it works if you need to, but don't expect me to do your work for you.

Or at least, don't tell me that you're wasting 4 years of your life doing a PhD in computer science.

Thursday, September 28, 2006

Code GC works

Mark and sweep GC is not very hard. The only thing that remains to be done is to automatically invoke code GC when the code heap is full. Presently there is a code-gc word one must call.

Sunday, September 24, 2006

Code GC

I've started work on a garbage collector for the compiled code heap. A code block is eligible for reclamation when the word owning the code block is no longer referenced in the data heap, and when no other compiled code blocks reference the code block. So if you redefine a compiled word, or forget a word, the old compiled block will be reclaimed.

Code GC will use the mark and sweep algorithm, without moving the code blocks around, in contrast to the generational copying GC used for data. The reason code cannot be moved around reliably is callbacks -- C code may hold on to pointers to Factor code. The heap will be compacted when the image is saved.

Of course if a callback refers to a word which has been GC'd, your program will crash if it attempts to invoke the callback. The solution is "don't do it, then" -- if a callback is actively held on to by C code, the consequences of forgeting or redefining it are undefined.

Friday, September 22, 2006

JNI bridge

Chris Double is working on a JNI bridge for Factor. So far it only provides wrappers for JNI library functions, but a high-level interface will follow. This means Factor can now interface with three other languages directly:
  • C
  • Objective C
  • Java

Wednesday, September 20, 2006

Long computation

It took Factor 10 and a half hours to row reduce a matrix with 24 million entries, using exact rational math. The algorithm I use pretty naive, see this file.

Monday, September 18, 2006

Cat programming language 0.6

Statically-typed stack-based languages are few and far in between. In fact the only two I know of are:
  • StrongForth - based on Forth
  • Cat - based on Joy, and not really fully statically typed yet

A new version of Cat was released recently. It would be interesting to see these concepts ported to Factor in the form of a soft type checker, along the lines of MZScheme's MzFlow. Factor's stack effect checker is already most of the way there, and non-core libraries in contrib can extend it in various ways.

Saturday, September 16, 2006

Adding a new compiler optimization

I noticed that after inlining, the following type of code was coming up in a few words:
dup array? [ dup array? [ ... ] [ ... ] if ] [ ... ] if
That is, a value being tested against a type twice. Implementing a compiler optimization to deal with this was easy:
: known-type? ( node -- ? )
0 node-class# types length 1 number= ;

: fold-known-type ( node -- node )
dup 0 node-class# types first 1array inline-literals ;

\ type [
{ [ dup known-type? ] [ fold-known-type ] }
] define-optimizers

Faster I/O performance on Unix

The time to read a 606Mb file, 64 bytes at a time using the read word, went down from 458 seconds to 45 seconds after I made a few improvements to the Unix I/O code.

I've been doing a bit of work on I/O performance ever since the Factor I/O performance sucks entry. I expect another big speedup to arrive when I implement faster generic method dispatch, and compiled continuations. Hopefully by then Factor's I/O performance will be competitive with other languages.

Wednesday, September 13, 2006

Warning: User mode Linux mmap() is broken

On my linode, this program hangs instead of dying with a segfault:
#include <sys/mman.h>
#include <stdlib.h>

int main()
int pagesize = getpagesize();

int *array = mmap((void*)0,pagesize,



This is a serious issue. It causes Factor to hang on stack underflow if the underflow was caused by a memory read (eg, dup with an empty stack).

Of course I don't use stack underflow errors for control flow, but it makes debugging on the server rather annoying. Perhaps its time to find another hosting provider.

Tuesday, September 12, 2006

Patterns in programming languages

Another critique of "design patterns".
Patterns are signs of weakness in programming languages.

When we identify and document one, that should not be the end of the story. Rather, we should have the long-term goal of trying to understand how to improve the language so that the pattern becomes invisible or unnecessary.

Monday, September 11, 2006

Strongtalk is now open source

Strongtalk is a very fast Smalltalk implementation with an interactive environment and optional static typing. It only ran on Windows. Sun acquired the company in the 90's, used the technology to implement the Java HotSpot JIT and killed Strongtalk. But now it is open source and hopefully can be maintained again, or at least inspire other language implementations.

Sunday, September 10, 2006

Visualizing stack code

I implemented a nifty hack today:

It still looks ugly (both in presentation and code) and needs some tweaks.

Friday, September 08, 2006

TextMate integration

Benjamin Pollack created a TextMate bundle together with a Factor editor hook for TextMate. His work is in contrib/textmate now.

Wednesday, September 06, 2006

Better development workflow

Previously when you made changes to source files you were working on, you had to run-file them all by hand. There was also the reload word, which saved some keystrokes because it would reload the source file containing a definition. But now there's a new way which beats both.

Factor now tracks the modification time of every source file in every loaded module. So you can change source files in a loaded module to your heart's content, and then invoke reload-modules in the listener, or even better, just press F7 in the UI. This reloads all changed source files, in the correct order.

Tuesday, September 05, 2006

Serializable space invaders

Chris Double has got the serialization code to the point where it can be used to transport UI gadgets between Factor instances. He managed to serialize a running space invaders instance, and another participant on IRC got it running on their machine. Cool stuff!

As for me, I've started working on 0.85 with a few mundane cleanups and bug fixes. Nothing I do is cool anymore :)

On learning

Mike Vanier writes:
As somebody who loves to learn new programming languages and paradigms, I hate to admit this, but one of the biggest reasons bad languages persist is that most people hate learning new programming languages. They would rather stick to a shitty language that they more-or-less understand than take the one month or so to become familiar with a new language. This is an example of a very, very general phenomenon both in computing and elsewhere, which is that nobody ever wants to learn anything new.

Monday, September 04, 2006

Interesting weblog

A blogger calling himself "sigfpe" writes a fascinating weblog about mathematics and Haskell. I like his implementation of the Clifford algebras.

Sunday, September 03, 2006

Factor 0.84

Factor 0.84 is now available. Check out the change log; it is rather extensive.

Is American high school education really that bad?

Some of the responses here made me chuckle:

  • What is the number less than one closest in value to one?
  • I haven't actually read the proof, but I'm not convinced by it. 0.999... is definitely smaller than 1, or it would be called 1.

    I believe it is just common sense.

Illiteracy might no longer be a problem in the developed world, but innumeracy is rampant. :-)

Friday, September 01, 2006

Recent UI changes

I implemented a CLIM-like commands framework. There is now a keyboard help window (press F1) and a generic "operations" framework that allows commands for working with words to be applied to an input gadget, for example, which causes the word to be extracted from caret position. This is conceptually similar to CLIM presentation translators.

Also, the UI now builds all tools into a single workspace window. Multiple workspaces can be opened.

Here is a rather large screenshot.

Factor is now three years old

That's right, and if you read the blog post from a year ago, you'll see Factor has made a lot of progress.

Just for kicks I downloaded Factor 0.77. It took 34 minutes to bootstrap on my x86 machine. Current Factor releases bootstrap in 2 minutes 30 seconds on that machine. Not only has the performance of the compiler improved drastically, but the amount of optimizations it does -- and not to mention the volume of code being compiled -- has increased too.

A quick overview of just some of the major improvements in the last year:
  • The UI has improved a lot: OpenGL rendering, multi-window support, browser, graphical single stepper, etc
  • Hypertext online help, full text search
  • AMD64 port
  • Intel Mac port
  • Objective C library interface
  • Callbacks from C to Factor
  • Restartable exceptions
  • Formal stack effect declarations

In my Factor is two years old post, I gave some line number counts for Factor at the time:
  • Factor runtime, written in C: 7326 lines
  • Factor library, written in Factor: 17591 lines
  • Unit test suite, written in Factor: 4160 lines
  • Contributed code, written in Factor: 6598 lines

Here are the stats as of today:
  • Documentation, written in a Factor DSL: 94347 words
  • Factor runtime, written in C: 8261 lines
  • Factor library, written in Factor: 29555 lines
  • Unit test suite, written in Factor: 4772 lines
  • Contributed code, written in Factor: 24737 lines

Good to see the contributed section growing fastest of all. I hope the core library doesn't get too large, and that a year from now I can look back and say that Factor has again made a lot of progress.

Thursday, August 31, 2006

Distributed concurrency in Factor

Chris Double writes about distributed concurrency in Factor. This is built on top of his existing Erlang-like concurrency library, and uses serialization to pass messages between nodes. Cool stuff.

Wednesday, August 30, 2006

Scheme-based real time music programming environment for OS X

Impromptu looks interesting. I just wonder why it wasn't implemented in Scheme, since they already use it for scripting.

Object serialization

Long-time Factor contributor Chris Double has dusted off an incomplete serialization library and polished it up. You can find the code in contrib/serialize/. Now Factor can serialize objects to binary and back again, preserving object identity, shared structure and circularity. He mentioned he will be using this to extend the concurrency library to support distributed message passing. That is just too cool.

Monday, August 28, 2006

Friday, August 25, 2006

Generic editor hook

For a while now Factor had a jedit word in the core which would jump to a word definition in a running jEdit instance. Well since then, emacs and vim integration has appeared in contrib/, so I thought it is time to make this hook generic. Now there is a new edit word which calls a quotation stored in the edit-hook global variable with a file name and line number on the stack. The jEdit code has been moved to contrib/jedit/. Loading either the jedit, vim or emacs module sets the edit hook. There is also a new feature: the :edit word starts your editor at the location of the most recent parser error.

Thursday, August 24, 2006

New commands in the UI listener

Soon, there will be a CLIM-like unified command framework and keyboard help. For now, I'll just have to write a blog entry about the new keyboard commands:
Stack effect of entered quotationC+i
Single-step entered quotationC+w
Completions for word at caret (used vocabs)TAB
Completions for word at caret (all vocabs)A+a
Reload word at caretA+r
jEdit word at caretA+j
See word at caretA+s
See word at caretA+s
Use first vocab containing word name at caretA+u

The single stepper has some shortcuts too:
Step ini
Step outo
Step backb

"Step back" feature in single stepper

The new graphical single stepper in Factor 0.84 can now travel back in time and "step back" through a quotation. This only restores the stack contents, it does not undo side effects. Still, even though the time travel is far from perfect, it is still useful, especially for beginners learning the language and figuring out how stuff works. They can step over some words, then rewind and step in to look at things in more detail.

Friday, August 18, 2006

Even faster bootstrap

After the recent set of changes, Factor 0.84 bootstraps in 2 minutes 9 seconds on my Power Mac G5. This is an improvement of 21 seconds over Factor 0.83.

Wednesday, August 16, 2006

Better emacs integration

Eduardo Cavazos submitted a patch which adds a contrib/emacs.factor module with an emacs word, so you can jump to definitions in a running emacs instance:
\ reverse emacs
{ editor draw-gadget* } emacs

Eduardo also improved the factor.el emacs mode, adding keyword highlighting.

If the emacs integration keeps improving and gains SLIME-like features, I'll probably switch over. I'm getting sick of maintaining the jEdit Factor plugin -- the codebase is a mess (it descends from JFactor) and it feels like a waste of time since eventually Factor will have a structure editor in the image anyway.

More about stack effects

The see word shows the stack effect now. Also -- this patch isn't pushed yet -- but HELP: syntax is now different. Before when documenting words, you would write:
HELP: my-word "( a b -- c )" { $values { "a" "..." } { "b" "..." } } { $description "Foo" }
Now, the new style is either:
HELP: my-word { $values ...
HELP: my-word ( a b -- c ) { $values ...
That is, the stack effect comment is no longer given as a string, but written out literally, and it is optional. If no stack effect comment is specified in the help, the stack effect declared for the colon definition is used.

One thing about stack effect declarations I forgot to mention in my previous post is that if a word unconditionally throws an error, you need to declare this for the word to compile:
: foo-error ( -- * ) "Foo!!!" throw ;
If the output values list consists of a single *, then in fact the word can be compiled in a context where any number of outputs are expected; since it throws an error and execution does not continue past throw, the stack height is irrelevant by the time control flow reaches the word.

You cannot lie to the compiler, at least in a way which breaks stack effect inference and the optimizer. If you specify an incorrect stack effect for a recursive word, inference will fail with an unbalanced branches error. For non-recursive words, a stack effect check is performed at the end to ensure the inferred effect matches the declaration.

What you can do is declare an effect which is too deep:
: foobar ( a b c d -- a b c ) + ;
This word claims to touch 4 stack elements but it only touches 2. However the change in stack height for both the declared effect and the inferred effect is the same, and the word does not touch any values it does not declare, so the compiler lets this one slide. All it does is reduce potential for optimization in places where the word is called, since certain values will be assumed to have changed when in fact they have not.

Factor is still evolving quickly. However it feels like the core of the language is slowly coming together in a way I can be truly happy with, and this was true even before the big cons cell removal/first class quotations change in 0.83. I've mostly been concentrating on the compiler and UI lately and this is where most of the work remains.