Factor: a practical stack language: April 2007

Wednesday, April 25, 2007

Is reflective heap access even useful?

Charles Nutter is writing a Ruby interpreter in Java. He views Ruby's ObjectSpace feature as superfluous, because few libraries use it:

ObjectSpace is Ruby's view into the garbage-collected heap. You can use it to iterate over all objects of a particular type, attach finalizers to any object, look an object up by its object ID, and so on. In Ruby, it's a pretty low-cost heap-walker, able to dig up objects matching particular criteria for you on a whim. It sounds like it might be pretty useful, but it's used by very few libraries...and most of those uses can be implemented in other (potentially more efficient) ways.

Well of course few libraries use this feature -- it is intended for developers. Charles goes on to list a bunch of other reasons why he views ObjectSpace as undesirable; it can be dangerous when called from library code due to thread safety, etc. However, this misses the whole point. Features such as ObjectSpace are for developers to use interactively, not for libraries.

Factor has similar capabilities, and I use them regularly -- not every day, but perhaps once a week, at least. Here is just one example.

Yesterday I was debugging the compiler's interval inference code. I noticed that compiling some words, such as nth, allocated absurd amounts of memory (~30mb) in the optimizer stage. To see where the memory was going, I used the heap-stats. word to save a heap allocation breakdown snapshot before and after calling optimize on a dataflow IR graph which triggered the problem. Here is the allocation breakdown before optimization:

Class             Bytes   Instances
#>r               80      5
#call             1552    97
#dispatch         16      1
#if               512     32
#merge            528     33
#push             1568    98
#r>               80      5
#return           320     20
#shuffle          2208    138
#terminate        320     20
#values           720     45
alien             32      2
array             1996704 47446
article           9600    300
bignum            38088   1699
bit-array         8       1
button-down       168     7
button-up         120     5
byte-array        8       1
c-stream          48      2
c-type            896     16
char-elt          16      1
clipboard         48      2
command-map       528     22
complex           32      2
continuation      280     7
copy-action       16      1
cut-action        16      1
delete-action     16      1
doc-elt           16      1
drag              24      1
drag-timer        16      1
duplex-stream     32      1
edit-slot         16      1
effect            91904   2872
entry             24      1
float             3568    223
float-regs        72      3
gain-focus        16      1
gradient          144     6
hashtable         221264  13829
inferred-vars     32      1
int-regs          16      1
interval          240     10
key-down          1992    83
line-elt          16      1
line-reader       24      1
long-long-type    32      2
lose-focus        16      1
method            22152   923
module            768     16
motion            16      1
mouse-enter       16      1
mouse-leave       16      1
mouse-scroll      16      1
no-method         24      1
node              31616   494
one-line-elt      16      1
one-word-elt      16      1
operation         2200    55
paste-action      16      1
pathname          24      1
plain-writer      16      1
queue             24      1
quotation         510976  21677
ratio             320     20
sbuf              16      1
select-all-action 16      1
slot-spec         14600   365
source-file       15880   397
stack-params      16      1
string            1242848 16062
struct-type       72      3
temp-reg          16      1
tombstone         32      2
update-object     16      1
update-slot       16      1
value             3136    98
vector            10304   644
vocab-link        24      1
word              275880  6897
word-elt          16      1
wrapper           21280   2660

And after:

#>r               160      10
#call             1792     112
#declare          144      9
#dispatch         16       1
#if               512      32
#merge            528      33
#push             800      50
#r>               160      10
#return           320      20
#shuffle          1568     98
#terminate        320      20
#values           720      45
alien             32       2
array             2033576  48896
article           9600     300
bignum            35829360 1703
bit-array         8        1
button-down       168      7
button-up         120      5
byte-array        2584     129
c-stream          48       2
c-type            896      16
char-elt          16       1
clipboard         48       2
command-map       528      22
complex           32       2
condition         24       1
continuation      360      9
copy-action       16       1
cut-action        16       1
delete-action     16       1
doc-elt           16       1
drag              24       1
drag-timer        16       1
duplex-stream     32       1
edit-slot         16       1
effect            91904    2872
entry             24       1
float             3584     224
float-regs        72       3
gain-focus        16       1
gradient          144      6
hashtable         231248   14453
int-regs          16       1
interval          1224     51
key-down          1992     83
lexer             32       1
line-elt          16       1
line-reader       24       1
long-long-type    32       2
lose-focus        16       1
method            22152    923
module            768      16
motion            16       1
mouse-enter       16       1
mouse-leave       16       1
mouse-scroll      16       1
no-word           24       1
node              28160    440
one-line-elt      16       1
one-word-elt      16       1
operation         2200     55
parse-error       24       1
paste-action      16       1
pathname          24       1
plain-writer      16       1
queue             24       1
quotation         510992   21678
ratio             320      20
sbuf              16       1
select-all-action 16       1
slot-spec         14600    365
source-file       15880    397
stack-params      16       1
string            1243040  16064
struct-type       72       3
temp-reg          16       1
tombstone         32       2
update-object     16       1
update-slot       16       1
value             1600     50
vector            10368    648
vocab-link        24       1
word              275840   6896
word-elt          16       1
wrapper           21280    2660

The only major difference is in the space used by bignums; over 35mb worth! To get a better handle on what was going on, I obtained a list of all bignum instances from the VM, asked the VM for their size and printed out the resulting array:

[ bignum? ] instances [ size ] map pprint

All bignums were 16 or 24 bytes, except one which was 35mb! I had a hunch that this bignum was manifesting as a result of shift being called with a very high left shift count. Also, since the problem only occurred with interval inference enabled, I guessed that the interval-shift word was the problem. Setting a watch on this word with \ interval-shift confirmed my suspicions; a "data heap growing to ... bytes" message was printed right after interval-shift was called. Some more playing around in the listener revealed that the compiler was inferring a range for a value which had very large bounds, then this value was used as the shift count passed to shift, so the resulting range of values was exceptionally huge.

The fix is something I should have thought of in hindsight; if shift is applied to a value with a large range, the compiler should not attempt to compute the range of the result. Compiling something like 10 mod 100000000 * shift should not exhaust memory.

Now, there are two ways I could have arrived at this conclusion, and been able to fix the bug:

Do what I did above: using features such as heap introspection, word watches, and the listener to track down the problem in a very short amount of time

Banging my head against the wall, carefully reviewing every line of code I changed in the last few hours, and doing a "binary search" (adding/removing changed lines until I pinpoint the problem)

Because Factor has nice development tools, I was able to go for the first option. Of course even the second isn't that painful in Factor (there's no recompile/restart needed, even for most compiler changes).

I suspect the real reason Charles wants to see this feature removed from Ruby is that the JVM does not support it. Similar reasoning was used to propose that Ruby's upcoming Unicode support be crippled by using UTF16 to represent code points, because this is how Java's designers chose to do it in 1995. (The list goes on: some people don't like Ruby's continuations and think they should be removed, because, again, they're hard to get right in Java. Is anybody else bothered by this type of thinking?)

So, what's the point of this post? I just wanted to say, I'm glad I don't have to cripple my language because some guys at Sun made some bad decisions while I was still in elementary school.

Tuesday, April 24, 2007

Cat updates

Christopher Diggins is working on a turtle graphics library for Cat. Cool stuff.

Formal semantics of concatenative languages

Daniel Ehrenberg asks LtU readers a few questions about formalizing a minimal concatenative language.

Monday, April 23, 2007

Another gcc bug

If you're running Mac OS X, check your gcc version before compiling Factor. If you have this version, the resulting VM executable will be useless:

powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 20041026 (Apple Computer, Inc. build 4061)

The following version is OK:

powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 (Apple Computer, Inc. build 5026)

Thanks to Robbert van Dalen for first noticing this issue. Unfortunately there is no workaround. You have to upgrade your gcc.

gcc seems to have chronic issues with global register variables. This latest one is caused by gcc compiling a function to save and restore all registers in the range r13-r31, however r14 is the datastack pointer, which is clobbered as a result. Thankfully it is fixed in Apple's latest gcc, but I really wish gcc was more stable.

In my entire Java career, I never found a single bug which I could blame on the HotSpot JIT compiler. The Java language and libraries suck, but I really give the HotSpot team credit for writing a compiler which is fast, stable, and generates efficient code.

On the other hand, in only a few years of working on the C implementation of Factor, I've encountered plenty of gcc bugs. This is embarrassing for a project which claims to be the cornerstone of both Linux and Mac OS X.

Sunday, April 22, 2007

Immutable sequences

Robbert van Dalen is currently rewriting his Enchilada language implementation in Factor. Part of his work has yielded a reusable library, libs/isequences. This library implements immutable sequences which are stored as balanced binary trees, with efficient lookup, concatenation and slicing operations. An interesting approach, quite different from immutable linked lists and the Factor core's mutable sequences. Cool stuff.

Thursday, April 19, 2007

Interval arithmetic

I want to play around with improved overflow check elimination. The Factor compiler already does some elementary overflow check elimination, but is only applicable to counted loops iterating over arrays. I want to generalize this in order to speed up Chris Double's YUV to RGB conversion. YUV to RGB conversion performs a lot of integer additions and multiplications, none of which ever overflow to bignums. If the compiler can infer this fact, then it can replace generic arithmetic with machine arithmetic, resulting in a nice speedup. So as a first step I cooked up an interval arithmetic library, which represents a closed interval as a pair of numbers. I was pleasantly surprised at how simple it was:

: cartesian ( seq1 seq2 -- seq3 )
    [ swap [ swap 2array ] map-with ] map-with concat ;

: interval-op ( i1 i2 quot -- i3 )
    -rot cartesian [ first2 rot call ] map-with
    dup infimum swap supremum 2array ; inline

: >int ( n -- interval ) dup 2array ;

: int+ ( i1 i2 -- i3 ) [ + ] interval-op ;

: int- ( i1 i2 -- i3 ) [ - ] interval-op ;

: int* ( i1 i2 -- i3 ) [ * ] interval-op ;

: int-shift ( i1 n -- i2 ) >int [ shift ] interval-op ;

: int/ ( i1 i2 -- i3 ) [ / ] interval-op ;

: int/i ( i1 i2 -- i3 ) [ /i ] interval-op ;

: int-intersect ( i1 i2 -- i3/f )
    [ [ first ] 2apply max ] 2keep [ second ] 2apply min
    2dup > [ 2drop f ] [ 2array ] if ;

It is worth noting that the int+ and int- words can be made more efficient:

: int+ ( i1 i2 -- i3 ) v+ ;
: int- ( i1 i2 -- i3 ) reverse v- ;

Monday, April 16, 2007

Oracle binding

Elle Chaftari contributed an Oracle binding, it's in libs/oracle in darcs. Factor now has bindings for SQLite, PostgreSQL, MySQL, ODBC, and Oracle. The next step would be to implement an abstraction layer (like JDBC) and higher-level tools such as prepared statements, connection pooling, and O/R mapping. Anybody wants to take this on? :-)

Embedding Factor into C applications

I implemented a simple way of embedding Factor into C applications and evaluating Factor code from C via a simple C API. This is very preliminary, and needs a lot of work. However it is a good first step.

First, the build process for the VM has changed. It now produces two files, a VM engine library and a VM executable.

On Windows and Mac OS X, the library is built as a shared library, on other Unices it is built as a static library. The reason is that on Linux, there is no way to build an executable which looks for a required shared library in the same directory as the executable itself. The only alternative is to install the shared library in a known location, such as /usr/lib, or to set the LD_LIBRARY_PATH environment variable. This is unacceptable since it complicates matters for people who want to try Factor. It should be possible to just run make then run Factor from the current directory. So, no shared library on Linux.

The VM executable is very small; in fact, it consists of a single source file:

#include "factor.h"

int main(int argc, char **argv)
{
 init_factor_from_args((char*)0,argc,argv,false);
 return 0;
}

The factor.h file defines the exported entry points into the Factor VM library. So far, there are only a small handful of those:

void init_factor_from_args(char *image, int argc, char **argv, bool embedded);
char *factor_eval_string(char *string);
void factor_eval_free(char *result);
void factor_yield(void);

Here is a description of each:

init_factor_from_args() initializes Factor. C applications embedding Factor should always set the embedded flag to true; this causes init_factor_from_args() to return as soon as Factor has been initialized.
factor_eval_string() evaluates a Factor expression and captures any output it performs to a new string. This string is then returned. The expression must not take any inputs from the stack, or leave values on the stack.
factor_eval_free() frees a string returned by factor_eval_string().
factor_yield() yields a time-slice to any Factor threads. This should be called if you evaluate an expression which spawns a thread with in-thread or a similar Factor word.

Here is an example:

#include "factor.h"

int main(int argc, char **argv)
{
 init_factor_from_args(NULL,argc,argv,true);
 char *result = factor_eval_string("2 2 + .");
 printf("%s",result);
 factor_eval_free(result);
 return 0;
}

This API has a number of limitations:

On Unix, Factor takes over a number of signal handlers. Signal handlers suck for this reason -- but Factor really does need to use signals.
Only one Factor instance can exist per C process, and there's no way to de-initialize a Factor instance and free its resources. This will be addressed at some point in the future.
The Factor instance can only be accessed from a single native thread for its entire life time -- this is because the Factor runtime is not thread-safe. This will be addressed in Factor 2.0, which will bring first-class support for native threading.
The default Factor image is quite large (~7mb) and building minimal images involves having a load file. This will be addressed soon; not only for embedding, but also for deployment, it makes sense to be able to build minimal images containing only a certain set of modules.

As you can see, right now this is more of a novelty than a useful feature, but over time I plan on improving this interface and make Factor a viable choice for scripting C applications -- you will be able to build minimal images containing only the code your application needs. Unlike Lua and Python, Factor is natively-compiled, and Factor's FFI for calling back into C is extremely powerful.

In fact, I didn't even plan on working on a C embedding API at this point, however a seemingly unrelated task required it -- Doug is porting Factor to Windows CE, and on Windows CE, .exe files cannot dynamically look up their own symbols. Factor's compiler does this because generated code often has to call into various VM routines -- so we went for the simplest fix, moving the entire VM into a DLL and only leaving a small stub function in the executable. I polished this a bit and made it minimally useful in other contexts, resulting in the the above embedding API.

Saturday, April 14, 2007

Association protocol

Daniel Ehrenberg implemented a generic protocol for associative data structures, much like the sequence protocol we have already. Hashtables, association lists and binary search trees (from libs/) implement this protocol now. Dan wrote an article about porting existing code that uses the old hashtable-specific words (which are gone) to the association protocol.

On a related note, lately more contributors have been venturing into the core, and not just working on stuff in libs/ and apps/. This is a good thing, in my opinion. I don't want the core language to be a mysterious, dark thing, and I'd love it if people learned enough about how the compiler works to contribute new optimizations, etc.

Tuesday, April 10, 2007

New developer tools

Two new features, both snarfed from Symbolics Lisp Machines.

First of all, Factor now remembers word usage information for top-level forms. Consider a word like define-command-map, which (surprisingly) defines an UI command map. It is only ever called from one word, however a ton of source files call it at the top level to define command maps for various gadgets. The usage. now gives us the full picture, instead of one that misleads the developer into thinking this word is rarely used:

\ define-command-map usage.
IN: operations : define-operation-map ( class group blurb object hook translator -- )
P" resource:core/ui/debugger.factor"
P" resource:core/ui/gadgets/editable-slots.factor"
P" resource:core/ui/gadgets/lists.factor"
P" resource:core/ui/text/commands.factor"
P" resource:core/ui/tools/browser.factor"
P" resource:core/ui/tools/help.factor"
P" resource:core/ui/tools/inspector.factor"
P" resource:core/ui/tools/interactor.factor"
P" resource:core/ui/tools/search.factor"
P" resource:core/ui/tools/tiles.factor"
P" resource:core/ui/tools/traceback.factor"
P" resource:core/ui/tools/walker.factor"
P" resource:core/ui/tools/workspace.factor"

The second new tool is the fix word. Suppose you just changed the number, order or types of arguments of a word. Or you renamed a word. You now have to go through each caller of that word and fix it. The fix word helps with that. It opens each usage of a word in your editor, then prompts while you to make any required changes, then goes on to the next usage. Here is an example:

( scratchpad ) \ reverse fix
Fixing usages of reverse...

Editing definition of $command
RETURN moves on to the next usage, C+d stops.

Editing definition of (compute-free-vregs)
RETURN moves on to the next usage, C+d stops.

Editing definition of (flip-branches)
RETURN moves on to the next usage, C+d stops.

Pretty nifty stuff!

I have some in-progress cross-referencing tools I still haven't merged in, for looking at usage relationships between vocabularies. These will become more useful when the new module system is introduced in 0.90, and I will add them to the core. At that point, I want to cook up an UI cross-referencing tool which presents a unified interface to all the little bits and pieces I've been cooking up.

In completely unrelated news, the darcs repository hasn't worked on ARM for a little while. Shortly after the release of 0.88, I made some changes to the register allocator which in turn introduced some bugs which only manifested on the specific register configuration found on the ARM architecture. This is now fixed; Factor now bootstraps and runs on Linux/ARM again. A Windows CE port is still pending.

Tuesday, April 03, 2007

Cat updates

Christopher Diggins is working hard on his Cat language. It is up to version 0.10.3 now, and there is a new Cat Wiki with additional documentation.

Evolving object shapes

In Factor, the fundamental class of user-defined objects is the class of "tuples". A tuple is essentially an array, except elements have names.

In previous Factor releases, redefining the order or number of slots in a tuple class would unintern the word corresponding to the tuple class, and define a new word. This had the effect of introducing a new class into the system; instances of the old class, with the old slot arrangement, and instances of the new class, with the new slot arrangement, were not interchangeable. Any source files defining methods on the tuple class had to be reloaded, to also define those methods on the new class. Old instances would often cause problems when they showed up in calculations. This was all so annoying that when I had to add or remove slots to tuples defined in the core, more often than not I would just bootstrap again to avoid any problems with the old phantom class and its instances hanging around.

This is not an acceptable state of affairs for a language like Factor, which aims to be interactive and robust. Today I implemented a new approach, essentially borrowed from Squeak. When slots are added to a tuple class, existing instances are updated in-place to have the new slot with a value of f; when slots are removed, existing instances are updated and the slot is removed.

The implementation is interesting. First, I added a new low-level become word, which is only intended to be used for the purpose of reshaping tuples, which takes two arrays. The first array is an array of old objects, the second is an array of new objects. This primitive installs forwarding pointers for every old object pointing to the corresponding new object, then runs a GC; this has the side effect of updating all pointers to the old objects to transparently point to the new objects!

On top of this I built the tuple reshaping code. When a tuple class is redefined, it uses the instances word to get all instances of the tuple class, reshapes each one, creating a new tuple in the process, then calls become to perform a bulk update.

Here is a transcript of this in action.

I define a new tuple class with one slot, just a person's name:

( scratchpad ) TUPLE: person name ;

Then I create an instance and inspect it:

( scratchpad ) "Slava" <person>
( scratchpad ) dup describe
an instance of the person class
person-name "Slava"

This instance will remain on the stack the whole time. Now watch as it literally morphs as the class is being redefined.

We add an 'address' slot:

( scratchpad ) TUPLE: person name address ;
*** Data GC (0 minor, 0 cards)
*** Data GC (0 minor, 0 cards)

Note that the reshaping process triggers two garbage collections, so it is not particularly efficient. Obviously you wouldn't do anything stupid like re-define a tuple class inside a tight loop with different slots. Let's take a look at the same old person object we created earlier, that is still on the stack:

( scratchpad ) dup describe
an instance of the person class
person-name    "Slava"
person-address f

A new slot has appeared! Let's give it a value:

( scratchpad ) "10 foo st" over set-person-address
( scratchpad ) dup describe
an instance of the person class
person-name    "Slava"
person-address "10 foo st"

Now let's add an 'id' slot:

( scratchpad ) TUPLE: person id name address ;
*** Data GC (0 minor, 0 cards)
*** Data GC (0 minor, 0 cards)

Note that in our person instance we have sitting around, all slots were shifted down by one to make room for the new slot:

( scratchpad ) dup describe
an instance of the person class
person-id      f
person-name    "Slava"
person-address "10 foo st"

We give it a value:

( scratchpad ) 123 over set-person-id
( scratchpad ) dup describe
an instance of the person class
person-id      123
person-name    "Slava"
person-address "10 foo st"

Now we redefine the class, removing the 'address' slot, and changing the order of the other two slots:

( scratchpad ) TUPLE: person name id ;
*** Data GC (0 minor, 0 cards)
*** Data GC (0 minor, 0 cards)

Looks like everything worked! Keep in mind that this is the same object you're seeing here; it has morphed several times:

( scratchpad ) dup describe
an instance of the person class
person-name "Slava"
person-id   123

This type of feature is very powerful; by eliminating unnecessary restarts, the programmer can very rapidly prototype software, test changes and try new approaches to solving problems.

A commonly-cited benefit of "objects are hashtables" languages such as Ruby is that adding and removing instance variables doesn't disrupt your program's execution or render existing objects invalid. Classically, languages were objects are instead arrays with slots stored at fixed objects have suffered when it comes to on-the-fly object schema evolution. For example, I don't know if Java supports adding instance variables on the fly yet, even in special "debugging" and "hot swap" modes. In contrast, Factor gives you the best of both worlds. Looking up a slot value is an O(1) operation, but classes can still evolve in a very fluid manner. All it takes is some clever garbage collector trickery.

Factor: a practical stack language

Wednesday, April 25, 2007

Is reflective heap access even useful?

Tuesday, April 24, 2007

Cat updates

Formal semantics of concatenative languages

Monday, April 23, 2007

Another gcc bug

Sunday, April 22, 2007

Immutable sequences

Thursday, April 19, 2007

Interval arithmetic

Monday, April 16, 2007

Oracle binding

Embedding Factor into C applications

Saturday, April 14, 2007

Association protocol

Tuesday, April 10, 2007

New developer tools

Tuesday, April 03, 2007

Cat updates

Evolving object shapes

Older Posts

Recent Factor Commits

Planet Factor

Links