Saturday, July 05, 2008

On shaking trees

The application deploy tool was first added a year ago, and I've made steady improvements since then (see 1, 2, 3, 4, 5). The basic idea is that it takes a Factor image, loads your program in, then proceeds to strip out all the code for the parser, compiler, and reflection. After this is done, a GC pass eliminates unreachable code, and the only remaining code in the image is that which is directly referenced by the main word of your program.

About half a year ago, I added some tests which deploy various demos shipped with Factor, then ensure that the resulting binary is below a certain size. These tests always seem to start failing after I make major changes to the Factor core or compiler, because suddenly programs start pulling in code more than they should. It is certainly frustrating when I add a new language feature, debug it, then realize that all the tests pass except the deploy tests pass. Getting them to pass again sometimes involves a fair bit of effort, because I have to think of new tricks to reduce the image size again, and I'm pretty anal about not cheating and just changing the size limits in the tests. This is because it is a very satisfying feeling once I figure out a new trick, and the tests pass again.

For example, I just added optional type declarations to tuple slots, and rewrite a large portion of the low-level tuple class machinery. As usual, the deployed images blew up. My tool for figuring out how to reduce image size is Factor's low-level debugger, which can give a dump of the heap, find references to objects, and dump objects.

I started by moving bit-arrays, float-arrays and the inspector out of the core, so that they won't get loaded by default. This wasn't enough, however. The next step was having the tree shaker prune some word properties which were only used when loading new code, and not at runtime. Then, I noticed that there were a lot of duplicate, immutable byte arrays in the heap. This is the relocation info generated by the compiler to tell the VM about dependencies between code blocks, code GC information, and so on. Turns out that for a lot of different code blocks, the generated relocation info was the same. Replacing duplicate relocation blocks with references to the same object as the last stage in the tree shaker was enough to make the tests pass again. I also applied the same optimization to quotations and strings for good measure. After all these changes, the resulting reduction in deployed image sizes was so drastic that I went ahead and changed the deploy tests to tighten the size bounds even further -- I guess I'm just masochistic like that.

The deployed application sizes are pretty large compared to languages such as C and C++, however they're certainly much smaller than the dynamic language competition. The Tetris demo tree-shakes down to 1Mb; a console "hello world" is 340Kb; a GUI "hello world" is also around 1Mb; and the "bunny" demo, which downloads a 3d model via HTTP and renders it with OpenGL, tree-shakes down to 2Mb. In addition to the tree-shaken image, the only other dependency is the Factor VM (<200Kb), and on Mac OS X and Windows, the FreeType library (this dependency will go away at some point in the future). Furthermore, you get a nice double-clickable .app on Mac OS X so that the user is not even aware they're running something written in Factor; it feels like any other native executable.

I believe the ability to deploy small, stand-alone binaries which do not contain any source code and do not depend on an external runtime environment is an essential capability for a general-purpose language. This is also something that many other language environments do poorly; Java requires you to bundle a large (at least 8mb) runtime environment, Python and Ruby have no way of distributing code without the source, and the free Lisp environments can usually save customized images but don't attempt to strip them down in any way, leaving you with a 25Mb "Hello World".

Factor has a lot of great features: an interactive development environment, stand-alone application deployment, the web framework, SSL, database libraries, Unicode 5.1, XML, etc, not to mention automated binary builds with continuous testing on 11 platforms. We've come a long way in less than 5 years, and I believe this is all thanks to the incredible productivity offered by the Factor language itself. This pace of development is not going to let up any time soon, and all the people who ridiculed "yet another programming language" in Factor's early days are only going to look more and more foolish as time goes on.

6 comments:

Anonymous said...

I couldn't find an email address in the factor website, so I'm posting two broken links that I found here.
Feel free to remove this comment.
Thanks for the great work in Factor.

Broken links:

on: http://factorcode.org/getstarted.fhtml
<a href="/responder/help">browsed online</a>

on: http://factorcode.org/faq.fhtml
<a href="http://factorcode.org/development.fhtml">Factor website</a>

Samuel A. Falvo II said...

Actually, Python will let you distribute .pyc files, which are the byte-code, pre-compiled files corresponding to .py files.

Anonymous said...

As far as I know the .pyc are mostly for speed reasons, you can disassemble them easily, or?

Anonymous said...

http://nedbatchelder.com/blog/200804/the_structure_of_pyc_files.html

pyc is just a marshalled code object.

Berlin Brown Discussions said...

There is also the egg file for python, but Slava makes a point. Most people run python or ruby applications with the source.

When is the last time "most" people launch a python application through the egg system. Or only

I just want to get the application (as one file) and then run it. I remember a long time ago; you entered a terminal and you typed a command to run your application. Now, you have all of these dependencies and forked/patched code. Most users (and system maintainers) don't have a clue at what is going on.

I tried to create a minimalist python, a small download that only included the python executable and the particular application. Short and sweet. The community thought the concept was crazy.

Anyway. Keep up the good work. I still have a project I want to use factor for. Still haven't gotten to it.

Anonymous said...

32-bit Athlon XP processors do not support SSE2 and thus Linux and Windows users who have that CPU model should build Factor from source instead. Why?

Cheers,
Genevieve Clark